An Algorithm for Fuzzification of WordNets and its Application in Sentiment Analysis

Document Type: Original Article

Authors

1 Iran University of Science and Technology

2 Ferdowsi University of Mashhad

3 University of Cagliari

4 University of Bologna

Abstract

WordNet-like Lexical Databases (WLDs) group the words of a language into sets of synonyms called “synsets.” Synsets are used in several text-mining applications. However, they have also been criticized: in theory, not all members (i.e., word senses) of a synset represent the meaning of that synset to the same degree, yet in practice WLDs treat them all as identical members of the synset. Accordingly, a fuzzy version of synonym sets, called fuzzy synsets, was proposed. But, to the best of our knowledge. In this study, we present an algorithm for constructing the fuzzy version of the WLD of any language, given a corpus of documents and a word-sense-disambiguation system for that language. A theoretical proof of the validity of the algorithm's results is also provided. Then, supplying the Open American National Corpus (OANC) and the UKB word-sense-disambiguation system to the algorithm, we construct and publish online the fuzzified version of English WordNet (FWN) and apply it to a sentiment analysis problem.
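
To make the contrast between a crisp synset and a fuzzy synset concrete, the following is a minimal Python sketch. It is not the algorithm presented in this study: the counts are invented, and the function name `fuzzy_synset` and the rule of dividing each count by the largest count are assumptions chosen purely for illustration.

```python
from collections import Counter


def fuzzy_synset(sense_counts):
    """Map each member word of one synset to a membership degree in [0, 1].

    Assumption (illustrative only, not the paper's method): a word's degree
    is its count divided by the largest count in the synset, so the most
    frequent member receives degree 1.0.
    """
    if not sense_counts:
        return {}
    peak = max(sense_counts.values())
    if peak == 0:
        return {word: 0.0 for word in sense_counts}
    return {word: count / peak for word, count in sense_counts.items()}


# Hypothetical counts: how often a word-sense-disambiguation system tagged
# each word with this synset in some corpus. The numbers are invented.
counts = Counter({"car": 80, "automobile": 20, "auto": 15, "machine": 5})

# Crisp synset: every member belongs to the synset to the same degree.
crisp = {word: 1.0 for word in counts}

# Fuzzy synset: members belong to the synset to different degrees.
fuzzy = fuzzy_synset(counts)

for word in counts:
    print(f"{word:12s} crisp={crisp[word]:.2f} fuzzy={fuzzy[word]:.4f}")
```

In the crisp case every word carries the same weight, whereas in the fuzzy case a downstream task such as lexicon-based sentiment analysis could weight each word by its degree.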



 