An Algorithm for Fuzzifying WordNets and Its Application in Sentiment Analysis

Article type: Research article

Authors

1 Iran University of Science and Technology

2 Iran University of Science and Technology, Tehran

3 Iran University of Science and Technology

4 Ferdowsi University of Mashhad

5 University of Cagliari

6 University of Bologna

Abstract

WordNet-like lexical databases (WLDs) group English words into sets of synonyms called synsets. Although standard WLDs are used in many successful text-mining applications, they share one limitation: every word sense is treated as representing the meaning of its synset to the same degree, which is generally not true. To overcome this limitation, several fuzzy versions of synsets have been proposed. A common feature of these studies is that they do not aim to produce fuzzified versions of existing WLDs; instead, they build new WLDs from scratch. In this study, we present an algorithm for constructing fuzzy versions of WLDs of any language, given a corpus of documents and a word sense disambiguation (WSD) system for that language. Then, using the OANC corpus and the UKB WSD system as inputs to the algorithm, we construct the fuzzified version of English WordNet and publish it online. We also propose a practical proof for the validity of its results.

Article Title [English]

An Algorithm for Fuzzification of WordNets and its Application in Sentiment Analysis

Authors [English]

  • Yousef Alizadeh-Q 1
  • Behrouz Minaei Bidgoli 2
  • Sayyed Ali Hossayni 3
  • Mohammad-R Akbarzadeh-T 4
  • Diego Reforgiato Recupero 5
  • Aldo Gangemi 6
1 Iran University of Science and Technology
2 Iran University of Science and Technology
3 Iran University of Science and Technology
4 Ferdowsi University of Mashhad
5 University of Cagliari
6 University of Bologna
Abstract [English]

WordNet-like Lexical Databases (WLDs) group English words into sets of synonyms called “synsets.” Synsets are used in many text-mining applications. However, they have been criticized: in theory, not all members (i.e., word senses) of a synset represent its meaning to the same degree, yet in practice WLDs treat every member as belonging to the synset identically. In response, fuzzy versions of synonym sets, called fuzzy synsets, were proposed. But, to the best of our knowledge, the existing studies build new WLDs from scratch rather than producing fuzzified versions of existing WLDs. In this study, we present an algorithm for constructing fuzzy versions of WLDs of any language, given a corpus of documents and a word sense disambiguation system for that language. A theoretical proof is also given for the validity of the algorithm's results. Then, using the Open American National Corpus (OANC) and the UKB word sense disambiguation system as inputs to the algorithm, we construct and publish online the fuzzified version of English WordNet (FWN) and apply it to a sentiment analysis problem.
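The abstract does not spell out how membership degrees are computed, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes that a WSD system has already tagged each corpus token with a synset identifier, and that a word's membership degree in a synset can be approximated by its tag count divided by the largest count in that synset (a possibility-style normalization chosen here for illustration). The function name `fuzzy_synsets`, the synset identifiers `s1` and `s2`, and the toy data are all hypothetical.

```python
from collections import Counter, defaultdict


def fuzzy_synsets(tagged_tokens):
    """Estimate fuzzy membership degrees from sense-tagged tokens.

    tagged_tokens: iterable of (word, synset_id) pairs, standing in for
    the output of a WSD system run over a corpus (an assumption).
    Returns {synset_id: {word: degree}} with degrees in (0, 1].
    """
    # Count how often each word was tagged with each synset.
    counts = defaultdict(Counter)
    for word, synset_id in tagged_tokens:
        counts[synset_id][word] += 1

    # Normalize by the most frequent word of each synset, so that the
    # most typical member receives degree 1.0 (illustrative choice).
    fuzzy = {}
    for synset_id, word_counts in counts.items():
        peak = max(word_counts.values())
        fuzzy[synset_id] = {w: c / peak for w, c in word_counts.items()}
    return fuzzy


# Hypothetical toy data: (word, synset_id) pairs.
tagged = (
    [("car", "s1")] * 4
    + [("automobile", "s1")] * 2
    + [("machine", "s1")]
    + [("machine", "s2")] * 3
)

memberships = fuzzy_synsets(tagged)
for synset_id, degrees in sorted(memberships.items()):
    print(synset_id, degrees)
```

In this toy run, "car" would be the most typical member of `s1`, while "machine" belongs to it only weakly but fully to `s2`, which is the kind of graded membership that a crisp synset cannot express.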

Keywords [English]

  • Fuzzy WordNet
  • Possibility Theory
  • Sentiment Analysis
  • Uncertainty Handling