Presenting a Novel Hybrid Approach of Text Mining Sentiment Analysis in Twitter Using CART Decision Tree

Authors

1 MSc. Computer Engineering, Faculty of Electrical and Computer Engineering, Azad Mashhad University, Mashhad, Iran

2 Department of Computer, Mashhad Branch, Islamic Azad University, Mashhad, Iran

Abstract

Today, with the enormous growth of the Internet and of social networks as virtual communities and mass media, and with their increasing use, users produce a huge amount of feedback on a wide variety of topics. Novel approaches for analyzing this feedback are therefore needed. Text mining is a specialized form of knowledge discovery that uses natural language processing to extract non-trivial and interesting patterns from text. This paper presents a new hybrid approach to text-mining sentiment analysis on Twitter that combines machine learning with a lexicon-based method. The CART decision tree is used as the machine learning method for classification, and, to extract sentiment more precisely, the word lists of the SentiStrength algorithm are used as the lexicon-based method. CART handles both discrete and continuous data effectively in text mining. Its distinguishing feature is that it can analyze complex data structures and supports regression as well as classification, depending on the input of the problem. The strength of the SentiStrength algorithm in detecting sentiment in turn enables a thorough analysis of sentiment in tweets. The implementation results for polarity recognition show improved classification on most features.
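
The hybrid idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the lexicon output is fed to the tree, so this sketch assumes the lexicon scores are used directly as the tree's input features. The word list `TOY_LEXICON`, the helper names, and the six training tweets are hypothetical and stand in for the real SentiStrength lists and a real Twitter corpus. The tree is a small CART-style classifier that picks binary splits by minimizing weighted Gini impurity.

```python
# Hypothetical word strengths for illustration only; NOT the real SentiStrength lists.
TOY_LEXICON = {"love": 4, "great": 3, "good": 2, "hate": -4, "awful": -4, "bad": -3}

def lexicon_scores(text):
    """Dual-scale score in the SentiStrength style: (strongest positive, strongest negative).
    1 and -1 mean 'no sentiment found' on the respective scale."""
    pos, neg = 1, -1
    for token in text.lower().split():
        strength = TOY_LEXICON.get(token.strip(".,!?#@"), 0)
        pos, neg = max(pos, strength), min(neg, strength)
    return pos, neg

def features(text):
    """Assumed feature vector: positive score, negative score, and their sum."""
    pos, neg = lexicon_scores(text)
    return [pos, neg, pos + neg]

def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(X, y):
    """Return (weighted_gini, feature_index, threshold) of the best binary split, or None."""
    best = None
    for f in range(len(X[0])):
        values = sorted(set(row[f] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # candidate threshold midway between adjacent values
            left = [y[i] for i, row in enumerate(X) if row[f] <= t]
            right = [y[i] for i, row in enumerate(X) if row[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best

def build_tree(X, y, depth=0, max_depth=3):
    """Grow a tree recursively; internal nodes are tuples, leaves are class labels."""
    split = best_split(X, y) if len(set(y)) > 1 and depth < max_depth else None
    if split is None:
        return max(sorted(set(y)), key=y.count)  # leaf: majority class
    _, f, t = split
    left = [i for i, row in enumerate(X) if row[f] <= t]
    right = [i for i, row in enumerate(X) if row[f] > t]
    return (f, t,
            build_tree([X[i] for i in left], [y[i] for i in left], depth + 1, max_depth),
            build_tree([X[i] for i in right], [y[i] for i in right], depth + 1, max_depth))

def predict(tree, row):
    """Walk from the root to a leaf and return its class label."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

# Hypothetical labeled tweets, used only to exercise the sketch.
TWEETS = [
    ("I love this phone", "positive"),
    ("great battery and good screen", "positive"),
    ("good service today", "positive"),
    ("I hate the new update", "negative"),
    ("awful support, bad app", "negative"),
    ("bad day", "negative"),
]

X = [features(text) for text, _ in TWEETS]
y = [label for _, label in TWEETS]
tree = build_tree(X, y)

print(predict(tree, features("what a great game")))
print(predict(tree, features("I hate mondays")))
```

On real data the feature vector would normally also include text features such as n-gram counts, and a library implementation of CART would replace the hand-written tree. The sketch only shows how a lexicon-based score and a learned classifier can be chained.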

Keywords

