A Method for Sentiment Analysis Using the TF-IGM Weighting Scheme and Conditional Random Fields

Authors

Faculty of Engineering, Islamic Azad University, Isfahan (Khorasgan) Branch, Isfahan, Iran

Abstract

The rapid growth of social networks produces a large volume of information on the Internet, including users' posted opinions, their needs, and their sentiments. The analysis of users' needs gave rise to the field of sentiment analysis, whose goal is to identify users' sentiments (positive, negative, or neutral) from the opinions they post. Existing methods generally combine a classification algorithm with the traditional TF-IDF weighting scheme, which ignores the class information of the training data when weighting words; as a result, their results are not sufficiently satisfactory. This study weights words with TF-IGM, a newer weighting scheme that is supervised. In addition, unlike previous methods, this study performs sentiment analysis with a combination of the hidden Markov model and the conditional random field, which together yield the hidden conditional random field. Results of running the proposed method on a dataset of Twitter user opinions containing 12,000 tweets indicate that the accuracy of the proposed model improves by 82/5% over the traditional TF-IDF-based method. The results thus show that hybrid classification algorithms combined with a supervised weighting scheme outperform simple methods.
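To illustrate how a supervised weight can use class information that TF-IDF ignores, the following is a minimal sketch of TF-IGM in its commonly cited form: a term's per-class document frequencies are sorted in descending order, the inverse gravity moment is the largest frequency divided by the rank-weighted sum of all frequencies, and the weight is tf × (1 + λ · igm). The abstract does not give the formula or any parameter values, so the function names and the value λ = 7 are assumptions made for illustration, not the authors' implementation.

```python
def igm(class_doc_freqs):
    """Inverse gravity moment of a term.

    class_doc_freqs: number of training documents containing the term,
    one entry per class (hypothetical input format).
    """
    ranked = sorted(class_doc_freqs, reverse=True)
    if not ranked or ranked[0] == 0:
        return 0.0  # term never seen in training data
    # Rank-weighted sum: the r-th largest frequency is multiplied by r.
    moment = sum(rank * freq for rank, freq in enumerate(ranked, start=1))
    return ranked[0] / moment


def tf_igm(tf, class_doc_freqs, lam=7.0):
    """TF-IGM weight; lam is an assumed, illustrative coefficient."""
    return tf * (1.0 + lam * igm(class_doc_freqs))


# A term concentrated in one class gets a higher weight than a term
# spread evenly over the positive, negative, and neutral classes.
concentrated = tf_igm(2, [10, 0, 0])
spread = tf_igm(2, [5, 5, 5])
```

A term that appears in only one class reaches the maximum igm of 1, while a term spread evenly across classes receives a much smaller value, so the weight favors class-discriminating words.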

Keywords

Volume 17, Issue 1
Spring and Summer
Ordibehesht 1398