A Survey and Comparative Study of Tweet Sentiment Analysis via Semi-Supervised Learning

Authors:

Nadia Felix F. Da Silva,

Luiz F. S. Coletta,

Eduardo R. Hruschka

ACM Computing Surveys (CSUR), Volume 49, Issue 1

Article No.: 15, Pages 1 - 26

https://doi.org/10.1145/2932708

Published: 29 June 2016

Abstract

Twitter is a microblogging platform on which users post status messages, called “tweets,” to their friends. It has provided an enormous dataset of so-called sentiments, which can be classified through supervised learning. To build supervised learning models, classification algorithms require a set of representative labeled data. However, labeled data are usually difficult and expensive to obtain, which motivates the interest in semi-supervised learning. This type of learning uses unlabeled data to complement the information provided by the labeled data during training; it is therefore particularly useful in applications such as tweet sentiment analysis, where a huge quantity of unlabeled data is accessible. Semi-supervised learning for tweet sentiment analysis, although appealing, is relatively new. We provide a comprehensive survey of semi-supervised approaches applied to tweet classification. These approaches comprise graph-based, wrapper-based, and topic-based methods. A comparative study of algorithms based on self-training, co-training, topic modeling, and distant supervision highlights their biases and sheds light on aspects that practitioners should consider in real-world applications.
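As a rough illustration of how self-training, one of the wrapper-based strategies named above, lets unlabeled tweets complement a small labeled set, the following minimal sketch may help. It is not the authors' implementation: the tiny Naive Bayes classifier, the function names (`train_nb`, `predict_proba`, `self_train`), the confidence threshold of 0.7, and the toy tweets are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def train_nb(docs, labels):
    """Fit a multinomial Naive Bayes model with add-one smoothing (illustrative stand-in classifier)."""
    classes = sorted(set(labels))
    word_counts = {c: Counter() for c in classes}
    class_counts = Counter(labels)
    for doc, label in zip(docs, labels):
        word_counts[label].update(doc.lower().split())
    vocab = {w for c in classes for w in word_counts[c]}
    return classes, word_counts, class_counts, vocab


def predict_proba(model, doc):
    """Return a dict mapping each class to its posterior probability for one document."""
    classes, word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    log_scores = {}
    for c in classes:
        total_words = sum(word_counts[c].values())
        score = math.log(class_counts[c] / total_docs)
        for w in doc.lower().split():
            if w in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[c][w] + 1) / (total_words + len(vocab)))
        log_scores[c] = score
    top = max(log_scores.values())
    exp_scores = {c: math.exp(s - top) for c, s in log_scores.items()}
    norm = sum(exp_scores.values())
    return {c: v / norm for c, v in exp_scores.items()}


def self_train(labeled, unlabeled, threshold=0.7, max_rounds=5):
    """Self-training: repeatedly pseudo-label confident unlabeled docs and retrain.

    `threshold` and `max_rounds` are hypothetical defaults, not values from the survey.
    Returns the final model and the documents that were never labeled confidently.
    """
    docs = [d for d, _ in labeled]
    labels = [y for _, y in labeled]
    pool = list(unlabeled)
    for _ in range(max_rounds):
        model = train_nb(docs, labels)
        remaining, added = [], False
        for doc in pool:
            probs = predict_proba(model, doc)
            best = max(probs, key=probs.get)
            if probs[best] >= threshold:
                docs.append(doc)      # accept the pseudo-label as if it were a true label
                labels.append(best)
                added = True
            else:
                remaining.append(doc)
        pool = remaining
        if not added or not pool:
            break
    return train_nb(docs, labels), pool


# Toy data (invented for this sketch): two labeled tweets and four unlabeled ones.
labeled = [("love this great phone", "pos"), ("hate this awful phone", "neg")]
unlabeled = ["great camera love it", "awful battery hate it", "great camera", "awful battery"]

model, leftover = self_train(labeled, unlabeled)
print(predict_proba(model, "great camera"), leftover)
```

In this sketch, words such as "camera" and "battery" never appear in the labeled tweets; they acquire a polarity only through pseudo-labeled examples. The same mechanism is also the known weakness of self-training: an early wrong pseudo-label is reinforced in later rounds.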

Index Terms

1. Information systems
  1. Information retrieval
    1. Retrieval tasks and goals
  2. Information systems applications
    1. Decision support systems
      1. Expert systems

Information & Contributors

Information

Published In

ACM Computing Surveys Volume 49, Issue 1

March 2017

705 pages

ISSN:0360-0300

EISSN:1557-7341

DOI:10.1145/2911992

Editor:
Sartaj Sahni
Department of Computer and Information Science and Engineering/University of Florida/Gainesville, FL

Copyright © 2016 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 29 June 2016

Accepted: 01 March 2016

Revised: 01 December 2015

Received: 01 June 2015

Published in CSUR Volume 49, Issue 1

Qualifiers

Survey
Research
Refereed

Funding Sources

CNPq
Brazilian Research Agencies Capes
FAPESP

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 57
Total Downloads: 2,241
Downloads (Last 12 months): 72
Downloads (Last 6 weeks): 13

Reflects downloads up to 28 Feb 2025

Citations

Cited By

Chandra Sekhar J, Kiran Mayee M, Nadagoudar R, Chinna Alluraiah N, Dhanamjayulu C, Chinthaginjala R, K. R, M. P, Mohanty S, Khan B (2024). Classification and Comparative Evaluation of Text and Emoji-Based Tweets With Deep Neural Network Models. Journal of Electrical and Computer Engineering, 2024. Online publication date: 1-Jan-2024.
https://dl.acm.org/doi/10.1155/2024/9652424
Chaudhary L, Girdhar N, Sharma D, Andreu-Perez J, Doucet A, Renz M (2024). A Review of Deep Learning Models for Twitter Sentiment Analysis: Challenges and Opportunities. IEEE Transactions on Computational Social Systems, 11:3, 3550-3579. Online publication date: Jun-2024.
https://doi.org/10.1109/TCSS.2023.3322002
Zhu W, Yu Z, Chio K, Luo W (2024). Label Distribution Representation Learning in Document-Level Sentiment Analysis. 2024 IEEE 9th International Conference on Computational Intelligence and Applications (ICCIA), 79-83. Online publication date: 9-Aug-2024.
https://doi.org/10.1109/ICCIA62557.2024.10719196
Ishida T, Seki Y, Keyaki A, Kashino W, Kando N (2023). Evaluation of Citizen Opinion Extraction Across Cities (都市を横断した市民意見抽出の評価). Journal of Natural Language Processing, 30:2, 586-631. Online publication date: 2023.
https://doi.org/10.5715/jnlp.30.586
Lai Y, Chen M (2023). Review of Survey Research in Fuzzy Approach for Text Mining. IEEE Access, 11, 39635-39649. Online publication date: 2023.
https://doi.org/10.1109/ACCESS.2023.3268165
Taha K (2023). Semi-supervised and un-supervised clustering. Information Systems, 114:C. Online publication date: 1-Mar-2023.
https://dl.acm.org/doi/10.1016/j.is.2023.102178
Ermakova T, Fabian B, Golimblevskaia E, Henke M (2023). A Comparison of Commercial Sentiment Analysis Services. SN Computer Science, 4:5. Online publication date: 24-Jun-2023.
https://dl.acm.org/doi/10.1007/s42979-023-01886-y
de Oliveira W, Berton L (2023). A systematic review for class-imbalance in semi-supervised learning. Artificial Intelligence Review, 56:Suppl 2, 2349-2382. Online publication date: 1-Nov-2023.
https://dl.acm.org/doi/10.1007/s10462-023-10579-0
Liu J, Zhou Z, Gao M, Tang J, Fan W (2023). Aspect sentiment mining of short bullet screen comments from online TV series. Journal of the Association for Information Science and Technology, 74:8, 1026-1045. Online publication date: 1-Jul-2023.
https://dl.acm.org/doi/10.1002/asi.24800
Ng C, Law K, Ip A (2022). Assessing Public Opinions of Products Through Sentiment Analysis. Research Anthology on Implementing Sentiment Analysis Across Multiple Disciplines, 1422-1440. Online publication date: 10-Jun-2022.
https://doi.org/10.4018/978-1-6684-6303-1.ch073
