A survey of multilingual human-tagged short message datasets for sentiment analysis tasks

F. Steiner-Correa¹,
M. I. Viedma-del-Jesus¹ &
A. G. Lopez-Herrera²

Abstract

Today, electronic word-of-mouth (eWOM) statements posted on blogs, social media and shopping platforms are very frequent and enable customers to share their points of view about products or services they have acquired. Industry can use these eWOM statements to improve its products and services, and customers can use them to make better purchase decisions. Sentiment analysis (SA) techniques can be used to extract and analyze these eWOM statements. Research on SA has advanced considerably in recent years, and its applications in business management have grown exponentially. Automatic techniques (such as machine learning, deep learning and statistical approaches) have been used for this purpose. However, training a machine to process or analyze sentiments is a hard task, mainly because of the complexity of natural language, and the task is even more complicated in multilingual environments. Training datasets, one of the key resources for achieving better results, are still in short supply. Such datasets are the reservoir of information used to teach and refine automatic techniques; hence, the higher the quality of the training datasets, the better the predictive power in sentiment analysis tasks. English datasets are relatively easy to find in the literature; however, datasets in other languages are very scarce. This paper therefore compiles and describes 25 datasets of short messages (statements expressed on social media and shopping platforms) in seven different languages, for the most part from Twitter. To ensure quality, all the resources were human-tagged, and they are currently available to the scientific community. A new English sentiment dataset extracted from Twitter has also been compiled, with each message evaluated subjectively. The current survey thus aims to provide essential quality information for future research on automatic sentiment analysis in monolingual or multilingual scenarios.

Acknowledgements

This study was funded by Coordination of Improvement of Higher Education, CAPES-Brazil (Grant Number BEX 2230/15-1), the Andalusian Excellence Projects (Grant Number P10-SEJ-6768) and the Spanish National Project (Grant Number TIN2013-40658-P).

Author information

Authors and Affiliations

Department of Market and Marketing Research, University of Granada, Granada, Spain
F. Steiner-Correa & M. I. Viedma-del-Jesus
Department of Computer Science and Artificial Intelligence, University of Granada, Granada, Spain
A. G. Lopez-Herrera

Corresponding author

Correspondence to F. Steiner-Correa.

Ethics declarations

Conflicts of interest

The author Steiner-Correa, A.F. declares that he has no conflict of interest. The author Viedma-del-Jesus, M.I. declares that she has no conflict of interest. The author López-Herrera, A.G. declares that he has no conflict of interest.

Ethical approval

This article does not contain any studies with human participants performed by any of the authors.

Additional information

Communicated by V. Loia.

About this article

Cite this article

Steiner-Correa, F., Viedma-del-Jesus, M.I. & Lopez-Herrera, A.G. A survey of multilingual human-tagged short message datasets for sentiment analysis tasks. Soft Comput 22, 8227–8242 (2018). https://doi.org/10.1007/s00500-017-2766-5

Published: 12 August 2017
Issue Date: December 2018
DOI: https://doi.org/10.1007/s00500-017-2766-5
