research-article

Adapting sentiment lexicons to domain-specific social media texts

Published: 01 February 2017

Abstract

Social media has become the largest data source of public opinion. The application of sentiment analysis to social media texts has great potential, but faces great challenges because of domain heterogeneity. Sentiment orientation of words varies by content domain, but learning context-specific sentiment in social media domains continues to be a major challenge. The language domain poses another challenge since the language used in social media today differs significantly from that used in traditional media. To address these challenges, we propose a method to adapt existing sentiment lexicons for domain-specific sentiment classification using an unannotated corpus and a dictionary. We evaluate our method using two large developing corpora, containing 743,069 tweets related to the stock market and one million tweets related to political topics, respectively, and five existing sentiment lexicons as seeds and baselines. The results demonstrate the usefulness of our method, showing significant improvement in sentiment classification performance.

Highlights

  • We propose a method to adapt existing sentiment lexicons for domain-specific sentiment classification.
  • The proposed method addresses challenges from both content domain and language domain.
  • We evaluate our method using two large developing corpora and five existing sentiment lexicons as seeds and baselines.
  • The evaluation results demonstrate the usefulness of our method.
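
The abstract describes adapting a seed sentiment lexicon with an unannotated corpus but does not give the algorithm. The sketch below is therefore not the authors' method. It is a minimal illustration of the general idea, assuming a simple co-occurrence heuristic: a word that appears mostly alongside positive seed words in unlabeled texts is treated as positive, and likewise for negative. The function names (`adapt_lexicon`, `classify`), the parameters `min_count` and `threshold`, and the toy data are all hypothetical, and the dictionary component mentioned in the abstract is not modeled here.

```python
from collections import Counter


def adapt_lexicon(seed, corpus, min_count=2, threshold=0.5):
    """Return a copy of `seed` extended with words scored from `corpus`.

    seed:   dict mapping word -> +1 (positive) or -1 (negative)
    corpus: iterable of unlabeled texts (plain strings)
    Assumed heuristic: score each non-seed word by how often it shares
    a text with positive versus negative seed words.
    """
    pos_hits, neg_hits = Counter(), Counter()
    for text in corpus:
        tokens = set(text.lower().split())
        n_pos = sum(1 for t in tokens if seed.get(t, 0) > 0)
        n_neg = sum(1 for t in tokens if seed.get(t, 0) < 0)
        for t in tokens:
            if t in seed:
                continue  # seed words keep their original polarity
            pos_hits[t] += n_pos
            neg_hits[t] += n_neg

    adapted = dict(seed)
    for word in set(pos_hits) | set(neg_hits):
        total = pos_hits[word] + neg_hits[word]
        if total < min_count:
            continue  # too little evidence to assign a polarity
        score = (pos_hits[word] - neg_hits[word]) / total
        if abs(score) >= threshold:
            adapted[word] = 1 if score > 0 else -1
    return adapted


def classify(text, lexicon):
    """Label a text by summing the polarities of its known words."""
    total = sum(lexicon.get(t, 0) for t in text.lower().split())
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"


# Toy, made-up data for illustration only.
seed = {"gain": 1, "loss": -1}
corpus = [
    "bullish gain today",
    "bullish gain again",
    "bearish loss today",
    "bearish loss again",
]
adapted = adapt_lexicon(seed, corpus)
```

In this toy run, "bullish" and "bearish" are absent from the seed lexicon but acquire polarities from the corpus, while "today" and "again" co-occur equally with both seed words and are left out. A text such as "feeling bullish" is neutral under the seed lexicon and positive under the adapted one.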


Published In

Decision Support Systems  Volume 94, Issue C
February 2017
109 pages

Publisher

Elsevier Science Publishers B. V.

Netherlands

Author Tags

  1. Lexicon expansion
  2. Opinion mining
  3. Sentiment analysis
  4. Sentiment lexicon
  5. Social media

Qualifiers

  • Research-article


