Adapting sentiment lexicons to domain-specific social media texts

Published: 01 February 2017


Social media has become the largest source of data on public opinion. Applying sentiment analysis to social media texts has great potential, but it is hampered by domain heterogeneity. The sentiment orientation of words varies by content domain, and learning context-specific sentiment in social media domains remains a major challenge. The language domain poses another challenge, since the language used in social media today differs significantly from that used in traditional media. To address these challenges, we propose a method to adapt existing sentiment lexicons for domain-specific sentiment classification using an unannotated corpus and a dictionary. We evaluate our method using two large development corpora, containing 743,069 tweets related to the stock market and one million tweets related to political topics, respectively, and five existing sentiment lexicons as seeds and baselines. The results demonstrate the usefulness of our method, showing significant improvement in sentiment classification performance.

We propose a method to adapt existing sentiment lexicons for domain-specific sentiment classification. The proposed method addresses challenges from both the content domain and the language domain. We evaluate our method using two large development corpora and five existing sentiment lexicons as seeds and baselines. The evaluation results demonstrate the usefulness of our method.


