DOI: 10.1145/3511095.3536367
extended-abstract

Identifying neutral reviews from unlabeled data: An exploratory study on user ratings and word-level polarity scores

Published: 28 June 2022

Abstract

Reviews containing mixed or contrasting opinions, also known as neutral reviews, are prevalent in user feedback data. By leveraging annotated data, supervised machine learning (ML) classifiers can learn implicit patterns to identify these neutral reviews. However, labeled data are rarely available. When annotated data are unavailable, unsupervised approaches such as lexicon-based methods are employed, which combine word-level polarity scores with a set of rules. As a preliminary study toward developing a sophisticated unsupervised framework for recognizing neutral reviews, we examine the performance of existing lexicon-based methods. When applied to four multi-domain review datasets, all of them perform poorly at identifying neutral reviews. We manually inspect the semantic attributes of a subset of neutral reviews misclassified by these lexicon-based methods. The experimental results and manual analysis reveal that determining neutrality with lexical rule-based methods is often ineffective for several reasons, such as user preferences on certain aspects, coverage of the sentiment lexicon, irregularity in the efficacy of aggregation rules, and the context-sensitive polarity of words. This analysis reveals traits of neutral reviews and limitations of existing approaches, and provides insights for developing methods to identify neutral reviews from unlabeled data.
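
To make the idea of a lexicon-based method concrete, the following is a minimal sketch of a classifier that aggregates word-level polarity scores with a simple threshold rule. It is not the method of any system evaluated in the paper: the lexicon `TOY_LEXICON`, the function `classify_review`, and the `neutral_band` threshold are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical toy lexicon (word -> polarity score in [-1, 1]).
# These entries are illustrative assumptions, not taken from the paper
# or from any published sentiment lexicon.
TOY_LEXICON = {
    "good": 0.6, "great": 0.8, "love": 0.9,
    "bad": -0.6, "terrible": -0.9, "slow": -0.4,
}


def classify_review(text, lexicon=TOY_LEXICON, neutral_band=0.2):
    """Average the polarity of lexicon words, then apply a threshold rule.

    Returns a (label, mean_score) pair. A review whose mean score falls
    inside [-neutral_band, neutral_band] is labeled neutral.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    scores = [lexicon[t] for t in tokens if t in lexicon]
    if not scores:
        # No lexicon coverage: the rule has nothing to aggregate.
        return "neutral", 0.0
    mean = sum(scores) / len(scores)
    if mean > neutral_band:
        return "positive", mean
    if mean < -neutral_band:
        return "negative", mean
    return "neutral", mean


if __name__ == "__main__":
    for review in [
        "Great screen but terrible battery",  # mixed opinions
        "I love it, great phone",             # clearly positive
        "The plot was predictable",           # no lexicon word present
        "not good",                           # negation is ignored
    ]:
        print(review, "->", classify_review(review)[0])
```

Even this toy version shows some of the failure modes listed in the abstract: a review with no lexicon words falls back to neutral (lexicon coverage), and "not good" is scored as positive because the rule ignores context.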

Supplementary Material

MP4 File (Neutrality.mp4)
This video contains the slides of the paper 'Identifying neutral reviews from unlabeled data: An exploratory study on user ratings and word-level polarity scores', presented at the ACM Conference on Hypertext and Social Media 2022.

Cited By

  • Understanding Linguistic Variations in Neutral and Strongly Opinionated Reviews. 2022. In 2022 21st IEEE International Conference on Machine Learning and Applications (ICMLA), 1512–1516. DOI: 10.1109/ICMLA55696.2022.00237. Online publication date: Dec 2022.

Published In

HT '22: Proceedings of the 33rd ACM Conference on Hypertext and Social Media
June 2022
272 pages
ISBN:9781450392334
DOI:10.1145/3511095
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. context sensitivity
  2. neutral review
  3. sentiment analysis
  4. sentiment lexicon
  5. unlabeled data

Qualifiers

  • Extended-abstract
  • Research
  • Refereed limited

Conference

HT '22
HT '22: 33rd ACM Conference on Hypertext and Social Media
June 28 - July 1, 2022
Barcelona, Spain

Acceptance Rates

Overall Acceptance Rate 378 of 1,158 submissions, 33%
