extended-abstract

Identifying neutral reviews from unlabeled data: An exploratory study on user ratings and word-level polarity scores

Author:

Salim Sazzed

HT '22: Proceedings of the 33rd ACM Conference on Hypertext and Social Media

Pages 198 - 202

https://doi.org/10.1145/3511095.3536367

Published: 28 June 2022

Abstract

Reviews containing mixed or contrasting opinions, also known as neutral reviews, are prevalent in user feedback data. By leveraging annotated data, supervised machine learning (ML) classifiers can learn implicit patterns to identify these neutral reviews. However, labeled data are rarely available in most circumstances. When annotated data are unavailable, unsupervised approaches such as lexicon-based methods, which combine word-level polarity scores with a set of rules, are employed. As a preliminary study toward developing a sophisticated unsupervised framework for recognizing neutral reviews, we scrutinize the performance of existing lexicon-based methods. When applied to four multi-domain review datasets, all of them perform poorly at identifying neutral reviews. We manually inspect the semantic attributes of a subset of neutral reviews misclassified by these lexicon-based methods. The experimental results and manual analysis reveal that determining neutrality with lexical rule-based methods is often ineffective for several reasons, such as user preferences for certain aspects, coverage of the sentiment lexicon, irregularity in the efficacy of aggregation rules, and the context-sensitive polarity of words. This analysis reveals traits of neutral reviews and limitations of existing approaches, and provides insights for developing methods that identify neutral reviews from unlabeled data.
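To make the idea of a lexicon-based method concrete, the following is a minimal sketch, not the paper's method or any of the evaluated tools. It assumes a tiny hypothetical lexicon (`TOY_LEXICON`), a simple mean-score aggregation rule, and an arbitrary neutral band of ±0.2; all names and values are illustrative assumptions.

```python
import re

# Hypothetical toy lexicon (an assumption for illustration only):
# maps a word to a polarity score in [-1, 1].
TOY_LEXICON = {
    "good": 0.6, "great": 0.8, "love": 0.9,
    "bad": -0.6, "terrible": -0.9, "slow": -0.4,
}


def classify_review(text, lexicon=TOY_LEXICON, neutral_band=0.2):
    """Label a review by averaging the polarity of its lexicon words.

    The aggregation rule (mean score) and the neutral band width are
    assumed values, not settings taken from the paper.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    scores = [lexicon[t] for t in tokens if t in lexicon]
    if not scores:
        # No lexicon word matched: fall back to neutral.
        return "neutral"
    mean_score = sum(scores) / len(scores)
    if mean_score > neutral_band:
        return "positive"
    if mean_score < -neutral_band:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for review in [
        "Great screen but terrible battery",
        "Good camera but the battery died in a week",
    ]:
        print(review, "->", classify_review(review))
```

In this sketch the first review is labeled neutral because the two opposing scores roughly cancel, while the second is labeled positive because "died" is missing from the toy lexicon. The second case illustrates, on a small scale, the lexicon coverage limitation the abstract mentions.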

Supplementary Material

MP4 File (Neutrality.mp4)

This video contains the slides of the paper 'Identifying neutral reviews from unlabeled data: An exploratory study on user ratings and word-level polarity scores', presented at the ACM Conference on Hypertext and Social Media 2022.


Cited By

Sazzed S. (2022). Understanding Linguistic Variations in Neutral and Strongly Opinionated Reviews. 2022 21st IEEE International Conference on Machine Learning and Applications (ICMLA), 1512-1516. Online publication date: Dec-2022.
https://doi.org/10.1109/ICMLA55696.2022.00237

Information & Contributors

Information

Published In

HT '22: Proceedings of the 33rd ACM Conference on Hypertext and Social Media

June 2022

272 pages

ISBN:9781450392334

DOI:10.1145/3511095

General Chairs: Alejandro Bellogín (Universidad Autonoma de Madrid, Spain), Ludovico Boratto (University of Cagliari, Italy)

Program Chair: Federica Cena (University of Torino, Italy)

Copyright © 2022 Owner/Author.

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Extended-abstract
Research
Refereed limited

Conference

HT '22

HT '22: 33rd ACM Conference on Hypertext and Social Media

June 28 - July 1, 2022

Barcelona, Spain

Acceptance Rates

Overall Acceptance Rate 378 of 1,158 submissions, 33%

Bibliometrics

Article Metrics

Total Citations: 1
Total Downloads: 73
Downloads (Last 12 months): 14
Downloads (Last 6 weeks): 0

Reflects downloads up to 29 Nov 2024
