DOI: 10.1145/3651781.3651819
research-article
Open access

Leveraging Machine Translation to Enhance Sentiment Analysis on Multilingual Text

Published: 30 May 2024

Abstract

This research investigates the complexity of sentiment analysis in the multilingual context of Malaysia, where several languages, including English and Malay, are often intermixed. Because existing sentiment analysis tools predominantly cater to single-language scenarios, they leave a gap in effective analysis for this linguistically diverse environment. To bridge this gap, this study experiments with a solution that uses Machine Translation (MT) to homogenize the language in code-mixed texts. Several sentiment analysis approaches were applied to demonstrate the feasibility of using MT to enhance sentiment analysis. When code-mixed text was accurately translated into equivalent text in a single language, a significant improvement in sentiment analysis was observed, indicating the contribution of this work to overcoming language barriers in multilingual sentiment analysis.
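The translate-then-analyze idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the word map `TOY_MS_TO_EN` is a hypothetical stand-in for a real MT system, `TOY_LEXICON` is a hypothetical stand-in for an English sentiment lexicon, and all scores are invented for illustration only.

```python
# Hypothetical stand-in for a machine translation system: a tiny
# Malay-to-English word map. A real pipeline would call an MT model here.
TOY_MS_TO_EN = {
    "makanan": "food",
    "sangat": "very",
    "sedap": "delicious",
    "teruk": "terrible",
}

# Hypothetical English-only sentiment lexicon (scores are illustrative).
TOY_LEXICON = {"delicious": 1.0, "good": 1.0, "terrible": -1.0}


def homogenize(text):
    """Map every known Malay token to English; leave other tokens as-is."""
    tokens = text.lower().split()
    return " ".join(TOY_MS_TO_EN.get(token, token) for token in tokens)


def lexicon_score(text):
    """Sum the lexicon scores of all tokens; unknown tokens contribute 0."""
    return sum(TOY_LEXICON.get(token, 0.0) for token in text.lower().split())


mixed = "the makanan here sangat sedap"

raw_score = lexicon_score(mixed)                      # lexicon misses the Malay words
translated_score = lexicon_score(homogenize(mixed))   # lexicon now sees "delicious"

print(homogenize(mixed))
print(raw_score, translated_score)
```

In this toy case the English-only lexicon finds no sentiment-bearing word in the code-mixed sentence, but it does once the Malay tokens are mapped to English, which is the effect the study attributes to homogenizing the language before analysis.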


Published In

ICSCA '24: Proceedings of the 2024 13th International Conference on Software and Computer Applications
February 2024
395 pages
ISBN: 9798400708329
DOI: 10.1145/3651781
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Code-mixed Text
  2. Lexicon
  3. Machine Translation
  4. Multilingual Sentiment Analysis
  5. Rojak language
  6. Sentiment Analysis

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ICSCA 2024
