
SentiVerb system: classification of social media text using sentiment analysis

Multimedia Tools and Applications

Abstract

This article proposes novel frameworks for the SentiVerb and Spell Checker systems, which extract the reactions, moods, and opinions of users from social media text (SMT). Users' opinions are extracted from the text they write on social media, such as comments, tweets, blogs, and feedback, and are classified as positive or negative based on the sentiment score of the SMT, using a dictionary-based approach and a binary classifier. The dictionary-based approach uses an opinion verb dictionary (OVD) to extract the sentiment of the opinion verbs present in the SMT. The OVD contains only opinion verbs along with their sentiment scores. The steps of the framework are discussed: lower-case conversion, tokenization, spell checking, part-of-speech tagging, stop word elimination, stemming, sentiment score calculation, and classification of the SMT. A new concept, the threshold negative parameter, is introduced for the first time in this article. In the experiments, the performance of the proposed SentiVerb system is evaluated on three datasets: Facebook comments on the implementation of the goods and services tax (GST) in India, tweets on the debate between former US president Mr. Barack Obama and Mr. John McCain, and movie reviews. The implementation of the proposed SentiVerb system using a rule-based classifier (RBC) gives the best performance in terms of accuracy, with 82.5% on the GST comments and 79.18% on the Obama-McCain debate, which is better than existing algorithms on datasets from the social-issues domain. The system's accuracy of 71.3% on the standard movie review dataset is also better than other reported results.
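To make the pipeline above concrete, the following is a minimal Python sketch of a dictionary-based classifier built on a toy opinion verb dictionary. It is not the authors' implementation. The dictionary entries, their scores, the stop word list, the suffix-stripping `stem` function, and the `cutoff` parameter are all assumptions made for illustration. Spell checking and part-of-speech tagging are omitted, so a word such as "support" is scored even when it is used as a noun. The article's threshold negative parameter is not defined in the abstract and is not modelled here.

```python
import re

def stem(token):
    """Crude suffix stripping; a stand-in for a real stemmer (assumption)."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            token = token[:-len(suffix)]
            break
    # Drop a trailing "e" so that "hate", "hates" and "hating" share one stem.
    if token.endswith("e") and len(token) > 3:
        token = token[:-1]
    return token

# Hypothetical opinion verb dictionary (OVD): verb -> sentiment score.
# Verbs and scores are invented for this sketch; keys are stored as stems.
_RAW_OVD = {"support": 1.0, "love": 1.0, "help": 0.5,
            "hate": -1.0, "oppose": -1.0, "destroy": -1.0}
OVD = {stem(verb): score for verb, score in _RAW_OVD.items()}

# Hypothetical stop word list, much shorter than a real one.
STOP_WORDS = {"the", "a", "an", "is", "are", "this", "that",
              "of", "to", "and", "i", "we", "it"}

def preprocess(text):
    """Lower-case, tokenize, remove stop words, then stem."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [stem(t) for t in tokens if t not in STOP_WORDS]

def sentiment_score(text):
    """Sum the OVD scores of the opinion verbs found in the text."""
    return sum(OVD.get(t, 0.0) for t in preprocess(text))

def classify(text, cutoff=0.0):
    """Simple rule: positive if the score exceeds the cutoff, else negative."""
    return "positive" if sentiment_score(text) > cutoff else "negative"

if __name__ == "__main__":
    for comment in ("We support the GST, it helps small business",
                    "This tax destroyed my shop and I hate it"):
        print(comment, "->", sentiment_score(comment), classify(comment))
```

Storing the dictionary keys as stems means each inflected form of a verb maps to the same entry, which is why stemming comes before the lookup in this sketch.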


Notes

  1. https://www.nltk.org/book/ch05.html

  2. Negation words are those words which reverse the polarity of the word if they come before the positive or negative opinion word within a sentence.

  3. https://developers.facebook.com/docs/reference/api/

  4. https://findmyfbid.com/

  5. https://json-csv.com/

  6. https://bitbucket.org/speriosu/updown/src/5de483437466/data/

  7. http://www.cs.cornell.edu/people/pabo/movie-review-data/
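
The behaviour of negation words described in note 2 can be sketched as follows. This is an illustrative Python example, not the article's algorithm. The negation word list and the scores are hypothetical, and the scope rule (a negation word flips only the next opinion word in the same sentence) is an assumption of this sketch. Contractions such as "don't" are not handled.

```python
import re

# Hypothetical negation words and opinion word scores (assumptions).
NEGATION_WORDS = {"not", "no", "never", "cannot"}
SCORES = {"support": 1.0, "like": 1.0, "hate": -1.0}

def score_with_negation(text):
    """Sum opinion word scores, reversing polarity after a negation word."""
    total = 0.0
    for sentence in re.split(r"[.!?]+", text.lower()):
        negated = False  # negation does not carry over to the next sentence
        for token in re.findall(r"[a-z]+", sentence):
            if token in NEGATION_WORDS:
                negated = True
            elif token in SCORES:
                value = SCORES[token]
                total += -value if negated else value
                negated = False  # assumed scope: one opinion word per negation
    return total

if __name__ == "__main__":
    print(score_with_negation("I do not support this tax."))
    print(score_with_negation("I never hate it. I like it."))
```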


Acknowledgements

The authors would like to thank Sant Longowal Institute of Engineering and Technology, Punjab, India, for providing the systems and infrastructure to support this research work. They also thank Dr. Sanjeev Prakash and Smt. Preetpal Kaur Buttar for proofreading.

Author information

Corresponding author

Correspondence to Shailendra Kumar Singh.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix 1

Table 13 Abbreviation words list of social media text

Appendix 2

Table 14 List of stop words for sentiment analysis

About this article

Cite this article

Singh, S.K., Sachan, M.K. SentiVerb system: classification of social media text using sentiment analysis. Multimed Tools Appl 78, 32109–32136 (2019). https://doi.org/10.1007/s11042-019-07995-2

  • DOI: https://doi.org/10.1007/s11042-019-07995-2
