DOI: 10.1145/3386723.3387840
research-article

Feature Selection Methods in Sentiment Analysis: A Review

Published: 18 May 2020

Abstract

The development of digital technologies now assists people by surfacing opinions, choices, preferences, and feelings. These opinions are useful to companies, which analyze them to understand their potential users and personalize offerings to their needs. However, the information must be extracted before further analysis is possible. Sentiment analysis is therefore used to extract opinions and related information and transform them into meaningful data. During analysis, a feature selection method is required to select a subset of relevant features from which to construct a predictive model. The selected feature subset must satisfy several conditions: it must be small and relevant for a high-dimensional dataset, it must account for the presence of noise, and it must contain no redundant features. However, some feature selection methods are unable to fulfill all of these conditions. In this research, 40 papers were collected, classified, and reviewed. We discuss feature selection methods in sentiment analysis by level of analysis and compare these methods to identify their limitations and advantages. The comparison is based on accuracy and CPU performance. Finally, we suggest the best (benchmark) method for feature selection. The findings show that hybrid methods achieve the best accuracy and CPU performance compared with the other methods.
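
As a rough illustration of what selecting a small, relevant feature subset can look like, the following is a minimal sketch of a filter-style selector that ranks terms by a chi-square score against a binary sentiment label. It is not taken from the reviewed papers: the toy reviews, the function names `chi_square_scores` and `select_top_k`, and the choice of chi-square as the ranking criterion are all assumptions made for illustration.

```python
from collections import Counter

def chi_square_scores(docs, labels):
    """Score each term with the chi-square statistic of its 2x2 term/class table.

    Assumes binary labels: 1 = positive review, 0 = negative review.
    """
    n = len(docs)
    n_pos = sum(labels)
    n_neg = n - n_pos
    df_pos, df_neg = Counter(), Counter()  # document frequency per class
    for text, label in zip(docs, labels):
        terms = set(text.lower().split())  # presence/absence, not counts
        (df_pos if label == 1 else df_neg).update(terms)

    scores = {}
    for term in set(df_pos) | set(df_neg):
        a = df_pos[term]   # positive docs containing the term
        b = df_neg[term]   # negative docs containing the term
        c = n_pos - a      # positive docs without the term
        d = n_neg - b      # negative docs without the term
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        scores[term] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores

def select_top_k(scores, k):
    """Keep the k highest-scoring terms; ties are broken alphabetically."""
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]

# Hypothetical toy corpus, for illustration only.
docs = [
    "great phone great battery",
    "terrible phone poor battery",
    "great screen",
    "poor screen terrible sound",
]
labels = [1, 0, 1, 0]

scores = chi_square_scores(docs, labels)
selected = select_top_k(scores, 3)
print(selected)
```

In this sketch, terms that occur equally often in both classes (such as "phone") score zero and are filtered out, while terms tied to one class rank highest. A filter of this kind scores each term independently, so it does not by itself remove redundant features.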


Cited By

  • (2022) Climate Change Sentiment Analysis Using Lexicon, Machine Learning and Hybrid Approaches. Sustainability, 14:8 (4723). https://doi.org/10.3390/su14084723. Online publication date: 14-Apr-2022
  • (2022) Feature selection for online streaming high-dimensional data. Applied Soft Computing, 127:C. https://doi.org/10.1016/j.asoc.2022.109355. Online publication date: 1-Sep-2022

    Published In

    NISS '20: Proceedings of the 3rd International Conference on Networking, Information Systems & Security
    March 2020
    528 pages
    ISBN:9781450376341
    DOI:10.1145/3386723

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Embedded
    2. Feature Selection
    3. Filter
    4. Hybrid
    5. Sentiment Analysis
    6. Wrapper

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    NISS2020
