Abstract
The number of digital books, or e-books, is increasing day by day. Book assortment is the task of assigning a category or a set of appropriate genres to a book. The goal of this research paper is to classify books into their related genres. Many existing approaches to text mining are available, such as Support Vector Machine (SVM) and Neural Text Categorizer (NTC). We applied existing machine learning algorithms to different datasets and used existing feature selection methods to select features. In our proposed dictionary-based approach, we classified books by their attributes, such as title, description, genre, and author, using text mining. In the learning part, we created a dictionary of keywords from each book's description and title and then assigned genres to the keywords. In the classification part, we attributed genres to a book. To classify the books, we extracted a dataset from web pages using web scraping. Our proposed approach outperforms traditional approaches in reducing training time when massive data is considered.
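The two parts described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the tokenizer, the stop-word list, the count-based voting rule, the function names (`keywords`, `build_dictionary`, `classify`), and the two toy books are all assumptions made for illustration only.

```python
import re
from collections import Counter, defaultdict

# Hypothetical stop-word list; the abstract does not specify one.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "on", "to", "for", "with", "is"}


def keywords(text):
    """Lowercase the text and return its alphabetic tokens minus stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]


def build_dictionary(books):
    """Learning part: map each keyword to counts of the genres it co-occurs with."""
    dictionary = defaultdict(Counter)
    for book in books:
        for word in keywords(book["title"] + " " + book["description"]):
            for genre in book["genres"]:
                dictionary[word][genre] += 1
    return dictionary


def classify(dictionary, title, description, top_n=1):
    """Classification part: sum genre counts over the book's known keywords."""
    votes = Counter()
    for word in keywords(title + " " + description):
        # Unknown keywords contribute nothing (assumed behaviour).
        votes.update(dictionary.get(word, {}))
    return [genre for genre, _ in votes.most_common(top_n)]


# Toy training data, invented for this example.
training_books = [
    {"title": "The Dragon Throne",
     "description": "A wizard and a dragon fight for the kingdom.",
     "genres": ["Fantasy"]},
    {"title": "Murder at Midnight",
     "description": "A detective investigates a murder in London.",
     "genres": ["Mystery"]},
]

genre_dictionary = build_dictionary(training_books)
predicted = classify(genre_dictionary, "The Last Wizard", "A young wizard meets a dragon.")
print(predicted)
```

Under these assumptions, building the dictionary takes a single pass over the training texts, which is consistent with the abstract's emphasis on short training time; how the paper actually weights keywords or resolves ties is not stated here.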
Copyright information
© 2021 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Thakur, V., Patel, A.C. (2021). An Improved Dictionary Based Genre Classification Based on Title and Abstract of E-book Using Machine Learning Algorithms. In: Singh, P.K., Wierzchoń, S.T., Tanwar, S., Ganzha, M., Rodrigues, J.J.P.C. (eds) Proceedings of Second International Conference on Computing, Communications, and Cyber-Security. Lecture Notes in Networks and Systems, vol 203. Springer, Singapore. https://doi.org/10.1007/978-981-16-0733-2_23
DOI: https://doi.org/10.1007/978-981-16-0733-2_23
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-0732-5
Online ISBN: 978-981-16-0733-2
eBook Packages: Engineering, Engineering (R0)