An Improved Dictionary Based Genre Classification Based on Title and Abstract of E-book Using Machine Learning Algorithms

  • Conference paper
Proceedings of Second International Conference on Computing, Communications, and Cyber-Security

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 203)

Abstract

The number of digital books, or e-books, is increasing day by day. Book assortment is the task of assigning a category or a set of appropriate genres to a book. The goal of this research paper is to classify books into related genres. Many existing approaches to text mining are available, such as Support Vector Machine (SVM) and Neural Text Categorizer (NTC). We applied existing machine learning algorithms to different datasets and used existing feature selection methods to select features. In our proposed dictionary-based approach, we classified books by their attributes, such as title, description, genre, and author, using text mining. In the learning part, we created a dictionary of keywords from each book’s description and title and then assigned genres to the keywords. In the classification part, we assigned genres to a book. To classify the books, we extracted a dataset from web pages using web scraping. Our proposed approach outperforms traditional approaches in reducing training time when massive data is considered.
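The two phases described in the abstract (building a keyword dictionary, then assigning genres) might look roughly like the sketch below. It is only an illustration under stated assumptions, not the authors' implementation: the stop-word list, the vote-counting rule, the function names (`tokenize`, `build_dictionary`, `classify`), and the toy training records are all hypothetical, since the abstract does not specify the preprocessing or the scoring scheme.

```python
import re
from collections import Counter, defaultdict

# Hypothetical stop-word list; the paper's actual preprocessing is not given.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "to", "is", "for", "on", "with"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]


def build_dictionary(books):
    """Learning part: map each keyword to counts of the genres it appeared with."""
    keyword_genres = defaultdict(Counter)
    for book in books:
        # Each distinct keyword in a book counts once per genre of that book.
        words = set(tokenize(book["title"] + " " + book["description"]))
        for word in words:
            for genre in book["genres"]:
                keyword_genres[word][genre] += 1
    return keyword_genres


def classify(title, description, keyword_genres, top_k=1):
    """Classification part: sum genre votes from known keywords (assumed rule)."""
    votes = Counter()
    for word in set(tokenize(title + " " + description)):
        # .get avoids creating empty entries for unseen keywords.
        votes.update(keyword_genres.get(word, {}))
    return [genre for genre, _ in votes.most_common(top_k)]


# Toy records standing in for the scraped dataset (invented for illustration).
training_books = [
    {
        "title": "The Dragon Throne",
        "description": "A wizard and a dragon fight for the kingdom.",
        "genres": ["Fantasy"],
    },
    {
        "title": "Murder at Midnight",
        "description": "A detective investigates a murder in London.",
        "genres": ["Mystery"],
    },
]

dictionary = build_dictionary(training_books)
predicted = classify("Dragon Kingdom", "A young wizard rides a dragon.", dictionary)
print(predicted)  # → ['Fantasy']
```

In this sketch, training is a single pass over the data that only increments counters, which is one plausible reason a dictionary-based method could train faster than iterative learners on large datasets.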

© 2021 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Thakur, V., Patel, A.C. (2021). An Improved Dictionary Based Genre Classification Based on Title and Abstract of E-book Using Machine Learning Algorithms. In: Singh, P.K., Wierzchoń, S.T., Tanwar, S., Ganzha, M., Rodrigues, J.J.P.C. (eds) Proceedings of Second International Conference on Computing, Communications, and Cyber-Security. Lecture Notes in Networks and Systems, vol 203. Springer, Singapore. https://doi.org/10.1007/978-981-16-0733-2_23

  • DOI: https://doi.org/10.1007/978-981-16-0733-2_23

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-16-0732-5

  • Online ISBN: 978-981-16-0733-2

  • eBook Packages: Engineering, Engineering (R0)
