Textual Clustering: Towards a More Efficient Descriptors of Texts

  • Conference paper
Advances in Computational Collective Intelligence (ICCCI 2020)

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 1287))

Abstract

In this paper, we show how itemsets can serve as discriminating descriptors in a textual clustering process. We implemented a platform named “IDETEX” that extracts itemsets from textual data and uses them in experiments with different types of clustering methods, such as K-Medoids, hierarchical clustering and the Self-Organizing Map (Kohonen). To some extent, the experiments performed reveal promising results with the different classifiers, whether “hierarchical”, “non-hierarchical” or “neural network”.
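The IDETEX platform itself is not shown on this page, so the following is only a minimal sketch of the general idea: mine frequent itemsets from texts, describe each text by the itemsets it contains, and cluster on those descriptors. It assumes whitespace tokenization, an Apriori-style level-wise search limited to pairs, Jaccard distance, and average-linkage agglomerative clustering. All function names and the toy corpus are hypothetical and are not taken from the paper.

```python
from itertools import combinations

def tokenize(text):
    # Assumption: a text is treated as a set of lowercase words longer than 2 characters.
    return frozenset(w for w in text.lower().split() if len(w) > 2)

def frequent_itemsets(transactions, min_support=2, max_size=2):
    """Apriori-style level-wise search; returns {itemset: support count}."""
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)
    size = 2
    while frequent and size <= max_size:
        # Candidates are built only from items that survived the previous level.
        items = sorted(set().union(*frequent))
        frequent = {}
        for cand in map(frozenset, combinations(items, size)):
            support = sum(1 for t in transactions if cand <= t)
            if support >= min_support:
                frequent[cand] = support
        result.update(frequent)
        size += 1
    return result

def describe(transactions, itemsets):
    # Each text is described by the set of frequent itemsets it contains.
    return [frozenset(s for s in itemsets if s <= t) for t in transactions]

def jaccard_distance(a, b):
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0

def agglomerative(descriptors, k):
    """Average-linkage hierarchical clustering, stopped at k clusters."""
    clusters = [[i] for i in range(len(descriptors))]

    def linkage(c1, c2):
        total = sum(jaccard_distance(descriptors[i], descriptors[j])
                    for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > k:
        a, b = min(combinations(range(len(clusters)), 2),
                   key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # safe because a < b
    return [sorted(c) for c in clusters]

# Hypothetical toy corpus, for illustration only.
docs = [
    "stock market prices fall",
    "stock market prices rise",
    "football team wins match",
    "football team loses match",
]
transactions = [tokenize(d) for d in docs]
itemsets = frequent_itemsets(transactions, min_support=2, max_size=2)
clusters = agglomerative(describe(transactions, itemsets), k=2)
print(clusters)
```

Swapping `agglomerative` for a K-Medoids or Self-Organizing Map routine would leave the descriptor step unchanged, which is the point of treating itemsets as a reusable description of each text.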

Corresponding author

Correspondence to Ismaïl Biskri.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Bokhabrine, A., Biskri, I., Ghazzali, N. (2020). Textual Clustering: Towards a More Efficient Descriptors of Texts. In: Hernes, M., Wojtkiewicz, K., Szczerbicki, E. (eds) Advances in Computational Collective Intelligence. ICCCI 2020. Communications in Computer and Information Science, vol 1287. Springer, Cham. https://doi.org/10.1007/978-3-030-63119-2_65

  • DOI: https://doi.org/10.1007/978-3-030-63119-2_65

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-63118-5

  • Online ISBN: 978-3-030-63119-2

  • eBook Packages: Computer Science, Computer Science (R0)
