Abstract
In this paper, we show how itemsets can serve as discriminating descriptors in a textual clustering process. We implemented a platform named “IDETEX” that extracts itemsets from textual data and uses them in experiments with different types of clustering methods, such as K-Medoids, hierarchical clustering, and Self-Organizing Maps (Kohonen). The experiments performed show, to some extent, promising results with each type of classifier: hierarchical, non-hierarchical, and neural network.
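To make the idea of itemsets as text descriptors concrete, the following is a minimal sketch and not the IDETEX implementation. It assumes whitespace tokenization, a support threshold counted in documents, a naive enumeration of word combinations (rather than an optimized miner such as Apriori or FP-growth), and a Jaccard distance between the resulting binary vectors. The function names, the sample documents, and the parameters `min_support` and `max_size` are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(docs, min_support=2, max_size=2):
    """Return word sets that occur together in at least min_support documents.

    Naive enumeration for illustration only (assumption): every combination
    of up to max_size distinct words per document is counted.
    """
    counts = Counter()
    for doc in docs:
        words = sorted(set(doc.lower().split()))
        for size in range(1, max_size + 1):
            for combo in combinations(words, size):
                counts[frozenset(combo)] += 1
    frequent = [s for s, c in counts.items() if c >= min_support]
    # Deterministic order: by itemset size, then alphabetically.
    return sorted(frequent, key=lambda s: (len(s), sorted(s)))


def describe(docs, itemsets):
    """Encode each document as a binary vector over the itemsets.

    A component is 1 when the document contains every word of the itemset.
    """
    vectors = []
    for doc in docs:
        words = set(doc.lower().split())
        vectors.append([1 if s <= words else 0 for s in itemsets])
    return vectors


def jaccard_distance(u, v):
    """Jaccard distance between two binary vectors (0.0 if both are all zero)."""
    both = sum(1 for a, b in zip(u, v) if a and b)
    either = sum(1 for a, b in zip(u, v) if a or b)
    return 1.0 - both / either if either else 0.0


# Hypothetical toy corpus with two obvious topics.
docs = [
    "data mining finds patterns",
    "mining patterns in data",
    "neural network training",
    "training a neural network",
]
itemsets = frequent_itemsets(docs, min_support=2, max_size=2)
vectors = describe(docs, itemsets)

# Pairwise distances like this one could feed a distance-based method
# such as K-Medoids or hierarchical clustering.
print(jaccard_distance(vectors[0], vectors[1]))
print(jaccard_distance(vectors[0], vectors[2]))
```

In this toy setting, documents on the same topic share all their frequent itemsets and documents on different topics share none, which is the kind of discriminating behaviour the abstract attributes to itemset descriptors.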
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Bokhabrine, A., Biskri, I., Ghazzali, N. (2020). Textual Clustering: Towards a More Efficient Descriptors of Texts. In: Hernes, M., Wojtkiewicz, K., Szczerbicki, E. (eds) Advances in Computational Collective Intelligence. ICCCI 2020. Communications in Computer and Information Science, vol 1287. Springer, Cham. https://doi.org/10.1007/978-3-030-63119-2_65
DOI: https://doi.org/10.1007/978-3-030-63119-2_65
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-63118-5
Online ISBN: 978-3-030-63119-2
eBook Packages: Computer Science (R0)