Topic Modeling Using Community Detection on a Word Association Graph

Mahfuzur Rahman Chowdhury, Intesur Ahmed, Farig Sadeque, Muhammad Yanhaona

Abstract

Topic modeling of a text corpus is one of the most well-studied areas of information retrieval and knowledge discovery. Despite several decades of research that have produced an array of modeling tools, some common problems still prevent automated topic modeling from matching users’ expectations. In particular, existing topic modeling solutions suffer when the distribution of words among the underlying topics is uneven or when the topics overlap. Furthermore, many solutions ask the user to provide a topic count estimate as input, which limits their usefulness in modeling a corpus where such information is unavailable. We propose a new topic modeling approach that overcomes these shortcomings by formulating topic modeling as a community detection problem on a word association graph (network) that we generate from the text corpus. Experimental evaluation using multiple data sets of three different types of text corpora shows that our approach is superior to prominent topic modeling alternatives in most cases. This paper describes our approach and discusses the experimental findings.
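
To make the general idea concrete, the following is a minimal sketch of topic discovery as community detection on a word graph. It is not the authors' algorithm: the abstract does not specify how word association is measured or which community detection method is used. The sketch assumes document-level co-occurrence counts as the association weight and a simple deterministic label propagation as the detector; the function names `build_word_graph` and `label_propagation` and the `min_weight` threshold are hypothetical. Note that no topic count is passed in; the number of communities falls out of the graph structure.

```python
from collections import defaultdict
from itertools import combinations


def build_word_graph(docs, min_weight=2):
    """Build a weighted word graph (assumed association measure:
    number of documents in which two words co-occur).
    Edges lighter than min_weight are dropped as noise."""
    weights = defaultdict(int)
    for doc in docs:
        words = sorted(set(doc.lower().split()))
        for u, v in combinations(words, 2):
            weights[(u, v)] += 1
    graph = defaultdict(dict)
    for (u, v), w in weights.items():
        if w >= min_weight:
            graph[u][v] = w
            graph[v][u] = w
    return graph


def label_propagation(graph, max_rounds=20):
    """Each word repeatedly adopts the label carrying the largest total
    edge weight among its neighbours (ties broken alphabetically).
    Words that end up sharing a label form one community (topic)."""
    labels = {node: node for node in graph}
    for _ in range(max_rounds):
        changed = False
        for node in sorted(graph):
            totals = defaultdict(int)
            for nbr, w in graph[node].items():
                totals[labels[nbr]] += w
            best = min(totals, key=lambda lab: (-totals[lab], lab))
            if best != labels[node]:
                labels[node] = best
                changed = True
        if not changed:
            break
    communities = defaultdict(set)
    for node, lab in labels.items():
        communities[lab].add(node)
    return sorted(sorted(c) for c in communities.values())


# Toy corpus (invented for illustration): two themes plus one stray
# cross-theme co-occurrence that the weight threshold filters out.
docs = [
    "goal striker match",
    "goal match referee",
    "striker goal referee",
    "vote senate election",
    "senate election ballot",
    "vote ballot election",
    "match election",
]
topics = label_propagation(build_word_graph(docs))
print(topics)
# [['ballot', 'election', 'senate', 'vote'], ['goal', 'match', 'referee', 'striker']]
```

On a real corpus one would typically add stop-word removal and a stronger association measure and detector; those choices are left open here because the abstract does not state them.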

Anthology ID: 2023.ranlp-1.98
Volume: Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing
Month: September
Year: 2023
Address: Varna, Bulgaria
Editors: Ruslan Mitkov, Galia Angelova
Venue: RANLP
Publisher: INCOMA Ltd., Shoumen, Bulgaria
Pages: 908–917
URL: https://aclanthology.org/2023.ranlp-1.98
Cite (ACL): Mahfuzur Rahman Chowdhury, Intesur Ahmed, Farig Sadeque, and Muhammad Yanhaona. 2023. Topic Modeling Using Community Detection on a Word Association Graph. In Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing, pages 908–917, Varna, Bulgaria. INCOMA Ltd., Shoumen, Bulgaria.
Cite (Informal): Topic Modeling Using Community Detection on a Word Association Graph (Chowdhury et al., RANLP 2023)
PDF: https://aclanthology.org/2023.ranlp-1.98.pdf
