An Efficient Algorithm for Topic Ranking and Modeling Topic Evolution

Kumar Shubhankar,
Aditya Pratap Singh &
Vikram Pudi

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6860)

Included in the following conference series:

International Conference on Database and Expert Systems Applications

Abstract

In this paper we introduce a novel and efficient approach to detecting and ranking topics in a large corpus of research papers. With the rapidly growing size of the academic literature, topic detection and topic ranking have become challenging tasks. We present an approach that uses closed frequent keyword-sets to form topics. We devise a modified, time-independent PageRank algorithm that assigns an authoritative score to each topic by considering the sub-graph in which the topic appears, producing a ranked list of topics. Using the citation network and introducing time invariance into the topic ranking algorithm yields interesting results. Our approach also provides a clustering technique for research papers that uses topics as the similarity measure. We extend our algorithms to study various aspects of topic evolution, which gives insight into trends in research areas over time. Our algorithms also detect hot topics and landmark topics over the years. We test our algorithms on the DBLP dataset and show that they are fast, effective and scalable.
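
The idea of forming topics from closed frequent keyword-sets can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not describe their mining procedure, so the brute-force enumeration, the function names `frequent_keyword_sets` and `closed_keyword_sets`, the `min_support` threshold and the toy corpus below are all assumptions made for illustration. A keyword-set is treated as frequent if it occurs in at least `min_support` papers, and as closed if no proper superset occurs in exactly the same number of papers.

```python
from itertools import combinations


def frequent_keyword_sets(papers, min_support):
    """Map each frequent keyword-set to the set of paper ids containing it.

    Brute-force enumeration for illustration only; it is an assumption,
    not the mining procedure used in the paper.
    """
    support = {}
    for pid, keywords in papers.items():
        kws = sorted(set(keywords))
        for size in range(1, len(kws) + 1):
            for combo in combinations(kws, size):
                support.setdefault(frozenset(combo), set()).add(pid)
    return {ks: pids for ks, pids in support.items() if len(pids) >= min_support}


def closed_keyword_sets(frequent):
    """Keep only keyword-sets with no proper superset of equal support."""
    closed = {}
    for ks, pids in frequent.items():
        has_equal_superset = any(
            ks < other and len(other_pids) == len(pids)
            for other, other_pids in frequent.items()
        )
        if not has_equal_superset:
            closed[ks] = pids
    return closed


# Hypothetical toy corpus: paper id -> keywords (not taken from DBLP).
papers = {
    "p1": ["frequent", "itemset", "mining"],
    "p2": ["frequent", "itemset", "clustering"],
    "p3": ["frequent", "itemset", "mining", "stream"],
    "p4": ["topic", "ranking"],
}

topics = closed_keyword_sets(frequent_keyword_sets(papers, min_support=2))
for keyword_set, paper_ids in sorted(topics.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(keyword_set), sorted(paper_ids))
```

In this sketch each closed keyword-set also carries the papers that contain it, which is one simple way to read the abstract's remark that topics can serve as a similarity measure for clustering papers. The exponential enumeration would not scale to a corpus the size of DBLP; a dedicated closed-itemset miner would be needed there.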

Author information

Authors and Affiliations

Center for Data Engineering, International Institute of Information Technology, Hyderabad, India
Kumar Shubhankar, Aditya Pratap Singh & Vikram Pudi

Editor information

Editors and Affiliations

Institut de Recherche en Informatique de Toulouse (IRIT), Paul Sabatier University, 118, route de Narbonne, 31062, Toulouse Cedex, France
Abdelkader Hameurlain
Brigham Young University, 784 TNRB, 84602, Provo, UT, USA
Stephen W. Liddle
Software Competence Center Hagenberg and Johannes Kepler University Linz, Softwarepark 21, 4232, Hagenberg, Austria
Klaus-Dieter Schewe
School of Information Technology and Electrical Engineering, University of Queensland, 4072, Brisbane, QLD, Australia
Xiaofang Zhou

About this paper

Cite this paper

Shubhankar, K., Singh, A.P., Pudi, V. (2011). An Efficient Algorithm for Topic Ranking and Modeling Topic Evolution. In: Hameurlain, A., Liddle, S.W., Schewe, KD., Zhou, X. (eds) Database and Expert Systems Applications. DEXA 2011. Lecture Notes in Computer Science, vol 6860. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23088-2_23

DOI: https://doi.org/10.1007/978-3-642-23088-2_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23087-5
Online ISBN: 978-3-642-23088-2
eBook Packages: Computer Science, Computer Science (R0)
