DOI: 10.1145/1645953.1646076
research-article

Detecting topic evolution in scientific literature: how can citations help?

Published: 02 November 2009

Abstract

Understanding how topics in scientific literature evolve is an interesting and important problem. Previous work simply models each paper as a bag of words and also considers the impact of authors. However, the impact of one document on another as captured by citations, an important inherent element of scientific literature, has not been considered. In this paper, we address the problem of understanding topic evolution by leveraging citations, and develop citation-aware approaches. We propose an iterative topic evolution learning framework by adapting the Latent Dirichlet Allocation model to the citation network, and develop a novel inheritance topic model. We evaluate the effectiveness and efficiency of our approaches and compare them with state-of-the-art approaches on a large collection of more than 650,000 research papers from the last 16 years and the citation network enabled by CiteSeerX. The results clearly show that citations can help to understand topic evolution better.

    Published In

    CIKM '09: Proceedings of the 18th ACM conference on Information and knowledge management
    November 2009
    2162 pages
    ISBN:9781605585123
    DOI:10.1145/1645953

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. citations
    2. inheritance topic model
    3. topic evolution

    Qualifiers

    • Research-article

    Conference

    CIKM '09

    Acceptance Rates

    Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
