DOI: 10.1145/2232817.2232832
research-article

BibRank: a language-based model for co-ranking entities in bibliographic networks

Published: 10 June 2012

Abstract

Bibliographic documents are associated with many entities, including authors, venues, and affiliations. While bibliographic search engines have mainly addressed ranking relevant documents for a query topic, ranking the other related bibliographic entities remains challenging. Indeed, document relevance is the primary level from which the relevance of the other entities can be inferred, regardless of the query topic. In this paper, we propose a novel integrated ranking model, called BibRank, that aims at ranking both document and author entities in bibliographic networks. The underlying algorithm propagates entity scores through the network by means of citation and authorship links. Moreover, we propose to weight these relationships using content-based indicators that estimate the topical relatedness between entities. In particular, we estimate the common similarity between homogeneous entities by analyzing marginal citations. We also compare document and author language models in order to evaluate the author's level of knowledge of the document topic and how well the document represents the author's knowledge. Experimental results on the representative CiteSeerX dataset show that the BibRank model significantly outperforms baseline ranking models.
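
The abstract describes the propagation only in words, so a small sketch may help make the idea concrete. The code below is not the BibRank formulation from the paper: it is a generic, PageRank-style co-ranking loop written under assumptions. The function name `co_rank`, the parameters `damping` and `mix`, the update rule, and the toy data are all hypothetical, and the link weights are plain numbers standing in for the content-based relatedness indicators.

```python
from collections import defaultdict


def _normalize(scores):
    """Rescale scores so that they sum to 1."""
    total = sum(scores.values())
    return {key: value / total for key, value in scores.items()}


def co_rank(docs, authors, citations, authorship,
            damping=0.85, mix=0.5, iterations=50):
    """Illustrative co-ranking of documents and authors (hypothetical sketch).

    citations:  {(citing_doc, cited_doc): weight}, weight > 0
    authorship: {(author, doc): weight}, weight > 0
    """
    # Total link weight per node, used to split a node's score across its links.
    cite_out = defaultdict(float)
    for (citing, _cited), w in citations.items():
        cite_out[citing] += w
    author_total = defaultdict(float)
    doc_total = defaultdict(float)
    for (author, doc), w in authorship.items():
        author_total[author] += w
        doc_total[doc] += w

    # Start from uniform scores.
    doc_score = {d: 1.0 / len(docs) for d in docs}
    author_score = {a: 1.0 / len(authors) for a in authors}

    for _ in range(iterations):
        # Score flowing into each document along citation links.
        via_citation = defaultdict(float)
        for (citing, cited), w in citations.items():
            via_citation[cited] += doc_score[citing] * w / cite_out[citing]
        # Score flowing between authors and documents along authorship links.
        via_author = defaultdict(float)
        to_author = defaultdict(float)
        for (author, doc), w in authorship.items():
            via_author[doc] += author_score[author] * w / author_total[author]
            to_author[author] += doc_score[doc] * w / doc_total[doc]

        new_doc = {
            d: (1 - damping) / len(docs)
            + damping * (mix * via_citation[d] + (1 - mix) * via_author[d])
            for d in docs
        }
        new_author = {
            a: (1 - damping) / len(authors) + damping * to_author[a]
            for a in authors
        }
        doc_score = _normalize(new_doc)
        author_score = _normalize(new_author)

    return doc_score, author_score


# Toy network (made-up data): d2 and d3 cite d1, and d3 also cites d2.
toy_docs = ["d1", "d2", "d3"]
toy_authors = ["alice", "bob"]
toy_citations = {("d2", "d1"): 1.0, ("d3", "d1"): 0.5, ("d3", "d2"): 0.5}
toy_authorship = {("alice", "d1"): 1.0, ("alice", "d2"): 0.5, ("bob", "d3"): 1.0}

doc_score, author_score = co_rank(toy_docs, toy_authors,
                                  toy_citations, toy_authorship)
print(sorted(doc_score, key=doc_score.get, reverse=True))
print(sorted(author_score, key=author_score.get, reverse=True))
```

In this sketch, `mix` controls how much of a document's score comes from citing documents versus its authors, and the weights decide how a node's score is shared among its links. Replacing the constant weights with topical similarity values would be one way to mimic the content-based weighting the abstract mentions.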

Cited By

  • (2019) Combining Parts of Speech, Term Proximity, and Query Expansion for Document Retrieval. 2019 IEEE 13th International Conference on Semantic Computing (ICSC), pages 150-153. DOI: 10.1109/ICOSC.2019.8665507. Online publication date: Jan-2019.
  • (2016) Exploiting heterogeneous scientific literature networks to combat ranking bias. Journal of the Association for Information Science and Technology, 67(7):1679-1702. DOI: 10.1002/asi.23463. Online publication date: 1-Jul-2016.
  • (2014) Community-based endogamy as an influence indicator. Proceedings of the 14th ACM/IEEE-CS Joint Conference on Digital Libraries, pages 67-76. DOI: 10.5555/2740769.2740782. Online publication date: 8-Sep-2014.
  • (2014) Community-based endogamy as an influence indicator. IEEE/ACM Joint Conference on Digital Libraries, pages 67-76. DOI: 10.1109/JCDL.2014.6970152. Online publication date: Sep-2014.

    Published In

    JCDL '12: Proceedings of the 12th ACM/IEEE-CS joint conference on Digital Libraries
    June 2012
    458 pages
    ISBN: 9781450311540
    DOI: 10.1145/2232817
    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tag

    1. bibliographic network

    Qualifiers

    • Research-article

    Conference

    JCDL '12
    Acceptance Rates

    Overall Acceptance Rate 415 of 1,482 submissions, 28%
