DOI: 10.1145/2611040.2611060

Term Impact-Based Web Page Ranking

Published: 02 June 2014

Abstract

Indexing Web pages by content is a crucial step in a modern search engine, and a variety of methods exist to support Web page ranking. In this paper, we describe a new approach for obtaining measures for Web page ranking. Unlike other recent approaches, it exploits meta-terms extracted from titles and URLs to index the contents of Web documents. Rather than term frequency or similar techniques, we use term impact to correlate each meta-term with the document's content. Our approach also uses the structural knowledge available in Wikipedia to improve query expansion and formulation. Evaluation with the automatic metrics provided by TREC indicates that our approach is effective both for building the index and for retrieval. We present retrieval results on the ClueWeb collection for a set of test queries on two tasks: an ad hoc retrieval task and a diversity task (which aims to retrieve relevant pages that cover different aspects of the queries).
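
As a rough illustration of the indexing idea described in the abstract, the following Python sketch pulls candidate meta-terms from a page title and URL and scores each one against the body text. The abstract does not define term impact, so the `impact` function here is a purely hypothetical stand-in (the fraction of body sentences that mention the term). The names `extract_meta_terms`, `impact`, `index_page`, and `STOPWORDS` are likewise assumptions made for this example and should not be read as the authors' method.

```python
import re
from urllib.parse import urlparse

# Hypothetical stopword list; a real system would use a much larger one.
STOPWORDS = {"a", "an", "and", "the", "of", "for", "in", "to",
             "www", "com", "org", "html", "index"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def extract_meta_terms(title, url):
    """Collect candidate meta-terms from the title and the URL host and path."""
    parsed = urlparse(url)
    tokens = tokenize(title) + tokenize(parsed.netloc + " " + parsed.path)
    terms = []
    for tok in tokens:
        # Keep first occurrences only, skipping stopwords and very short tokens.
        if tok not in STOPWORDS and len(tok) > 2 and tok not in terms:
            terms.append(tok)
    return terms


def impact(term, body):
    """Stand-in score (an assumption): share of body sentences containing the term."""
    sentences = [s for s in re.split(r"[.!?]+", body) if s.strip()]
    if not sentences:
        return 0.0
    hits = sum(1 for s in sentences if term in tokenize(s))
    return hits / len(sentences)


def index_page(title, url, body):
    """Map each meta-term of a page to its stand-in impact score."""
    return {term: impact(term, body) for term in extract_meta_terms(title, url)}
```

Calling `index_page("Jaguar Cars", "http://www.example.com/jaguar/history.html", body)` on some page text returns a dictionary keyed by the meta-terms found in the title and URL. A term that never occurs in the body scores zero under this stand-in, which is one simple way to discount misleading titles or URLs.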

    Published In

    WIMS '14: Proceedings of the 4th International Conference on Web Intelligence, Mining and Semantics (WIMS14)
    June 2014
    506 pages
    ISBN:9781450325387
    DOI:10.1145/2611040
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    In-Cooperation

    • Aristotle University of Thessaloniki

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Web retrieval
    2. Wikipedia anchors
    3. indexing
    4. query expansion
    5. searching
    6. term impact
    7. vector space model

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    WIMS '14

    Acceptance Rates

    WIMS '14 Paper Acceptance Rate 41 of 90 submissions, 46%;
    Overall Acceptance Rate 140 of 278 submissions, 50%
