DOI: 10.1145/2632188.2632210
research-article

Query performance prediction for microblog search: a preliminary study

Published: 11 July 2014

Abstract

Microblogging has become an integral part of daily life for millions of people around the world. Faced with a continuous flood of posts, microblogging services (e.g., Twitter) must effectively handle millions of user queries that aim to search for and follow recent developments in news or events. While predicting the quality of documents retrieved for a search query has been studied extensively in domains such as the Web and news, the different nature of the data and the search task in microblogs calls for revisiting the problem in that context. In this work, we re-examine several state-of-the-art query performance predictors for microblog ad-hoc search, using the two most commonly used tweet collections and three retrieval models used in microblog search. Our experiments show that a temporal predictor generally fits the prediction task best, indicating the importance of the temporal aspect in microblog search. The results also highlight the need either to redesign some existing predictors or to propose new ones that work effectively across the retrieval models used in this domain. Finally, combining multiple predictors yields considerable improvements in prediction quality over individual predictors, confirming results previously reported in the literature for other domains.
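
The abstract does not state how prediction quality was measured or how predictors were combined. As a rough illustration only, the sketch below assumes a setup that is common in query performance prediction work: a predictor is scored by the Pearson correlation between its per-query values and the per-query average precision of the retrieval run. The combination step here (an unweighted average of z-normalised predictor values) is a deliberately simple stand-in and not the authors' method. The function names and all numbers are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def zscores(xs):
    """Rescale values to zero mean and unit (population) standard deviation."""
    n = len(xs)
    mean = sum(xs) / n
    std = sqrt(sum((x - mean) ** 2 for x in xs) / n)
    return [(x - mean) / std for x in xs]


def combine(*predictors):
    """Per-query unweighted average of z-normalised predictor values."""
    normalised = [zscores(p) for p in predictors]
    return [sum(vals) / len(vals) for vals in zip(*normalised)]


# Made-up values for five queries (illustration only, not data from the paper).
average_precision = [0.10, 0.40, 0.35, 0.80, 0.60]
predictor_a = [1.0, 2.0, 4.0, 5.0, 3.0]        # hypothetical predictor
predictor_b = [0.2, 0.1, 0.3, 0.9, 0.7]        # hypothetical temporal predictor
combined = combine(predictor_a, predictor_b)

for name, values in [("a", predictor_a), ("b", predictor_b), ("combined", combined)]:
    print(name, round(pearson(values, average_precision), 3))
```

Z-normalising before averaging keeps a predictor with a large numeric range from dominating the combination; a learned weighting (for example, regression fitted on held-out queries) would be the natural next step.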

Cited By

  • (2016) Improving Tweet Timeline Generation by Predicting Optimal Retrieval Depth. Information Retrieval Technology, pages 135-146. DOI: 10.1007/978-3-319-28940-3_11. Online publication date: 22-Jan-2016

    Published In
    SoMeRA '14: Proceedings of the first international workshop on Social media retrieval and analysis
    July 2014
    72 pages
    ISBN:9781450330220
    DOI:10.1145/2632188
    Program Chairs: Markus Schedl, Peter Knees, Jialie Shen
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. microblog search
    2. query difficulty
    3. temporal retrieval

    Qualifiers

    • Research-article

    Conference

    SIGIR '14
    Acceptance Rates

    SoMeRA '14 Paper Acceptance Rate 13 of 19 submissions, 68%;
    Overall Acceptance Rate 13 of 19 submissions, 68%
