research-article

Query performance prediction for microblog search: a preliminary study

Authors:

Maram Hasanain,

Tamer Elsayed

SoMeRA '14: Proceedings of the first international workshop on Social media retrieval and analysis

Pages 1 - 6

https://doi.org/10.1145/2632188.2632210

Published: 11 July 2014

Abstract

Microblogging has become an integral part of the daily life of millions of people around the world. With a continuous flood of posts, microblogging services (e.g., Twitter) have to effectively handle millions of user queries that aim to search and follow recent developments in news or events. While predicting the quality of retrieved documents against search queries has been extensively studied in domains such as the Web and news, the different nature of the data and the search task in microblogs calls for revisiting the problem in that context. In this work, we re-examined several state-of-the-art query performance predictors in the domain of microblog ad-hoc search, using the two most commonly used tweet collections and three different retrieval models that are used in microblog search. Our experiments showed that a temporal predictor was generally the best fit for the prediction task in the context of microblog search, indicating the importance of the temporal aspect in this task. The results also highlighted the need to either redesign some of the existing predictors or propose new ones that function effectively with the different retrieval models used in our tested domain. Finally, our experiments on combining multiple predictors achieved considerable improvements in prediction quality over individual predictors, confirming results previously reported in the literature for other domains.

Cited By

M. Hasanain, T. Elsayed, and W. Magdy. Improving Tweet Timeline Generation by Predicting Optimal Retrieval Depth. In Information Retrieval Technology, pages 135--146, 2016. Online publication date: 22-Jan-2016.
https://doi.org/10.1007/978-3-319-28940-3_11

Index Terms

Query performance prediction for microblog search: a preliminary study
1. Information systems
  1. Information retrieval
    1. Retrieval models and ranking

Published In

SoMeRA '14: Proceedings of the first international workshop on Social media retrieval and analysis

July 2014

72 pages

ISBN:9781450330220

DOI:10.1145/2632188

Program Chairs:
Markus Schedl, Johannes Kepler University Linz, Austria
Peter Knees, Johannes Kepler University Linz, Austria
Jialie Shen, Singapore Management University, Singapore

Copyright © 2014 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGIR: ACM Special Interest Group on Information Retrieval

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

SIGIR '14

Sponsor:

SIGIR

SIGIR '14: The 37th International ACM SIGIR Conference on Research and Development in Information Retrieval

July 11, 2014

Gold Coast, Queensland, Australia

Acceptance Rates

SoMeRA '14 Paper Acceptance Rate 13 of 19 submissions, 68%;

Overall Acceptance Rate 13 of 19 submissions, 68%

Bibliometrics

Total Citations: 1
Total Downloads: 130
Downloads (Last 12 months): 2
Downloads (Last 6 weeks): 1

Reflects downloads up to 14 Nov 2024
