DOI: 10.1145/2232817.2232867
research-article

Exploiting real-time information retrieval in the microblogosphere

Published: 10 June 2012

Abstract

Information-seeking behavior in microblogging environments such as Twitter differs from traditional web search. The best-performing microblog retrieval techniques exploit both the semantic and the temporal aspects of documents. In this paper, we present an approach that combines query modeling, document modeling, and temporal re-ranking to find the most recent information that is relevant to a query. For query modeling, we introduce a two-stage pseudo-relevance feedback query expansion to overcome the severe vocabulary mismatch that affects the retrieval of short microblog messages. For document modeling, we propose two ways to expand a document with the help of its shortened URLs. For temporal re-ranking, we suggest several methods for evaluating the temporal aspects of documents. Experimental results show that our approach obtains significant improvements over baseline systems. Specifically, the proposed system yields further increases of 26.37% in P@30 and 9.94% in MAP over the best-performing result on highrel in the TREC'11 Real-Time Search Task.
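
The abstract reports its gains in P@30 and MAP. As a reading aid, the following is a minimal sketch of how these two metrics are conventionally computed from a ranked list and a set of relevance judgments. It is not the authors' evaluation code; the function names and the data layout (document IDs as strings, judgments as sets) are assumptions made for illustration, and each ranking is assumed to contain no duplicate IDs.

```python
from typing import Dict, Sequence, Set


def precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int = 30) -> float:
    """Fraction of the top-k positions filled by relevant documents.

    Divides by k even when fewer than k documents were retrieved,
    so missing positions count as non-relevant.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc_id in ranked[:k] if doc_id in relevant)
    return hits / k


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Sum of precision values at each relevant hit, divided by the
    total number of relevant documents for the query."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


def mean_average_precision(
    runs: Dict[str, Sequence[str]], qrels: Dict[str, Set[str]]
) -> float:
    """Mean of per-query average precision over all judged queries.

    `runs` maps a query ID to its ranked document IDs; `qrels` maps a
    query ID to its relevant document IDs (both hypothetical layouts).
    """
    if not qrels:
        return 0.0
    scores = [average_precision(runs.get(q, []), rel) for q, rel in qrels.items()]
    return sum(scores) / len(scores)
```

Under this convention, a query with no retrieved documents contributes an average precision of zero to MAP rather than being skipped.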

    Published In

    JCDL '12: Proceedings of the 12th ACM/IEEE-CS joint conference on Digital Libraries
    June 2012
    458 pages
    ISBN:9781450311540
    DOI:10.1145/2232817

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. language model
    2. query expansion
    3. real-time search
    4. temporal search

    Qualifiers

    • Research-article

    Conference

    JCDL '12
    Acceptance Rates

    Overall Acceptance Rate 415 of 1,482 submissions, 28%

    Cited By

    • (2023) Microblog Retrieval Based on Concept-Enhanced Pre-Training Model. ACM Transactions on Knowledge Discovery from Data 17:3 (1-32). DOI: 10.1145/3552311. Online publication date: 22-Feb-2023
    • (2021) Query Expansion With Local Conceptual Word Embeddings in Microblog Retrieval. IEEE Transactions on Knowledge and Data Engineering 33:4 (1737-1749). DOI: 10.1109/TKDE.2019.2945764. Online publication date: 1-Apr-2021
    • (2021) SAED: Edge-Based Intelligence for Privacy-Preserving Enterprise Search on the Cloud. 2021 IEEE/ACM 21st International Symposium on Cluster, Cloud and Internet Computing (CCGrid) (366-375). DOI: 10.1109/CCGrid51090.2021.00046. Online publication date: May-2021
    • (2018) SNS Retrieval Based on User Profile Estimation Using Transfer Learning from Web Search. 2018 IEEE/WIC/ACM International Conference on Web Intelligence (WI) (278-285). DOI: 10.1109/WI.2018.00-79. Online publication date: Dec-2018
    • (2018) Spatio-temporal query contextualization for microtext retrieval in social media. Concurrency and Computation: Practice and Experience 30:15. DOI: 10.1002/cpe.4458. Online publication date: 28-Feb-2018
    • (2017) Microblog Retrieval Using Ensemble of Feature Sets through Supervised Feature Selection. IEICE Transactions on Information and Systems E100.D:4 (793-806). DOI: 10.1587/transinf.2016DAP0032. Online publication date: 2017
    • (2017) Context-aware relevance feedback over SNS graph data. Proceedings of the International Conference on Web Intelligence (823-830). DOI: 10.1145/3106426.3106527. Online publication date: 23-Aug-2017
    • (2017) Query Expansion Based on a Feedback Concept Model for Microblog Retrieval. Proceedings of the 26th International Conference on World Wide Web (559-568). DOI: 10.1145/3038912.3052710. Online publication date: 3-Apr-2017
    • (2017) Effective pseudo-relevance for Microblog retrieval. Proceedings of the Australasian Computer Science Week Multiconference (1-6). DOI: 10.1145/3014812.3014865. Online publication date: 30-Jan-2017
    • (2017) Spatio-Temporal Contextualization of Queries for Microtexts in Social Media: Mathematical Modeling. Procedia Computer Science 113 (525-530). DOI: 10.1016/j.procs.2017.08.317. Online publication date: 2017
