Behavioral dynamics on the web: Learning, modeling, and prediction

Published: 05 August 2013

Abstract

The queries people issue to a search engine and the results clicked following a query change over time. For example, after the earthquake in Japan in March 2011, the query "japan" spiked in popularity, and people issuing the query were more likely to click government-related results than they were before the earthquake. We explore the modeling and prediction of such temporal patterns in Web search behavior. We develop a temporal modeling framework adapted from physics and signal processing and harness it to predict temporal patterns in search behavior using smoothing, trends, periodicities, and surprises. Using current and past behavioral data, we develop a learning procedure that can be used to construct models of users' Web search activities. We also develop a novel methodology that learns to select the best prediction model from a family of predictive models for a given query or a class of queries. Experimental results indicate that the predictive models significantly outperform baseline models that weight historical evidence the same for all queries. We present two applications where new methods introduced for the temporal modeling of user behavior significantly improve upon the state of the art. Finally, we discuss opportunities for using models of temporal dynamics to enhance other areas of Web search and information retrieval.
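To make the contrast in the abstract concrete, the following is a minimal sketch, not the authors' framework: it compares a baseline that weights all history equally with two generic exponential-smoothing forecasters (one with a trend term), and picks among them per query by held-out error. The function names, the smoothing parameters `alpha` and `beta`, and the `holdout` size are all assumptions made for illustration; periodicities and surprises are not modeled here.

```python
from statistics import mean


def mean_forecast(history):
    """Baseline: weight every past observation equally."""
    return mean(history)


def simple_smoothing_forecast(history, alpha=0.5):
    """Simple exponential smoothing: recent observations weigh more."""
    level = history[0]
    for y in history[1:]:
        level = alpha * y + (1 - alpha) * level
    return level


def holt_forecast(history, alpha=0.5, beta=0.5):
    """Holt's linear-trend smoothing: track a level and a trend.

    Needs at least two observations to initialize the trend.
    """
    level, trend = history[0], history[1] - history[0]
    for y in history[1:]:
        prev_level = level
        level = alpha * y + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return level + trend


# Hypothetical model family; the names are illustrative only.
MODELS = {
    "mean": mean_forecast,
    "smoothing": simple_smoothing_forecast,
    "holt": holt_forecast,
}


def select_model(series, holdout=3):
    """Pick the model with the lowest mean absolute one-step-ahead
    error over the last `holdout` points of `series`."""
    errors = {}
    start = len(series) - holdout
    for name, forecast in MODELS.items():
        errs = [abs(forecast(series[:t]) - series[t])
                for t in range(start, len(series))]
        errors[name] = mean(errs)
    best = min(errors, key=errors.get)
    return best, errors


# Daily counts for a hypothetical query whose popularity grows steadily.
rising = [10, 20, 30, 40, 50, 60, 70, 80]
best, errors = select_model(rising)
print(best, errors)
```

On a steadily rising series the trend-aware model wins, while the equal-weight baseline lags behind; a per-query selection step like `select_model` is one simple way to let different queries use different models.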

Published In

ACM Transactions on Information Systems, Volume 31, Issue 3
July 2013
202 pages
ISSN:1046-8188
EISSN:1558-2868
DOI:10.1145/2493175
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 05 August 2013
Accepted: 01 March 2013
Revised: 01 January 2013
Received: 01 May 2012
Published in TOIS Volume 31, Issue 3

Author Tags

  1. Behavioral analysis
  2. predictive behavioral models

Qualifiers

  • Research-article
  • Research
  • Refereed

Cited By

  • (2023) A Review Selection Method Based on Consumer Decision Phases in E-commerce. ACM Transactions on Information Systems 42:1 (1-27). DOI: 10.1145/3587265. Online publication date: 21-Aug-2023
  • (2022) Interest Points Analysis for Internet Forum Based on Long-Short Windows Similarity. Computers, Materials & Continua 72:2 (3247-3267). DOI: 10.32604/cmc.2022.026698. Online publication date: 2022
  • (2022) Fair ranking: a critical review, challenges, and future directions. Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency (1929-1942). DOI: 10.1145/3531146.3533238. Online publication date: 21-Jun-2022
  • (2022) Revisit Behavior in Social Media: The Phoenix-R Model and Discoveries. Machine Learning and Knowledge Discovery in Databases (386-401). DOI: 10.1007/978-3-662-44848-9_25. Online publication date: 10-Mar-2022
  • (2021) Content-Based Model of Web Search Behavior. Management Science 67:10 (6378-6398). DOI: 10.1287/mnsc.2020.3827. Online publication date: 1-Oct-2021
  • (2021) Challenges and research opportunities in eCommerce search and recommendations. ACM SIGIR Forum 54:1 (1-23). DOI: 10.1145/3451964.3451966. Online publication date: 19-Feb-2021
  • (2020) A Framework for Event-oriented Text Retrieval Based on Temporal Aspects. Proceedings of the 2020 12th International Conference on Machine Learning and Computing (39-46). DOI: 10.1145/3383972.3384051. Online publication date: 15-Feb-2020
  • (2019) Coagmento v3.0. Proceedings of the 2019 Conference on Human Information Interaction and Retrieval (367-371). DOI: 10.1145/3295750.3298917. Online publication date: 8-Mar-2019
  • (2019) Behavior Analysis for Electronic Commerce Trading Systems: A Survey. IEEE Access 7 (108703-108728). DOI: 10.1109/ACCESS.2019.2933247. Online publication date: 2019
  • (2019) European urban destinations’ attractors at the frontier between competitiveness and a unique destination image. A benchmark study of communication practices. Journal of Destination Marketing & Management 12 (37-45). DOI: 10.1016/j.jdmm.2019.02.006. Online publication date: Jun-2019
