Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/3485447.3511967acmconferencesArticle/Chapter ViewAbstractPublication PagesthewebconfConference Proceedingsconference-collections
research-article

ParClick: A Scalable Algorithm for EM-based Click Models

Published: 25 April 2022 Publication History

Abstract

Research on click models usually focuses on developing effective approaches to reduce biases in user clicks. However, one of the major drawbacks of existing click models is the lack of scalability. In this work, we tackle the scalability of Expectation-Maximization (EM)-based click models by introducing ParClick, a new parallel algorithm designed by following the Partitioning-Communication-Aggregation-Mapping (PCAM) method. To this end, we first provide a generic formulation of EM-based click models. Then, we design an efficient parallel version of this generic click model following the PCAM approach: we partition user click logs and model parameters into separate tasks, analyze communication among them, and aggregate these tasks to reduce communication overhead. Finally, we provide a scalable, parallel implementation of the proposed design, which maps well on a multi-core machine. Our experiments on the Yandex relevance prediction dataset show that ParClick scales well when increasing the amount of training data and computational resources. In particular, ParClick is 24.7 times faster to train with 40 million search sessions and 40 threads compared to the standard sequential version of the Click Chain Model (CCM) without any degradation in effectiveness.

References

[1]
Aman Agarwal, Ivan Zaitsev, Xuanhui Wang, Cheng Li, Marc Najork, and Thorsten Joachims. 2019. Estimating Position Bias without Intrusive Interventions. In Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining(WSDM ’19). Association for Computing Machinery, 474–482.
[2]
Muzaffer Can Altinigneli, Claudia Plant, and Christian Böhm. 2013. Massively Parallel Expectation Maximization Using Graphics Processing Units. In Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (Chicago, Illinois, USA) (KDD ’13). Association for Computing Machinery.
[3]
Alexey Borisov, Ilya Markov, Maarten de Rijke, and Pavel Serdyukov. 2016. A Neural Click Model for Web Search. In Proceedings of the 25th International Conference on World Wide Web(WWW ’16). International World Wide Web Conferences Steering Committee, 531–541.
[4]
Alexey Borisov, Martijn Wardenaar, Ilya Markov, and Maarten de Rijke. 2018. A Click Sequence Model for Web Search. In The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval(SIGIR ’18). Association for Computing Machinery, 45–54.
[5]
Olivier Cappé and Eric Moulines. 2009. On-line expectation–maximization algorithm for latent data models. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 71(2009), 593–613.
[6]
Olivier Chapelle and Ya Zhang. 2009. A Dynamic Bayesian Network Click Model for Web Search Ranking. In Proceedings of the 18th International Conference on World Wide Web(WWW ’09). Association for Computing Machinery, 1–10.
[7]
Danqi Chen, Weizhu Chen, Haixun Wang, Zheng Chen, and Qiang Yang. 2012. Beyond Ten Blue Links: Enabling User Click Modeling in Federated Web Search. In Proceedings of the Fifth ACM International Conference on Web Search and Data Mining(WSDM ’12). Association for Computing Machinery, 463–472.
[8]
Jia Chen, Jiaxin Mao, Yiqun Liu, Min Zhang, and Shaoping Ma. 2020. A Context-Aware Click Model for Web Search. In Proceedings of the 13th International Conference on Web Search and Data Mining(WSDM ’20). Association for Computing Machinery, New York, NY, USA, 88–96.
[9]
Aleksandr Chuklin, Ilya Markov, and Maarten de Rijke. 2015. Click Models for Web Search. Morgan & Claypool.
[10]
Nick Craswell, Onno Zoeter, Michael Taylor, and Bill Ramsey. 2008. An Experimental Comparison of Click Position-Bias Models. In Proceedings of the 2008 International Conference on Web Search and Data Mining(WSDM ’08). 87–94.
[11]
Xinyi Dai, Jianghao Lin, Weinan Zhang, Shuai Li, Weiwen Liu, Ruiming Tang, Xiuqiang He, Jianye Hao, Jun Wang, and Yong Yu. 2021. An Adversarial Imitation Click Model for Information Retrieval. In Proceedings of the Web Conference 2021(WWW ’21). Association for Computing Machinery, 1809–1820.
[12]
Georges E. Dupret and Benjamin Piwowarski. 2008. A User Browsing Model to Predict Search Engine Click Data from Past Observations. In Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval(SIGIR ’08). Association for Computing Machinery, 331–338.
[13]
Ian Foster. 1995. Designing and Building Parallel Programs: Concepts and Tools for Parallel Software Engineering. Addison-Wesley Longman Publishing Co., Inc.
[14]
A. Grama, V. Kumar, A. Gupta, and G. Karypis. 2003. Introduction to Parallel Computing. Addison-Wesley.
[15]
Artem Grotov, Aleksandr Chuklin, Ilya Markov, Luka Stout, Finde Xumara, and Maarten Rijke. 2015. A Comparative Study of Click Models for Web Search. In Proceedings of the 6th International Conference on Experimental IR Meets Multilinguality, Multimodality, and Interaction - Volume 9283(CLEF’15). Springer-Verlag, 78–90.
[16]
Fan Guo, Chao Liu, Anitha Kannan, Tom Minka, Michael Taylor, Yi-Min Wang, and Christos Faloutsos. 2009. Click Chain Model in Web Search. In Proceedings of the 18th International Conference on World Wide Web(WWW ’09). Association for Computing Machinery, 11–20.
[17]
Fan Guo, Chao Liu, and Yi Min Wang. 2009. Efficient Multiple-Click Models in Web Search. In Proceedings of the Second ACM International Conference on Web Search and Data Mining(WSDM ’09). Association for Computing Machinery, 124–131.
[18]
Thorsten Joachims, Adith Swaminathan, and Tobias Schnabel. 2017. Unbiased Learning-to-Rank with Biased Feedback(WSDM ’17). Association for Computing Machinery, 781–789.
[19]
Daphne Koller and Nir Friedman. 2009. Probabilistic Graphical Models: Principles and Techniques - Adaptive Computation and Machine Learning. The MIT Press.
[20]
Paul Lagrée, Claire Vernade, and Olivier Cappé. 2016. Multiple-Play Bandits in the Position-Based Model. In Proceedings of the 30th International Conference on Neural Information Processing Systems(NIPS’16). Curran Associates Inc., 1605–1613.
[21]
Jianghao Lin, Weiwen Liu, Xinyi Dai, Weinan Zhang, Shuai Li, Ruiming Tang, Xiuqiang He, Jianye Hao, and Yong Yu. 2021. A Graph-Enhanced Click Model for Web Search. In Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval(SIGIR ’21). Association for Computing Machinery, 1259–1268.
[22]
Chao Liu, Fan Guo, and Christos Faloutsos. 2009. BBM: Bayesian Browsing Model from Petabyte-scale Data. In Proceedings of KDD (Paris, France). 537–546.
[23]
Ilya Markov, Alexey Borisov, and Maarten de Rijke. 2017. Online Expectation-Maximization for Click Models. Association for Computing Machinery, 2195–2198.
[24]
Radford M. Neal and Geoffrey E. Hinton. 1999. A View of the EM Algorithm That Justifies Incremental, Sparse, and Other Variants. MIT Press, 355–368.
[25]
Filip Radlinski, Madhu Kurup, and Thorsten Joachims. 2008. How Does Clickthrough Data Reflect Retrieval Quality?. In Proceedings of the 17th ACM Conference on Information and Knowledge Management(CIKM ’08). Association for Computing Machinery, 43–52.
[26]
Pavel Serdyukov, Nick Craswell, and Georges Dupret. 2012. WSCD 2012: Workshop on Web Search Click Data 2012. In Proceedings of the Fifth ACM International Conference on Web Search and Data Mining(WSDM ’12). Association for Computing Machinery, New York, NY, USA, 771–772.
[27]
Si Shen, Botao Hu, Weizhu Chen, and Qiang Yang. 2012. Personalized Click Model through Collaborative Filtering. In Proceedings of the Fifth ACM International Conference on Web Search and Data Mining(WSDM ’12). Association for Computing Machinery, 323–332.
[28]
Ali Vardasbi, Maarten de Rijke, and Ilya Markov. 2020. Cascade Model-Based Propensity Estimation for Counterfactual Learning to Rank. Association for Computing Machinery, 2089–2092.
[29]
Hongning Wang, ChengXiang Zhai, Anlei Dong, and Yi Chang. 2013. Content-Aware Click Modeling. In Proceedings of the 22nd International Conference on World Wide Web(WWW ’13). Association for Computing Machinery, 1365–1376.
[30]
Xuanhui Wang, Nadav Golbandi, Michael Bendersky, Donald Metzler, and Marc Najork. 2018. Position Bias Estimation for Unbiased Learning to Rank in Personal Search. In WSDM. ACM, 610–618.
[31]
Hai-Tao Yu, Adam Jatowt, Roi Blanco, Joemon M. Jose, and Ke Zhou. 2019. A Rank-Biased Neural Network Model for Click Modeling. In Proceedings of the 2019 Conference on Human Information Interaction and Retrieval(CHIIR ’19). Association for Computing Machinery, 183–191.
[32]
Yuchen Zhang, Weizhu Chen, Dong Wang, and Qiang Yang. 2011. User-Click Modeling for Understanding and Predicting Search-Behavior. In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining(KDD ’11). Association for Computing Machinery, 1388–1396.

Cited By

View all
  • (2024)Validating Synthetic Usage Data in Living Lab EnvironmentsJournal of Data and Information Quality10.1145/362364016:1(1-33)Online publication date: 6-Mar-2024
  • (2024)Probabilistic graph model and neural network perspective of click models for web searchKnowledge and Information Systems10.1007/s10115-024-02145-z66:10(5829-5873)Online publication date: 6-Jun-2024

Index Terms

  1. ParClick: A Scalable Algorithm for EM-based Click Models
      Index terms have been assigned to the content through auto-classification.

      Recommendations

      Comments

      Please enable JavaScript to view thecomments powered by Disqus.

      Information & Contributors

      Information

      Published In

      cover image ACM Conferences
      WWW '22: Proceedings of the ACM Web Conference 2022
      April 2022
      3764 pages
      ISBN:9781450390965
      DOI:10.1145/3485447
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Sponsors

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 25 April 2022

      Permissions

      Request permissions for this article.

      Check for updates

      Author Tags

      1. Click model
      2. Parallel computing
      3. User Modeling
      4. Web Search

      Qualifiers

      • Research-article
      • Research
      • Refereed limited

      Conference

      WWW '22
      Sponsor:
      WWW '22: The ACM Web Conference 2022
      April 25 - 29, 2022
      Virtual Event, Lyon, France

      Acceptance Rates

      Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

      Contributors

      Other Metrics

      Bibliometrics & Citations

      Bibliometrics

      Article Metrics

      • Downloads (Last 12 months)31
      • Downloads (Last 6 weeks)5
      Reflects downloads up to 18 Nov 2024

      Other Metrics

      Citations

      Cited By

      View all
      • (2024)Validating Synthetic Usage Data in Living Lab EnvironmentsJournal of Data and Information Quality10.1145/362364016:1(1-33)Online publication date: 6-Mar-2024
      • (2024)Probabilistic graph model and neural network perspective of click models for web searchKnowledge and Information Systems10.1007/s10115-024-02145-z66:10(5829-5873)Online publication date: 6-Jun-2024

      View Options

      Login options

      View options

      PDF

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader

      HTML Format

      View this article in HTML Format.

      HTML Format

      Media

      Figures

      Other

      Tables

      Share

      Share

      Share this Publication link

      Share on social media