DOI: 10.1145/3459637.3481901
research-article

CausCF: Causal Collaborative Filtering for Recommendation Effect Estimation

Published: 30 October 2021

Abstract

To improve user experience and corporate profits, modern industrial recommender systems usually aim to select the items that users are most likely to interact with (e.g., through clicks and purchases). However, they overlook the fact that users may purchase these items even without recommendations. The truly effective items are those whose recommendation raises the purchase probability. To select these items, it is essential to estimate the causal effect of recommendations. Nevertheless, the real causal effect is difficult to obtain, since at any one time we can either recommend or not recommend an item to a user, never both. Furthermore, previous works usually rely on randomized controlled trial (RCT) experiments to evaluate their performance, which is usually impracticable in the recommendation scenario because of the high experimental cost. To tackle these problems, we propose a causal collaborative filtering (CausCF) method inspired by the widely adopted collaborative filtering (CF) technique. It is based on the idea that similar users not only have similar tastes in items but also have similar treatment effects under recommendations. CausCF extends classical matrix factorization to tensor factorization with three dimensions: user, item, and treatment. We also employ regression discontinuity design (RDD) to evaluate the precision of the causal effects estimated by different models. Under testable assumptions, RDD analysis can provide an unbiased causal conclusion without RCT experiments. Through both offline and online experiments, we demonstrate the effectiveness of the proposed CausCF in causal effect estimation and ranking performance improvement.
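The idea of factorizing over user, item, and treatment can be illustrated with a minimal sketch. It assumes a CP-style (trilinear) factorization with a sigmoid link, log loss, and plain SGD; the abstract does not specify the model form, loss, or optimizer, so this is not the authors' implementation. All function names (`init_factors`, `predict`, `sgd_step`, `uplift`) and hyperparameters are hypothetical.

```python
import numpy as np


def init_factors(n_users, n_items, n_treatments=2, dim=8, seed=0):
    """Create small random latent factors for users, items, and treatments.

    Treatment index 0 means "not recommended", 1 means "recommended"
    (an assumption made for this sketch).
    """
    rng = np.random.default_rng(seed)
    U = rng.normal(0.0, 0.1, (n_users, dim))
    V = rng.normal(0.0, 0.1, (n_items, dim))
    T = rng.normal(0.0, 0.1, (n_treatments, dim))
    return U, V, T


def predict(U, V, T, u, i, t):
    """Purchase probability for (user u, item i, treatment t).

    The logit is the trilinear product sum_k U[u,k] * V[i,k] * T[t,k].
    """
    logit = np.sum(U[u] * V[i] * T[t], axis=-1)
    return 1.0 / (1.0 + np.exp(-logit))


def sgd_step(U, V, T, u, i, t, y, lr=0.05, reg=1e-4):
    """One in-place SGD update on the log loss for a single observation."""
    err = predict(U, V, T, u, i, t) - y  # d(log loss) / d(logit)
    # Compute all gradients before touching any factor.
    grad_u = err * V[i] * T[t] + reg * U[u]
    grad_v = err * U[u] * T[t] + reg * V[i]
    grad_t = err * U[u] * V[i] + reg * T[t]
    U[u] -= lr * grad_u
    V[i] -= lr * grad_v
    T[t] -= lr * grad_t


def uplift(U, V, T, u, i):
    """Estimated effect of recommending item i to user u."""
    return predict(U, V, T, u, i, 1) - predict(U, V, T, u, i, 0)


# Toy usage on a hypothetical interaction log of (user, item, treatment, purchased).
U, V, T = init_factors(n_users=3, n_items=4)
log = [(0, 1, 1, 1.0), (0, 1, 0, 0.0), (2, 3, 1, 0.0), (1, 0, 0, 1.0)]
for _ in range(200):
    for u, i, t, y in log:
        sgd_step(U, V, T, u, i, t, y)
estimated_effect = uplift(U, V, T, 0, 1)
```

Ranking candidate items by `uplift` rather than by `predict(..., t=1)` is what distinguishes selecting items with a high treatment effect from selecting items that would likely be purchased anyway.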

Supplementary Material

MP4 File (CIKM21-fp1326.mp4)
Presentation video

Information & Contributors

Information

Published In

CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management
October 2021
4966 pages
ISBN:9781450384469
DOI:10.1145/3459637
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. causal collaborative filtering
  2. recommender system
  3. regression discontinuity design

Qualifiers

  • Research-article

Conference

CIKM '21

Acceptance Rates

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 69
  • Downloads (Last 6 weeks): 8
Reflects downloads up to 23 Sep 2024

Citations

Cited By

  • (2024) Causal Inference in Recommender Systems: A Survey and Future Directions. ACM Transactions on Information Systems 42:4 (1-32). DOI: 10.1145/3639048. Online publication date: 9-Feb-2024
  • (2024) Revisiting Reciprocal Recommender Systems: Metrics, Formulation, and Method. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (3714-3723). DOI: 10.1145/3637528.3671734. Online publication date: 25-Aug-2024
  • (2024) Invariant Graph Contrastive Learning for Mitigating Neighborhood Bias in Graph Neural Network Based Recommender Systems. Artificial Neural Networks and Machine Learning – ICANN 2024 (143-158). DOI: 10.1007/978-3-031-72344-5_10. Online publication date: 17-Sep-2024
  • (2023) Causal Collaborative Filtering. Proceedings of the 2023 ACM SIGIR International Conference on Theory of Information Retrieval (235-245). DOI: 10.1145/3578337.3605122. Online publication date: 9-Aug-2023
  • (2022) Disentangling Interest and Causality for Recommendation Effectiveness. 2022 19th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP) (1-6). DOI: 10.1109/ICCWAMTIP56608.2022.10016563. Online publication date: 16-Dec-2022
