DOI: 10.1145/3637528.3672352
research-article

STATE: A Robust ATE Estimator of Heavy-Tailed Metrics for Variance Reduction in Online Controlled Experiments

Published: 24 August 2024

Abstract

Online controlled experiments play a crucial role in enabling data-driven decisions across a wide range of companies. Variance reduction is an effective technique to improve the sensitivity of experiments, achieving higher statistical power while using fewer samples and shorter experimental periods. However, typical variance reduction methods (e.g., regression-adjusted estimators) rest on the intuitive assumption of Gaussian distributions and cannot properly characterize real business metrics with heavy-tailed distributions. Furthermore, outliers diminish the correlation between pre-experiment covariates and outcome metrics, greatly limiting the effectiveness of variance reduction.
In this paper, we develop a novel framework that integrates the Student's t-distribution with machine learning tools to fit heavy-tailed metrics and construct a robust average treatment effect estimator in online controlled experiments, which we call STATE. By adopting a variational EM method to optimize the log-likelihood function, we can infer a robust solution that greatly reduces the negative impact of outliers and achieves significant variance reduction. Moreover, we extend the STATE method from count metrics to ratio metrics, whose variance reduction is more complex and less investigated in existing work, by utilizing a linear transformation that preserves unbiased estimation. Finally, both simulations on synthetic data and long-term empirical results on the Meituan experiment platform demonstrate the effectiveness of our method. Compared with the state-of-the-art estimators (CUPAC/MLRATE), STATE achieves over 50% variance reduction, indicating it can reach the same statistical power with only half of the observations, or half the experimental duration.
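
To give a feel for why a Student's t likelihood tames outliers, the following is a minimal sketch of the classical EM algorithm for a linear model with Student's t errors. It is not the STATE estimator: the paper uses a variational EM method combined with machine learning models, whereas this sketch assumes a single covariate, a fixed degrees-of-freedom value `nu`, and synthetic data. The function name `fit_t_regression` and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def fit_t_regression(x, y, nu=3.0, n_iter=100, tol=1e-10):
    """Fit y = a + b*x + e with e ~ Student-t(nu) by EM (nu held fixed).

    Illustrative assumption: one covariate and a fixed nu. Returns the
    coefficients [a, b], the squared scale, and the final E-step weights.
    """
    X = np.column_stack([np.ones_like(x), x])
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # start from ordinary least squares
    resid = y - X @ beta
    sigma2 = resid.var()
    for _ in range(n_iter):
        # E-step: expected latent precision per observation.
        # Large residuals receive small weights.
        w = (nu + 1.0) / (nu + resid**2 / sigma2)
        # M-step: weighted least squares, then update the squared scale.
        Xw = X * w[:, None]
        beta_new = np.linalg.solve(X.T @ Xw, Xw.T @ y)
        resid = y - X @ beta_new
        sigma2 = np.mean(w * resid**2)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta, sigma2, w


# Synthetic data (assumed for illustration): true intercept 1, slope 2,
# with 2% of observations shifted upward by a large amount.
rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)
y[:40] += 50.0

ols = np.linalg.lstsq(np.column_stack([np.ones(n), x]), y, rcond=None)[0]
beta_t, sigma2_t, weights = fit_t_regression(x, y)
print("OLS coefficients      :", ols)
print("Student-t coefficients:", beta_t)
print("mean weight, outliers :", weights[:40].mean())
print("mean weight, inliers  :", weights[40:].mean())
```

In this sketch the E-step weights shrink toward zero for the shifted observations, so the weighted least-squares step largely ignores them, while ordinary least squares absorbs the shift into its intercept. Because a Student's t fit targets the center of the bulk rather than the mean, a skewed metric would need additional care before such a fit could serve as an unbiased treatment-effect estimate.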

Supplemental Material

MP4 File - STATE: A Robust ATE Estimator of Heavy-Tailed Metrics for Variance Reduction in Online Controlled Experiments
Online controlled experiments, also known as A/B tests, are the most widely adopted method for measuring causal effects and play a crucial role in enabling data-driven decisions across various companies. Our work introduces a new robust estimator for the Average Treatment Effect (ATE) to enhance the sensitivity of A/B tests, particularly in the presence of heavy-tailed distributions.

Information & Contributors

Information

Published In

KDD '24: Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
August 2024
6901 pages
ISBN: 9798400704901
DOI: 10.1145/3637528
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. causal inference
  2. controlled experiments
  3. heavy-tailed
  4. robust estimation
  5. variance reduction

Qualifiers

  • Research-article

Funding Sources

  • the NSF of China
  • the Xiaomi Foundation
  • National Key R&D Program of China

Conference

KDD '24
Acceptance Rates

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%

Article Metrics

  • Total Citations: 0
  • Total Downloads: 139
  • Downloads (Last 12 months): 139
  • Downloads (Last 6 weeks): 8

Reflects downloads up to 08 Feb 2025
