Research article | Open access
DOI: 10.1145/2939672.2939733

Improving the Sensitivity of Online Controlled Experiments: Case Studies at Netflix

Published: 13 August 2016

Abstract

Controlled experiments are widely regarded as the most scientific way to establish a true causal relationship between product changes and their impact on business metrics. Many technology companies rely on such experiments as their main data-driven decision-making tool. The sensitivity of a controlled experiment refers to its ability to detect differences in business metrics due to product changes. At Netflix, with tens of millions of users, increasing the sensitivity of controlled experiments is critical as failure to detect a small effect, either positive or negative, can have a substantial revenue impact. This paper focuses on methods to increase sensitivity by reducing the sampling variance of business metrics. We define Netflix business metrics and share context around the critical need for improved sensitivity. We review popular variance reduction techniques that are broadly applicable to any type of controlled experiment and metric. We describe an innovative implementation of stratified sampling at Netflix where users are assigned to experiments in real time and discuss some surprising challenges with the implementation. We conduct case studies to compare these variance reduction techniques on a few Netflix datasets. Based on the empirical results, we recommend using post-assignment variance reduction techniques such as post stratification and CUPED instead of at-assignment variance reduction techniques such as stratified sampling in large-scale controlled experiments.
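As a concrete illustration of the post-assignment approach recommended above, the Python sketch below shows the core of a CUPED-style adjustment (Deng et al., WSDM 2013) on simulated data: the pre-experiment value of the metric serves as a covariate that absorbs between-user variance before the treatment effect is estimated. The data-generating process, variable names, and effect size here are illustrative assumptions for this sketch, not numbers or code from the paper.

import numpy as np

rng = np.random.default_rng(seed=42)
n = 100_000

# Simulated pre-experiment metric for each user (e.g., prior streaming hours).
pre = rng.gamma(shape=2.0, scale=5.0, size=n)

# Random 50/50 assignment and a simulated in-experiment metric that is strongly
# correlated with the pre-experiment metric, plus a small treatment effect.
treatment = rng.integers(0, 2, size=n).astype(bool)
post = 0.8 * pre + rng.normal(0.0, 4.0, size=n) + 0.1 * treatment

# CUPED adjustment: Y_adj = Y - theta * (X - mean(X)), with
# theta = cov(Y, X) / var(X), the choice that minimises the variance of Y_adj.
theta = np.cov(post, pre, ddof=1)[0, 1] / np.var(pre, ddof=1)
post_adj = post - theta * (pre - pre.mean())

def estimate(y, treatment):
    # Difference in means between treatment and control, with its standard error.
    t, c = y[treatment], y[~treatment]
    se = np.sqrt(t.var(ddof=1) / t.size + c.var(ddof=1) / c.size)
    return t.mean() - c.mean(), se

for label, y in (("raw metric", post), ("CUPED-adjusted", post_adj)):
    delta, se = estimate(y, treatment)
    print(f"{label:15s} delta = {delta:+.3f}  std. error = {se:.4f}")

Because the pre-experiment covariate explains much of the variance in the in-experiment metric, the CUPED-adjusted standard error is substantially smaller than the raw one, which is precisely the sensitivity gain the abstract describes: the same sample size can detect a smaller true effect. Post stratification pursues the same goal by forming strata after assignment and reweighting stratum-level estimates, rather than regressing on a pre-experiment covariate.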





Information

Published In

KDD '16: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
August 2016
2176 pages
ISBN: 9781450342322
DOI: 10.1145/2939672
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.


Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 13 August 2016


Author Tags

  1. a/b testing
  2. controlled experiment
  3. randomized experiment
  4. sensitivity
  5. variance reduction

Qualifiers

  • Research-article

Conference

KDD '16

Acceptance Rates

KDD '16 Paper Acceptance Rate 66 of 1,115 submissions, 6%;
Overall Acceptance Rate 1,133 of 8,635 submissions, 13%


Article Metrics

  • Downloads (Last 12 months): 758
  • Downloads (Last 6 weeks): 60
Reflects downloads up to 21 Nov 2024


Cited By

  • (2024) Country-diverted experiments for mitigation of network effects. Proceedings of the 18th ACM Conference on Recommender Systems, 765-767. DOI: 10.1145/3640457.3688046. Online publication date: 8-Oct-2024.
  • (2024) Learning the Covariance of Treatment Effects Across Many Weak Experiments. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 153-162. DOI: 10.1145/3637528.3672034. Online publication date: 25-Aug-2024.
  • (2024) Learning Metrics that Maximise Power for Accelerated A/B-Tests. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 5183-5193. DOI: 10.1145/3637528.3671512. Online publication date: 25-Aug-2024.
  • (2024) Covariate Ordered Systematic Sampling as an Improvement to Randomized Controlled Trials. Proceedings of the 33rd ACM International Conference on Information and Knowledge Management, 3812-3816. DOI: 10.1145/3627673.3679892. Online publication date: 21-Oct-2024.
  • (2024) Practical Batch Bayesian Sampling Algorithms for Online Adaptive Traffic Experimentation. Companion Proceedings of the ACM Web Conference 2024, 471-480. DOI: 10.1145/3589335.3648347. Online publication date: 13-May-2024.
  • (2023) Clustering-Based Imputation for Dropout Buyers in Large-Scale Online Experimentation. The New England Journal of Statistics in Data Science, 415-425. DOI: 10.51387/23-NEJSDS33. Online publication date: 24-May-2023.
  • (2023) The Price is Right: Removing A/B Test Bias in a Marketplace of Expirable Goods. Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, 4681-4687. DOI: 10.1145/3583780.3615502. Online publication date: 21-Oct-2023.
  • (2023) All about Sample-Size Calculations for A/B Testing: Novel Extensions & Practical Guide. Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, 3574-3583. DOI: 10.1145/3583780.3614779. Online publication date: 21-Oct-2023.
  • (2023) Variance Reduction Using In-Experiment Data: Efficient and Targeted Online Measurement for Sparse and Delayed Outcomes. Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 3937-3946. DOI: 10.1145/3580305.3599928. Online publication date: 6-Aug-2023.
  • (2023) Examining User Heterogeneity in Digital Experiments. ACM Transactions on Information Systems, 41(4), 1-34. DOI: 10.1145/3578931. Online publication date: 22-Mar-2023.
