A Non-parametric Bayesian Approach for Uplift Discretization and Feature Selection

Mina Rafla¹³,¹⁴,
Nicolas Voisine¹³,
Bruno Crémilleux¹⁴ &
Marc Boullé¹³

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13717)

Included in the following conference series:

Joint European Conference on Machine Learning and Knowledge Discovery in Databases

Abstract

Uplift modeling aims to estimate the incremental impact of a treatment, such as a marketing campaign or a drug, on an individual’s outcome. Banking and telecom uplift datasets often have hundreds to thousands of features. In such settings, detecting irrelevant features is an essential step to reduce computation time and improve model performance. We present a parameter-free feature selection method for uplift modeling founded on a Bayesian approach. We first design an automatic feature discretization method for uplift based on a space of discretization models and a prior distribution. From this model space, we define a Bayes optimal evaluation criterion for an uplift discretization model. We then propose an optimization algorithm that finds a near-optimal discretization for estimating uplift in $O(n \log n)$ time. Experiments demonstrate the high performance of this new discretization method. Building on it, we describe a parameter-free feature selection method for uplift. Experiments show that the new method both removes irrelevant features and outperforms state-of-the-art methods.
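
To make the notion of an uplift discretization concrete, the following minimal sketch estimates the uplift within each interval of a single numeric feature on synthetic data. It uses fixed equal-frequency bins and plain response-rate differences; it is not the Bayesian criterion or the optimization algorithm of the paper. The function name `uplift_per_bin`, the bin count, and the synthetic data are assumptions made purely for illustration.

```python
import numpy as np

def uplift_per_bin(x, treatment, outcome, n_bins=4):
    """Estimate uplift in each equal-frequency interval of a numeric feature.

    Uplift in an interval is P(Y=1 | T=1) - P(Y=1 | T=0) among the rows
    falling in it. Returns (edges, uplifts); an uplift is NaN where an
    interval has no treated rows or no control rows.
    """
    x = np.asarray(x, dtype=float)
    treatment = np.asarray(treatment, dtype=bool)
    outcome = np.asarray(outcome, dtype=float)

    # Quantiles of x give equal-frequency interval boundaries.
    edges = np.quantile(x, np.linspace(0, 1, n_bins + 1))
    # Map each row to an interval index in [0, n_bins - 1].
    bins = np.clip(np.searchsorted(edges, x, side="right") - 1, 0, n_bins - 1)

    uplifts = np.full(n_bins, np.nan)
    for b in range(n_bins):
        in_bin = bins == b
        treated = in_bin & treatment
        control = in_bin & ~treatment
        if treated.any() and control.any():
            # Difference of observed response rates between the two groups.
            uplifts[b] = outcome[treated].mean() - outcome[control].mean()
    return edges, uplifts


# Hypothetical randomized data: the treatment only helps when x > 0.5.
rng = np.random.default_rng(0)
n = 20_000
x = rng.uniform(0, 1, n)
t = rng.integers(0, 2, n).astype(bool)
p = 0.2 + 0.3 * (t & (x > 0.5))   # response probability per individual
y = rng.random(n) < p

edges, uplifts = uplift_per_bin(x, t, y, n_bins=4)
```

By construction, the two lower intervals should show an uplift close to 0 and the two upper intervals an uplift close to 0.3. A feature whose intervals all show the same uplift carries no information about the treatment effect, which is the intuition behind using discretization for feature selection.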

Notes

1. The terms “treatment effect” and “uplift” refer to the same notion. CATE is an estimate of uplift, and we use “CATE” to denote the estimated uplift values.
2. Our implementation is provided at https://github.com/MinaWagdi/UMODL.
3. Other patterns can be found at the GitHub link provided previously.
4. https://doi.org/10.5281/zenodo.3653141.
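
The relation between uplift and CATE mentioned in Note 1 can be written in standard potential-outcomes notation. The symbols are assumptions introduced here, not taken from the text above: $Y(1)$ and $Y(0)$ denote an individual's outcome with and without treatment, $T$ the treatment indicator, and $X$ the feature vector.

$$
\tau(x) = \mathbb{E}\left[\, Y(1) - Y(0) \mid X = x \,\right]
$$

When treatment is assigned at random, so that $T$ is independent of $(Y(0), Y(1))$ given $X$, this quantity can be estimated from observed data as a difference of conditional means:

$$
\tau(x) = \mathbb{E}\left[\, Y \mid X = x,\ T = 1 \,\right] - \mathbb{E}\left[\, Y \mid X = x,\ T = 0 \,\right]
$$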

Author information

Authors and Affiliations

Orange Labs, 22300, Lannion, France
Mina Rafla, Nicolas Voisine & Marc Boullé
UNICAEN, ENSICAEN, CNRS - UMR GREYC, Normandie Univ, 14000, Caen, France
Mina Rafla & Bruno Crémilleux

Corresponding author

Correspondence to Mina Rafla.

Editor information

Editors and Affiliations

Grenoble Alpes University, Saint Martin d’Hères, France
Massih-Reza Amini
INSA Rouen Normandy, Saint Etienne du Rouvray, France
Stéphane Canu
Ruhr-Universität Bochum, Bochum, Germany
Asja Fischer
KU Leuven, Leuven, Belgium
Tias Guns
Central European University, Vienna, Austria
Petra Kralj Novak
Aristotle University of Thessaloniki, Thessaloniki, Greece
Grigorios Tsoumakas

About this paper

Cite this paper

Rafla, M., Voisine, N., Crémilleux, B., Boullé, M. (2023). A Non-parametric Bayesian Approach for Uplift Discretization and Feature Selection. In: Amini, MR., Canu, S., Fischer, A., Guns, T., Kralj Novak, P., Tsoumakas, G. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2022. Lecture Notes in Computer Science, vol 13717. Springer, Cham. https://doi.org/10.1007/978-3-031-26419-1_15

DOI: https://doi.org/10.1007/978-3-031-26419-1_15
Published: 17 March 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-26418-4
Online ISBN: 978-3-031-26419-1
eBook Packages: Computer Science, Computer Science (R0)
