C-XGBoost: A Tree Boosting Model for Causal Effect Estimation

Part of the book series: IFIP Advances in Information and Communication Technology (IFIPAICT, volume 713)

Included in the following conference series:

IFIP International Conference on Artificial Intelligence Applications and Innovations

Abstract

Causal effect estimation aims to estimate the Average Treatment Effect (ATE) as well as the Conditional Average Treatment Effect (CATE) of a treatment on an outcome from the available data. This knowledge is important in many safety-critical domains, where it often needs to be extracted from observational data. In this work, we propose a new causal inference model, named C-XGBoost, for the prediction of potential outcomes. The motivation of our approach is to combine the strength of tree-based models in handling tabular data with the notable property of neural network-based causal inference models to learn representations that are useful for estimating the outcome in both the treatment and non-treatment cases. The proposed model also inherits the considerable advantages of the XGBoost model, such as efficiently handling features with missing values with minimal preprocessing effort, and it is equipped with regularization techniques to avoid overfitting/bias. Furthermore, we propose a new loss function for efficiently training the proposed causal inference model. The experimental analysis, which is based on the performance profiles of Dolan and Moré as well as on post-hoc and non-parametric statistical tests, provides strong evidence for the effectiveness of the proposed approach.
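
To make the two estimands concrete: with potential outcomes Y(1) and Y(0), the Average Treatment Effect is E[Y(1) - Y(0)] and the Conditional Average Treatment Effect at covariates x is E[Y(1) - Y(0) | X = x]. The following is a minimal sketch of how both quantities follow from any model that predicts the two potential outcomes. It is not the C-XGBoost model or its loss function, neither of which is specified in this abstract. The synthetic data, the per-group least-squares fits used as a stand-in outcome model, and all function names are illustrative assumptions.

```python
import numpy as np


def estimate_effects(mu0_hat, mu1_hat):
    """Return (per-unit CATE estimates, ATE estimate) from predicted potential outcomes."""
    cate_hat = np.asarray(mu1_hat, dtype=float) - np.asarray(mu0_hat, dtype=float)
    return cate_hat, float(cate_hat.mean())


def fit_linear(X, y):
    """Least-squares fit with an intercept (stand-in for any outcome model)."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_linear(coef, X):
    """Predict outcomes from coefficients returned by fit_linear."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ coef


# Synthetic data with a known treatment effect (assumption, for illustration only).
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 2))
t = rng.integers(0, 2, size=n)           # randomized binary treatment
true_cate = 1.0 + 0.5 * X[:, 0]          # effect varies with the first covariate
y = X[:, 1] + t * true_cate + rng.normal(scale=0.1, size=n)

# Fit one outcome model per treatment arm, then predict both potential
# outcomes for every unit, whether it was treated or not.
coef0 = fit_linear(X[t == 0], y[t == 0])
coef1 = fit_linear(X[t == 1], y[t == 1])
cate_hat, ate_hat = estimate_effects(predict_linear(coef0, X), predict_linear(coef1, X))

print("estimated ATE:", round(ate_hat, 3))
print("true ATE on this sample:", round(float(true_cate.mean()), 3))
```

Because treatment is randomized here, fitting the two arms separately is unbiased; with observational data, as the abstract notes, the treated and untreated groups may differ systematically, which is the difficulty that dedicated causal inference models address.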

References

Alaa, A., Schaar, M.: Limits of estimating heterogeneous treatment effects: guidelines for practical algorithm design. In: International Conference on Machine Learning, pp. 129–138. PMLR (2018)
Atashgahi, Z., et al.: Supervised feature selection with neuron evolution in sparse neural networks. Trans. Mach. Learn. Res. 2023 (2023)
Breiman, L.: Random forests. Mach. Learn. 45, 5–32 (2001)
Chen, T., Guestrin, C.: XGBoost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 785–794 (2016)
Dolan, E.D., Moré, J.J.: Benchmarking optimization software with performance profiles. Math. Program. 91(2), 201–213 (2002)
Finner, H.: On a monotonicity problem in step-down multiple test procedures. J. Am. Stat. Assoc. 88(423), 920–923 (1993)
Grinsztajn, L., Oyallon, E., Varoquaux, G.: Why do tree-based models still outperform deep learning on typical tabular data? In: Advances in Neural Information Processing Systems, vol. 35, pp. 507–520 (2022)
Hodges, J., Lehmann, E.L.: Rank methods for combination of independent experiments in analysis of variance. In: Rojo, J. (ed.) Selected Works of E. L. Lehmann. Selected Works in Probability and Statistics, pp. 403–418. Springer, Boston (2012). https://doi.org/10.1007/978-1-4614-1412-4_35
Johansson, F., Shalit, U., Sontag, D.: Learning representations for counterfactual inference. In: International Conference on Machine Learning, pp. 3020–3029. PMLR (2016)
Johansson, F.D., Shalit, U., Kallus, N., Sontag, D.: Generalization bounds and representation learning for estimation of potential outcomes and causal effects. J. Mach. Learn. Res. 23(1), 7489–7538 (2022)
Kiriakidou, N., Diou, C.: An evaluation framework for comparing causal inference models. In: Proceedings of the 12th Hellenic Conference on Artificial Intelligence, pp. 1–9 (2022)
Kiriakidou, N., Diou, C.: An improved neural network model for treatment effect estimation. In: Maglogiannis, I., Iliadis, L., Macintyre, J., Cortez, P. (eds.) AIAI 2022. IFIP AICT, vol. 647, pp. 147–158. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-08337-2_13
Kiriakidou, N., Diou, C.: Integrating nearest neighbors with neural network models for treatment effect estimation. Int. J. Neural Syst. (2023)
Kiriakidou, N., Livieris, I.E., Pintelas, P.: Mutual information-based neighbor selection method for causal effect estimation. Neural Comput. Appl. 1–15 (2024)
Künzel, S.R., Sekhon, J.S., Bickel, P.J., Yu, B.: Metalearners for estimating heterogeneous treatment effects using machine learning. Proc. Natl. Acad. Sci. 116(10), 4156–4165 (2019)
Livieris, I.E., Karacapilidis, N., Domalis, G., Tsakalidis, D.: An advanced explainable and interpretable ML-based framework for educational data mining. In: Kubincová, Z., Caruso, F., Kim, T., Ivanova, M., Lancia, L., Pellegrino, M.A. (eds.) MIS4TEL 2023. LNNS, vol. 769, pp. 87–96. Springer, Cham (2023). https://doi.org/10.1007/978-3-031-42134-1_9
Livieris, I.E., Pintelas, P.: A new class of spectral conjugate gradient methods based on a modified secant equation for unconstrained optimization. J. Comput. Appl. Math. 239, 396–405 (2013)
Livieris, I.E., Stavroyiannis, S., Iliadis, L., Pintelas, P.: Smoothing and stationarity enforcement framework for deep learning time-series forecasting. Neural Comput. Appl. 33(20), 14021–14035 (2021)
Louizos, C., Shalit, U., Mooij, J.M., Sontag, D., Zemel, R., Welling, M.: Causal effect inference with deep latent-variable models. In: Advances in Neural Information Processing Systems, vol. 30 (2017)
MacDorman, M.F., Atkinson, J.O.: Infant mortality statistics from the 1996 period linked birth/infant death data set. Mon. Vital Stat. Rep. 46(12) (1998)
Pearl, J.: Causality. Cambridge University Press, Cambridge (2009)
Pouyanfar, S.: A survey on deep learning: algorithms, techniques, and applications. ACM Comput. Surv. (CSUR) 51(5), 1–36 (2018)
Rubin, D.B.: Causal inference using potential outcomes: design, modeling, decisions. J. Am. Stat. Assoc. 100(469), 322–331 (2005)
Shalit, U., Johansson, F.D., Sontag, D.: Estimating individual treatment effect: generalization bounds and algorithms. In: International Conference on Machine Learning, pp. 3076–3085. PMLR (2017)
Shi, C., Blei, D., Veitch, V.: Adapting neural networks for the estimation of treatment effects. In: Advances in Neural Information Processing Systems, vol. 32 (2019)
Shimoni, Y., Yanover, C., Karavani, E., Goldschmidt, Y.: Benchmarking framework for performance-evaluation of causal inference analysis. arXiv preprint arXiv:1802.05046 (2018)
Shinde, P.P., Shah, S.: A review of machine learning and deep learning applications. In: 2018 Fourth International Conference on Computing Communication Control and Automation (ICCUBEA), pp. 1–6. IEEE (2018)
Wager, S., Athey, S.: Estimation and inference of heterogeneous treatment effects using random forests. J. Am. Stat. Assoc. 113(523), 1228–1242 (2018)
Zhou, G., Yao, L., Xu, X., Wang, C., Zhu, L.: Cycle-balanced representation learning for counterfactual inference. In: Proceedings of the 2022 SIAM International Conference on Data Mining (SDM), pp. 442–450. SIAM (2022)

Acknowledgements

The work leading to these results has received funding from the European Union’s Horizon 2020 research and innovation programme under Grant Agreement No. 965231, project REBECCA (REsearch on BrEast Cancer induced chronic conditions supported by Causal Analysis of multi-source data).

Author information

Authors and Affiliations

Department of Informatics and Telematics, Harokopio University, Kallithea, Greece
Niki Kiriakidou & Christos Diou
Department of Statistics and Insurance Science, University of Piraeus, Piraeus, Greece
Ioannis E. Livieris

Corresponding author

Correspondence to Niki Kiriakidou.

Editor information

Editors and Affiliations

University of Piraeus, Piraeus, Greece
Ilias Maglogiannis
Democritus University of Thrace, Xanthi, Greece
Lazaros Iliadis
University of Abertay, Dundee, UK
John Macintyre
Ionian University, Corfu, Greece
Markos Avlonitis
Democritus University of Thrace, Xanthi, Greece
Antonios Papaleonidas

About this paper

Cite this paper

Kiriakidou, N., Livieris, I.E., Diou, C. (2024). C-XGBoost: A Tree Boosting Model for Causal Effect Estimation. In: Maglogiannis, I., Iliadis, L., Macintyre, J., Avlonitis, M., Papaleonidas, A. (eds) Artificial Intelligence Applications and Innovations. AIAI 2024. IFIP Advances in Information and Communication Technology, vol 713. Springer, Cham. https://doi.org/10.1007/978-3-031-63219-8_5

DOI: https://doi.org/10.1007/978-3-031-63219-8_5
Published: 22 June 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-63218-1
Online ISBN: 978-3-031-63219-8
eBook Packages: Computer Science, Computer Science (R0)
