research-article

Regression by dependence minimization and its application to causal inference in additive noise models

Authors:

Dominik Janzing,

Bernhard Schölkopf

ICML '09: Proceedings of the 26th Annual International Conference on Machine Learning

Pages 745 - 752

https://doi.org/10.1145/1553374.1553470

Published: 14 June 2009

Abstract

Motivated by causal inference problems, we propose a novel method for regression that minimizes the statistical dependence between regressors and residuals. The key advantage of this approach to regression is that it does not assume a particular distribution of the noise, i.e., it is non-parametric with respect to the noise distribution. We argue that the proposed regression method is well suited to the task of causal inference in additive noise models. A practical disadvantage is that the resulting optimization problem is generally non-convex and can be difficult to solve. Nevertheless, we report good results on one of the tasks of the NIPS 2008 Causality Challenge, where the goal is to distinguish causes from effects in pairs of statistically dependent variables. In addition, we propose an algorithm for efficiently inferring causal models from observational data for more than two variables. The required number of regressions and independence tests is quadratic in the number of variables, which is a significant improvement over the simple method that tests all possible DAGs.
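The core idea of the abstract, choosing the regression function so that the residuals are as independent of the regressor as possible, can be illustrated with a minimal sketch. Everything specific below is an assumption made for illustration and is not taken from the paper: the dependence measure is the biased empirical HSIC estimate with a fixed-bandwidth Gaussian kernel, the model is a single slope parameter, the data are synthetic, and a plain grid search stands in for a proper optimizer (which also sidesteps the non-convexity the abstract mentions).

```python
import numpy as np


def gaussian_gram(v, bandwidth):
    """Gram matrix of a Gaussian kernel on a 1-D sample."""
    diff = v[:, None] - v[None, :]
    return np.exp(-diff**2 / (2.0 * bandwidth**2))


def hsic(a, b, bandwidth=1.0):
    """Biased empirical HSIC estimate: trace(K H L H) / n^2."""
    n = len(a)
    K = gaussian_gram(a, bandwidth)
    L = gaussian_gram(b, bandwidth)
    H = np.eye(n) - np.full((n, n), 1.0 / n)  # centering matrix
    return float(np.trace(K @ H @ L @ H)) / n**2


def fit_slope_by_hsic(x, y, candidates):
    """Pick the slope whose residuals y - s*x look least dependent on x.

    No intercept is fitted: the estimate is unchanged when a constant
    is added to the residuals, so an offset would not affect the score.
    """
    scores = np.array([hsic(x, y - s * x) for s in candidates])
    return float(candidates[int(np.argmin(scores))]), scores


# Synthetic data (illustrative assumption): linear effect, uniform noise.
rng = np.random.default_rng(0)
n = 200
x = rng.uniform(-1.0, 1.0, n)
noise = rng.uniform(-0.5, 0.5, n)
y = 2.0 * x + noise

candidates = np.linspace(-1.0, 4.0, 101)  # hypothetical search grid
slope_hat, scores = fit_slope_by_hsic(x, y, candidates)
print(f"estimated slope: {slope_hat:.2f}")
print(f"HSIC at estimate: {scores.min():.6f}")
```

Note that the objective never refers to a noise density: only the dependence between `x` and the residuals is scored, which is the sense in which such a fit is non-parametric with respect to the noise distribution.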



Information & Contributors

Information

Published In

ICML '09: Proceedings of the 26th Annual International Conference on Machine Learning

June 2009

1331 pages

ISBN:9781605585161

DOI:10.1145/1553374

General Chair: Andrea Danyluk (Williams College)

Program Chairs: Léon Bottou (NEC Laboratories America), Michael Littman (Rutgers University)

Copyright © 2009 by the author(s)/owner(s).

Sponsors

NSF
Microsoft Research
MITACS

Publisher

Association for Computing Machinery

New York, NY, United States


Conference

ICML '09

Sponsor:

Microsoft Research

ICML '09: The 26th Annual International Conference on Machine Learning held in conjunction with the 2007 International Conference on Inductive Logic Programming

June 14 - 18, 2009

Montreal, Quebec, Canada

Acceptance Rates

Overall Acceptance Rate 140 of 548 submissions, 26%
