Telling cause from effect using MDL-based local and global regression

A Marx, J Vreeken - 2017 IEEE international conference on …, 2017 - ieeexplore.ieee.org
2017 IEEE International Conference on Data Mining (ICDM), 2017
We consider the fundamental problem of inferring the causal direction between two univariate numeric random variables X and Y from observational data. The two-variable case is especially difficult to solve because standard conditional independence tests cannot be used between the two variables. To tackle this problem, we follow an information-theoretic approach based on Kolmogorov complexity and use the Minimum Description Length (MDL) principle to provide a practical solution. In particular, we propose a compression scheme to encode local and global functional relations using MDL-based regression. We infer that X causes Y if it is shorter to describe Y as a function of X than to describe X as a function of Y. In addition, we introduce Slope, an efficient linear-time algorithm that, as we show through thorough empirical evaluation on both synthetic and real-world data, outperforms the state of the art by a wide margin.