-
-
Notifications
You must be signed in to change notification settings - Fork 1.7k
[ENH] DMLForecaster
for causal forecasting with confounder adjustment
#8797
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to allow X
to be None
. If that happens, simply use only the endogenous forecaster, and ignore the rest of the algorithm. I think that is the natural, "degenerate" special case for this algorithm.
Also please try to ensure that check_estimator
passes.
sktime/forecasting/compose/_dmlf.py
Outdated
) | ||
|
||
X_ex = X[self.exposure_vars].copy() | ||
X_conf = X.drop(columns=self.exposure_vars).copy() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it should be explicitly mentioned in the documentation that non-exposure variables will be considered as confounders. Treating non-confounding variables as confounders can lead to biased or high-variance predictions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please see the current code if it address this correctly
sktime/forecasting/compose/_dmlf.py
Outdated
# Step 4: Fit final versions for prediction | ||
self.forecaster_y_final_ = clone(self.forecaster_y) | ||
self.forecaster_y_final_.fit(y, X=X_conf, fh=fh) | ||
|
||
self.forecaster_ex_final_ = clone(self.forecaster_ex) | ||
self.forecaster_ex_final_.fit(X_ex, X=X_conf, fh=fh) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How are these different from the models fitted earlier? Do we need to do this fitting twice?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How are these different from the models fitted earlier? Do we need to do this fitting twice?
From my understanding (please correct if wrong), fiirst fitting is used to compute residuals for the causal effect estimation and second fitting is used for out-of-sample prediction components. This separation is needed because residual models need to predict on training indices (in sample) and final models need to predict on future indices (oos). Using the same fitted model for both can lead to overfitting issues ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed, in-short, I think its because of the forecasting horizon, first fit uses the insample as fh
and second fit uses the user-provided fh
.
But I would argue that for some forecasters (depending on the internal implementation) this could lead to 2 models with different weights
sktime/forecasting/compose/_dmlf.py
Outdated
# Combine intervals (assumes independence) | ||
return pred_int_conf + pred_int_causal |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure if this would be as straightforward as adding these two predictions. Depending on the estimator used, both of the quantities would have some posterior distribution, and the final value would depend on the distributions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you see the current implementation if that does it the right way? I tried to use some logic from ResidualBoostingForecaster, but it needs a review
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added review:
- first fit of the residual forecasters should be in-sample, so
fh
should be the same as they
passed - the probabilistic foreceasting methods should use shifting of the probabilistic residual forecast by the point forecast of the original forecast, not adding the proba forecast. The logic for this is the same as in the
ResidualForecaster
, from which the utility could be reused (maybe move to a common location)
Please also address my review on X=None
above.
- since they are covered by `check_estimator` - exceptions are removed in code
DMLForecaster
for causal forecasting with confounder adjustmentDMLForecaster
for causal forecasting with confounder adjustment
sktime/forecasting/causal/_dmlf.py
Outdated
>>> fh = [1, 2, 3] | ||
>>> dml_forecaster.fit(y_train, X=X_train, fh=fh) | ||
DoubleMLForecaster(exposure_vars=['GNP'], outcome_forecaster=NaiveForecaster(), | ||
residual_forecaster=RecursiveTabularRegressionForecaster(estimator=LinearRegression()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this line not failing code formatting?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't really know, should they?
are you expecting the "line >80 chars" check failure? I don't think it fails for code blocks starting with ">>>"
sktime/forecasting/compose/_dmlf.py
Outdated
3. Fit the *residual forecaster* on these residuals to estimate the causal | ||
effect of the exposure on the outcome. | ||
|
||
The residual forecaster is typically a simple linear model, ensuring |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be dis 225B played to describe this comment to others. Learn more.
"typically" is not precise, remove or rework
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed in 8d987a4
outcome and treatment. If None, defaults to a recursive reduction | ||
forecaster wrapping a linear regression model. | ||
|
||
exposure_vars : list of str, optional (default=None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here, explain the default
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed in 8d987a4
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! Also extremely neat that it works with hierarchical forecasters!
A few small requests:
- can we move the new file to a folder
forecasting.causal
? - can we add the exact algorithm from the issue to the docstring preamble?
- question: can we think of a way to shorten the forecaster names?
- there is some code duplication with
ResidualBoostingForecaster
should we try to deduplicate by moving common code to a common utility or mixin?
Yeah, I agree the names are a bit long, but I'd still lean toward keeping them for clarity and consistency with DML terminology.
We could also drop |
Hm, I think "forecaster" is the redundant term, not "outcome" etc. How about: |
This PR is ready for review, failing tests are unrelated. Please take a look at your convenience. |
**Fit procedure** | ||
1. Split exogenous data ``X`` into exposure variables ``X_exposure`` and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please make this more precise? Which forecaster, which fh
; etc. Could you simply transfer my specification to here?
class DoubleMLForecaster(BaseForecaster): | ||
"""Double Machine Learning forecaster for causal time-series forecasting. | ||
Implements the Double Machine Learning (DML) framework [1]_ for time-series, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is not exactly the algorithm in [1], but an adaptation for time series (by myself) afaik
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me! Only minor documentation requests.
Reference Issues/PRs
Fixes #8785
What does this implement/fix? Explain your changes.
Does your contribution introduce a new dependency? If yes, which one?
What should a reviewer concentrate their feedback on?
Did you add any tests for the change?
Any other comments?
FYI @marrov @ankurankan