
The Table 2 Fallacy - Presenting and Interpreting Confounder and Modifier Coefficients


American Journal of Epidemiology, Vol. 177, No. 4
© The Author 2013. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
DOI: 10.1093/aje/kws412
Advance Access publication: January 30, 2013

Commentary

The Table 2 Fallacy: Presenting and Interpreting Confounder and Modifier Coefficients

Daniel Westreich* and Sander Greenland


* Correspondence to Dr. Daniel Westreich, Department of Obstetrics and Gynecology, Duke Global Health Institute, Duke University, DUMC
3967, Durham, NC 27710 (e-mail: daniel.westreich@duke.edu).

Initially submitted January 13, 2012; accepted for publication October 11, 2012.

It is common to present multiple adjusted effect estimates from a single model in a single table. For example,
a table might show odds ratios for one or more exposures and also for several confounders from a single logistic
regression. This can lead to mistaken interpretations of these estimates. We use causal diagrams to display the
sources of the problems. Presentation of exposure and confounder effect estimates from a single model may
lead to several interpretative difficulties, inviting confusion of direct-effect estimates with total-effect estimates for
covariates in the model. These effect estimates may also be confounded even though the effect estimate for the
main exposure is not confounded. Interpretation of these effect estimates is further complicated by heterogeneity
(variation, modification) of the exposure effect measure across covariate levels. We offer suggestions to limit
potential misunderstandings when multiple effect estimates are presented, including precise distinction between
total and direct effect measures from a single model, and use of multiple models tailored to yield total-effect
estimates for covariates.

causal diagrams; causal inference; confounding; direct effects; epidemiologic methods; mediation analysis;
regression modeling

Abbreviation: HIV, human immunodeficiency virus.

In scientific manuscripts in which the results of population-based research are reported, “Table 1” typically includes a description of key demographic, social, and clinical characteristics of the study groups, often categorized by the levels or arms of the primary exposure (treatment). This information is often useful when thinking about both internal and external validity (generalizability) of the study results (1).

Also common is a “Table 2” that shows multivariate-adjusted associations with the outcome for variables summarized in Table 1. For example, in a study of the impact of aspirin on stroke risk in which the main study association is adjusted for age and sex, Table 2 might contain risk, rate, or odds ratios for aspirin, age, and sex. Thus, the table contains effect estimates for secondary risk factors for the outcome in addition to the effect estimate for the primary exposure. Usually all of the estimates are derived from the same model.

In the present commentary, we explain how such a table can be misleading. In particular, we illustrate how readers can be misled by effect estimates for secondary risk factors from the model also used to control for confounding of exposure. We also offer suggestions for improving tables with effect estimates for different variables. A related discussion has been published by VanderWeele and Staudt (2).

“Primary effect” will refer to the causal effect of an exposure of primary interest, “secondary effect” will refer to the causal effect of a covariate not of primary interest in the initial adjustment model (e.g., a confounder or effect-measure modifier), “total effect” will refer to the net of all associations of a variable through all causal pathways to the outcome, and “direct effect” will refer to an association after blocking or controlling some of those pathways (3–6).

We presume the reader understands causal diagrams, as many introductions are now available (5–8). For simplicity, we use logistic models (so coefficients of single regressors

292 Am J Epidemiol. 2013;177(4):292–298



represent changes in the log odds of the outcome) and we assume the outcome is uncommon at all covariate levels to avoid complications arising from noncollapsibility of odds ratios (9, 10).
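As an illustrative aside, the rare-outcome assumption can be checked with a small calculation. The sketch below is not taken from this commentary: the binary covariate Z, its 50/50 distribution, the conditional odds ratio of 2, and the baseline risks are hypothetical values chosen only to show the behavior. Exposure is assumed independent of Z, so nothing is confounded, yet the marginal odds ratio differs from the common conditional odds ratio when the outcome is common.

```python
import math


def expit(x):
    """Inverse logit: convert log odds to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def logit(p):
    """Log odds of a probability."""
    return math.log(p / (1.0 - p))


def marginal_odds_ratio(baseline_risks, conditional_or, weights):
    """Marginal OR when every stratum of Z shares the same conditional OR.

    Assumes (hypothetically) that exposure is independent of Z, so each
    exposure group's marginal risk is the weighted average of stratum risks.
    """
    shift = math.log(conditional_or)
    risk_unexposed = sum(w * r for w, r in zip(weights, baseline_risks))
    risk_exposed = sum(
        w * expit(logit(r) + shift) for w, r in zip(weights, baseline_risks)
    )
    odds_unexposed = risk_unexposed / (1.0 - risk_unexposed)
    odds_exposed = risk_exposed / (1.0 - risk_exposed)
    return odds_exposed / odds_unexposed


weights = [0.5, 0.5]   # hypothetical distribution of Z
conditional_or = 2.0   # hypothetical odds ratio within every stratum of Z

# Baseline (unexposed) risks in the two strata of Z: common vs. rare outcome.
common = marginal_odds_ratio([0.2, 0.8], conditional_or, weights)
rare = marginal_odds_ratio([0.001, 0.004], conditional_or, weights)

print(f"common outcome: conditional OR = 2.000, marginal OR = {common:.3f}")
print(f"rare outcome:   conditional OR = 2.000, marginal OR = {rare:.3f}")
```

With the common outcome, the marginal odds ratio is about 1.57 even though the odds ratio is 2 in every stratum. With the rare outcome, the marginal and conditional values nearly coincide, which is what allows coefficients to be compared across models with different covariate sets.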

TABLE 2 FALLACIES

An example

A recent article (11) illustrates the issues we raise above. Page 1359 shows a Table 2 with the main exposure (work-related violence) and covariates including gender, age, cohabitation status, education, and income. The text refers to “cause-specific hazard ratios for use of psychotropics in relation to work-related violence and the covariates gender, age, cohabitation, education, income …” (11, p. 1358) without indicating how interpretations may differ between variables listed in Table 2. Thus, the presentation may leave the impression that, for example, the hazard ratio for gender can be interpreted in the same way as the hazard ratio for the main effect of work-related violence.

Same model, different types of effect

How can Table 2 be harmful? By presenting adjusted effect estimates for secondary risk factors alongside the adjusted effect estimate for the primary exposure, Table 2 suggests implicitly that all of these estimates can be interpreted similarly, if not identically. This is often not the case.

Consider an observational study of the effect of human immunodeficiency virus (HIV) seroconversion on subsequent age-specific 10-year risk of stroke. Figure 1 gives a possible causal diagram for the study that implies that age and smoking status are determined before HIV status. In this figure, the open paths from HIV to stroke passing through smoking and age show that the effect of HIV (the primary exposure) on stroke risk may be confounded by these covariates. This would occur if, for example, the probability of infection with HIV increases with age and smoking, in both cases perhaps due to immunosuppression. We might account for this confounding using the following logistic model:

$$\operatorname{logit}(\text{Stroke} \mid \text{HIV}, \text{Smoking}, \text{Age}) = \beta_0 + \beta_1 \times \text{HIV} + \beta_2 \times \text{Smoking} + \beta_3 \times \text{Age} \qquad \text{(Model 1)}$$

Figure 1. Causal diagram for the effect of human immunodeficiency virus (HIV) seroconversion on 10-year stroke risk, with confounding by smoking level and age.

We might report the estimated coefficients for HIV, smoking, and age (or their antilogs, which are odds ratios) in model 1 in Table 2. Many readers would assume that these 3 coefficients (or odds ratios) could all be interpreted similarly, simply, and causally; after all, they are mutually adjusted. However, even if the model is correct, these 3 coefficients represent different types of causal effects.

Given Figure 1 and model 1, β1 can be interpreted as the conditional total effect of contracting HIV on the 10-year log odds (logit) of stroke, that is, the log odds ratio for the total effect of HIV on stroke at any given level of smoking and age. However, β2 cannot be interpreted in the same way: In particular, it cannot be interpreted as a total effect of smoking. In the adjusted model, β2 is a direct effect (a direct log odds ratio) of smoking relative to HIV; that is, β2 is the portion of the smoking effect on the log odds of stroke that is not mediated through the smoking effect on HIV seroconversion. More precisely, it is the controlled direct effect of smoking, that is, the causal effect of smoking on the log odds when HIV is held fixed at a given level, thus blocking the smoking effect on HIV. To the extent we might allow talk of effects of aging, β3 is similarly the controlled direct effect of aging on the log odds when HIV and smoking are held fixed, thus blocking the age effects on smoking and HIV (3, 12).

To interpret β2 as a direct effect of smoking after blocking its effect on HIV infection, we must adjust for all confounders of both the exposure-outcome relationship and the mediator-outcome relationship (3, 4, 12); given Figure 1, these assumptions are met. Given those assumptions (and the usual assumptions of no other bias source), all 3 coefficients represent certain causal effects; nonetheless, interpreting all 3 as the same type of effect is a subtle error. In particular, β1 is a total effect; β2 is a direct effect after blocking smoking’s effect on HIV; and β3 is a direct effect after blocking the effects of age on smoking and HIV. Yet, many readers would interpret all 3 as total effects.

Same model, different degrees of confounding

Now suppose Figure 2, which adds an unmeasured covariate U that affects only smoking and stroke, is correct. Model 1 then remains valid for obtaining an unbiased estimate of the total effect of HIV on the log odds of stroke; that is, under model 1, β1 retains its interpretation as the total effect of HIV, as it did under Figure 1, because smoking and age satisfy the back-door criterion for sufficiency for confounding control (4, 5, 7, 13). This means that, after conditioning on (or blocking) smoking and age, there are no open paths between HIV and stroke besides the direct arrow from HIV to stroke, which represents the causal effect of interest in this analysis.

Under Figure 2, adjustment for smoking removes confounding of HIV by U and confounding of HIV by smoking because U is connected to HIV only through its


effect on smoking. Nonetheless, the interpretation of β2 as a direct effect of smoking after blocking HIV is now incorrect, because β2 is confounded by U. More precisely, under the causal model in Figure 2 and the regression model 1, β2 no longer validly represents an effect of smoking (although β1 remains a valid estimate of the total HIV effect on the log odds of stroke).

Figure 2. Causal diagram for the effect of human immunodeficiency virus (HIV) seroconversion on 10-year stroke risk, with confounding by smoking level, age, and U.

Perhaps less obviously, under Figure 2 and model 1, β3 no longer validly represents an effect of age. This is because smoking is now a collider on the indirect path from age to smoking to U to stroke, and adjustment for smoking (needed to estimate the HIV effect) opens the U-stroke path, thus biasing β3 as an effect measure (4, 5). In other words, we are forced to control for smoking to unbiasedly estimate the HIV effect; but under Figure 2, adjustment for smoking biases the estimated direct effect of age (if there is no adjustment for U). Estimation of the direct effect of age on stroke would require control of HIV, smoking, and U.

We have thus illustrated how a model sufficient for estimating the average effect of the primary study exposure can be insufficient to provide an unbiased estimate of secondary effects. Under Figure 2, to obtain an unbiased estimate of the direct effect of smoking after blocking its effect on HIV and the direct effect of age after blocking its effect on smoking and HIV, we would have to adjust for U, for example, using the following model:

$$\operatorname{logit}(\text{Stroke} \mid \text{HIV}, \text{Smoking}, \text{Age}, U) = \beta_0 + \beta_1 \times \text{HIV} + \beta_2 \times \text{Smoking} + \beta_3 \times \text{Age} + \beta_4 \times U \qquad \text{(Model 2)}$$

Under Figure 2, model 2, and low risk of the outcome, it follows from basic collapsibility results (9, 10) that β1 would be approximately equal in models 1 and 2, because HIV is unassociated with U given smoking and age (because of the noncollapsibility of rate ratios and odds ratios, this equality would fail using log-linear rate or logistic risk models with common outcomes (9, 10)). Nonetheless, the smoking and age coefficients (β2 and β3) may differ considerably between the models because smoking is associated with U given HIV and age, and age is associated with U given HIV and smoking. This difference translates into a bias in estimates of β2 and β3 under model 1 when considered as estimates of smoking and age effects.

INTERPRETATION GETS HARDER WITH HETEROGENEITY

Variation (heterogeneity) of effect measures across covariate levels can severely complicate separation of direct and indirect effects (3, 14). In what follows, we will need to distinguish between 2 sources of variation of the effect measure for the study exposure across levels of an adjustment (model) covariate, depending on whether one views the covariate as a secondary intervention variable or merely as a passive stratification factor (13, 15).

The first source is variation in the effect measure that is attributable in a precise causal sense to effects of a modeled covariate. When neither exposure nor the modeled covariate is confounded by uncontrolled covariates, the observed associations of the outcome with exposure and the modeled covariate are attributable entirely to the joint effects of the exposure and the covariate; in particular, interventions on the modeled covariate (e.g., smoking cessation) would in this case alter the measure of exposure effect. This source of effect-measure variation has been called “causal interaction” (13, 15), although some find that term objectionable because the variation might reflect nothing more than model choice rather than interaction defined in terms of biologic mechanisms.

The second source of variation is that attributable to effects of uncontrolled covariates, which need not be confounders of the exposure effect but may still be confounders of covariate effects. In this situation, intervention on a modeled covariate might not change the exposure effect measure at all, or at least not to the extent suggested by mere descriptive comparison of the exposure effect measure across the covariate’s levels. This sort of effect variation (variation without a specified source) has been termed “effect heterogeneity” or “effect measure modification” (5, 13, 15), although the term “effect measure modification” (or worse, “effect modification”) is problematic because it evokes the more narrow concept of causal interaction (in which changing the covariate would change the effect measure).

Interpretation of product terms with no uncontrolled confounding of any variable in the model

Consider first the simpler case in which the only possible causes of effect-measure variation are the covariates in the model, as in Figure 1. Causal diagrams are nonparametric and thus silent about the degree of variation because that is a parametric property represented by coefficients of product terms (“statistical interactions”). There is no reason to expect homogeneity (as assumed in models 1 and 2) in most applications; the best we can hope is that our use of a homogeneous model (one without product terms, such as model 1 or model 2) provides roughly unbiased estimates


of average exposure effects across covariates and so is not misleading for marginal (standardized) effects (16).

However, sometimes the heterogeneity is severe enough that it needs to be modeled, as would be the case for studying HIV if the HIV odds ratio were the targeted effect measure and varied considerably with age and smoking; that is, if β4 or β5 were important to retain in the model:

$$\operatorname{logit}(\text{Stroke} \mid \text{HIV}, \text{Smoking}, \text{Age}) = \beta_0 + \beta_1 \times \text{HIV} + \beta_2 \times \text{Smoking} + \beta_3 \times \text{Age} + \beta_4 \times \text{HIV} \times \text{Smoking} + \beta_5 \times \text{HIV} \times \text{Age} + \beta_6 \times \text{Smoking} \times \text{Age} \qquad \text{(Model 3)}$$

Under Figure 1 and model 3, the log odds ratio (the change in the log odds of stroke) associated with moving from HIV = 0 to HIV = 1 is β1 + β4 × Smoking + β5 × Age, which is a function of smoking and age. Typical presentations would tabulate estimates of this quantity (or its antilog) for various choices of smoking and age. We will consider the causal interpretation of this quantity below.

As a preliminary, for the single-variable (“main effect”) terms to be interpretable when product terms are used, zero must be a meaningful value in the study for each covariate. With product terms present, single-variable exposure terms represent effects only when all the covariates that appear in product terms with exposure are zero. Thus, by adding a product term with age, we have to center age around a reference value common in the sample, so that “Age = 0” corresponds to this value, not actual age. Here, we will center age by subtracting 40 years (so that age represents a deviation from 40 years of age). We will also assume that nonsmokers (Smoking = 0) are included in the sample. It is important that the units used represent meaningful differences; otherwise, product coefficients may be misleadingly tiny even in the presence of substantial effect variation. Hence, we will assume that age is measured in decades and smoking in packs/day (rather than the misleadingly small units of years and cigarettes/day).

Under Figure 1 and model 3, the smoking-and-age–specific total effect of HIV on the log odds of stroke is β1 + β4 × Smoking + β5 × Age, which varies with smoking and age. Such unconfounded variation in an exposure effect measure represents the deviation from additivity of the joint causal effects of HIV, smoking, and aging on the log odds of stroke. Model 3 implies that β1 is the HIV-stroke log odds ratio when the covariates are zero, whereas the HIV-smoking and HIV-aging product coefficients β4 and β5 represent variation in this log odds ratio across smoking and age. Figure 1 further implies that β1 is the total effect of HIV on the log odds of stroke among 40-year-old nonsmokers (Age = 0); β2 is the direct effect of smoking a pack of cigarettes per day on the log odds among persons 40 years of age when HIV is prevented (HIV is set to 0); and β3 is the direct effect of aging a decade on the log odds among people in whom both HIV and smoking are prevented (HIV and Smoking both set to 0).

Figure 1 also implies that β4 is the change in the HIV effect on the log odds produced by smoking one additional pack per day; specifically, smoking an additional pack per day adds β4 to the total HIV log odds ratio at any age.

What may be less anticipated, however, is that β4 is also the change in the direct smoking effect on the log odds produced by HIV; specifically, HIV adds β4 to the direct effect of smoking a pack per day on the log odds, at any given age. Thus β4 represents modification of the total HIV and the direct smoking log odds ratios on stroke, both conditional on age.

Next, β5 is the change in the total HIV log odds ratio from aging an additional decade, when smoking is set (held) to a fixed level; specifically, each decade of aging adds β5 to the HIV-stroke log odds ratio when smoking is left unchanged. We thus might say that β5 is the direct modification of the HIV-stroke log odds ratio produced by a decade of aging when smoking level is held constant. However, β5 is also the change in the log odds ratio for the direct effect of a decade of aging produced by HIV, when smoking level is held constant. Similarly, β6 is the change in the log odds ratio for the direct effect of smoking a pack per day produced by an additional decade of aging when HIV status is unchanged and is also the change in the log odds ratio for the direct effect of aging from an additional pack per day when HIV status is set to a fixed level.

Model 3 forces all of the pairwise modifying effects just described (e.g., modification of HIV-stroke log odds ratio by age) to remain the same across the variable held constant (e.g., smoking), so that smoking-and-age–specific log odds ratios for the HIV effect equal β1 + β4 × Smoking + β5 × Age. Modeling heterogeneity of modification (nonadditivity beyond 2-way products) would require at least a triple product in the model (such as HIV × smoking × age); the interpretation of the coefficient of this triple product would vary depending on which effect was targeted.

Interpretation changes when there are uncontrolled confounders of a covariate

Interpretation of effect-measure variation across modeled covariates changes if some of those covariates are themselves confounded by uncontrolled covariates, even if the primary study exposure is not confounded (15). In this case, some or all of the variation in the exposure effect measure may be due to those uncontrolled covariates. For example, U confounds smoking in Figure 2 and confounds the effects of age given smoking; thus, U could be responsible for some of the variation in an HIV effect measure across combined smoking and age levels. In the extreme case in which the HIV effect measure was constant across smoking and age given U, U would be the only remaining cause of variation in the HIV effect measure across smoking and age levels; thus, there would be no modification of the HIV effect measure by smoking or age, so that neither alteration of smoking habits nor aging would change the HIV effect measure. The product terms in model 3 are commonly called “interaction terms” in statistics, but in this example these terms would not represent any causal interaction among the variables in the products.

In less extreme examples conforming to Figure 2, some variation of the HIV effect measure across smoking and


age would remain upon control of U, so that alteration of smoking habits would change the HIV effect measure. Nonetheless, if U were not controlled, the observed variation of the HIV effect measure across smoking and age would not equal the change in the HIV odds ratio that would be produced by a change in smoking or by aging. In other words, U could confound the apparent modification of the HIV effect measure by smoking and age even if it did not account for it entirely. We note that in Figure 2, the confounding of age modification of the HIV effect measure by U would be entirely due to conditioning on smoking (which is a collider for U and age (17)); again, however, conditioning on smoking was necessary to remove confounding by smoking and by U of the HIV effect measure.

To illustrate how this affects Table 2 interpretation, suppose Figure 2 is the correct causal model and that model 3 remains the correct regression model for the effect of HIV on stroke. The model terms involving smoking and age would suffice to control for confounding of the HIV-stroke relationship by smoking, age, and U because all backdoor paths from HIV to stroke that include U are blocked by conditioning on smoking and age. Thus, the total effect of HIV on the cohort would still be given by smoking-and-age standardization, and smoking-and-age–specific effects of HIV on the log odds of stroke would still equal β1 + β4 × Smoking + β5 × Age.

Nonetheless, we would expect confounding of smoking by U to bias most interpretations given above in terms of the effects of smoking or aging on stroke or HIV effect measures. Specifically, we could no longer interpret β2 and β3 as direct effects of smoking and aging on the log odds of stroke, β4 and β5 as alterations of the HIV log odds ratio produced by smoking or aging, or β6 as alteration of smoking log odds ratios by aging or vice versa. Thus, under Figure 2, β4, β5, and β6 no longer represent deviation from additivity of the joint causal effects on the log odds (causal interactions). For example, because of confounding by U, we could no longer interpret β4 as the change in the HIV-stroke log odds ratio produced by each additional pack per day of smoking. In particular, β1 and β1 + β4 would remain the total HIV effects on the log odds among the nonsmokers and pack-per-day smokers, yet β4 would be biased by U effects as a measure of modification of the HIV-stroke log odds ratio by smoking. Put another way, β4 no longer equals the departure from additivity of the joint causal effects of HIV and smoking on the stroke log odds (their causal interaction on the logit scale), even though β1 and β1 + β4 still equal the effects of HIV on the log odds of stroke among nonsmokers and pack-per-day smokers given age.

Phrased more generally, under Figure 2, U could confound the apparent modification of the HIV effect measure by smoking and age, even though U would not confound the smoking- and age-specific total effects of HIV. The variation in an HIV effect measure across smoking and age would be real but could be produced entirely or partly by the unobserved differences in U across levels of smoking or age, rather than by smoking and age alone. Because the graph is nonparametric (and in particular not dependent on the chosen effect measure), this conclusion would apply regardless of the measure or model form used to quantify effects.

Table 2 may discourage realistic modeling

The desire for simple estimates to present in Table 2 may discourage realistically flexible variable specifications. For example, spline coding of variables can improve model fit over linear or categorical codings and produce more credible smooth models of complex dose-response functions (5, 18–20). However, splines require one to recode a variable into several functions, the coefficients of which lack simple interpretation. Similarly, the use of “black-box” machine learning techniques (21, 22) or many product terms among multiple variables can improve model fit and validity but does not produce easily interpretable coefficients. Outputs of flexible regressions can be presented in terms of interpretable effect measures (e.g., risk differences or ratios) comparing specific values of a variable (e.g., 1 pack per day smokers vs. nonsmokers), but the extra labor and commentary required may lead some investigators to opt for models that are less flexible and less valid than what can be easily fit with modern software.

AVOIDING TABLE 2 FALLACIES

A reasonable starting point for causal modeling is construction of plausible causal diagrams that display the analyst’s best understanding of the literature. Those diagrams encode the causal assumptions used to select covariates for inclusion in the model (3–7, 23, 24). Table 2 problems can be avoided by limiting the table to estimates of the primary exposure effect measures under the different models, with the secondary “adjustment” covariates reported in a footnote along with how they were categorized or modeled, as is common practice in space-limited presentations. This practice leaves room for multiple estimates for the same exposures using different models (e.g., showing results from adjustment under different causal assumptions or diagrams or for difference as well as ratio measures of effect). If some of the primary exposures are intermediate with respect to other primary exposures, these intermediate primaries will have to be left out of the model used to estimate the total effects of the other primary exposures; thus, the estimates in this table may be derived from models with different covariate subsets.

Using different covariate subsets would allow Table 2 to include estimates of total effects of secondary covariates (which might be useful to other researchers). In our example, Table 2 could provide estimates of odds ratios for total effects of each variable in Figure 1, using a model with all 3 variables to estimate the total HIV effect on stroke (adjusted for smoking and age) but using a model without HIV to estimate the total smoking effect on stroke and a model with age alone to estimate the total age effect on stroke. This approach invites objections, however; among them is that the direct effect of smoking is what is needed for sensitivity analysis for other HIV-stroke analyses that did not have smoking measurements (because


confounding by smoking is transmitted only through its effects outside of any effect on HIV).

If indeed the direct effects are of interest, then Table 2 could include effect estimates from the model with all 3 variables. In that case, it seems advisable that the text description note that the smoking and age estimates are for direct effects. Under Figure 2, these interpretations require that the model include variables like U (if, unlike U, they are measured) that confound the smoking and age direct effects even if they do not confound the HIV effects (5, 6, 9).

DISCUSSION

The problem with a table presenting multiple estimated effect measures from the same model (“Table 2”) is that it encourages the reader to interpret all these estimates in the same way, typically as total-effect estimates. As illustrated above, the interpretation of a confounder effect estimate may be different than for the exposure effect estimate. Of course, it is possible that some secondary reported associations are unbiased for total effects and that others are unbiased for direct effects; nonetheless, the assumption that all estimates reported in Table 2 are for total effects is not warranted. Thus, we recommend that a presentation of secondary effect estimates would best specify the type of effect being estimated.

As in all causal modeling, the interpretations described above should raise questions about the ethics and feasibility of the interventions implicit in the effect definitions. In the example, definition of the direct effect of age requires holding a person’s smoking level and HIV status constant as they age, which would be unethical for smokers (among whom reduction should be encouraged) and infeasible even if desirable for those HIV negative (because some will maintain unsafe practices). More generally, definitions of direct and indirect effects involve combined interventions on both the exposure and mediators; some combinations may resemble nothing anyone would consider in reality (25–27), thus violating positivity constraints (28).

In sum, presenting estimates of effect measures for secondary risk factors (confounders and modifiers of the exposure effect measure) obtained from the same model as that used to estimate the primary exposure effects can lead readers astray in a number of ways. Extra thought and description will be needed when interpreting such secondary estimates.

ACKNOWLEDGMENTS

Author affiliations: Department of Obstetrics and Gynecology, Duke University, Durham, North Carolina (Daniel Westreich); Duke Global Health Institute, Duke University, Durham, North Carolina (Daniel Westreich); Department of Epidemiology, University of California Los Angeles, Los Angeles, California (Sander Greenland); and Department of Statistics, University of California Los Angeles, Los Angeles, California (Sander Greenland).

This work was supported by the Eunice Kennedy Shriver National Institute of Child Health & Human Development at the National Institutes of Health (grant R00 HD063961 to D.W.) and the National Institute of Allergy and Infectious Disease at the National Institutes of Health (grant 2P30 AI064518 to D.W.).

We wish to thank Stephen R. Cole and Tyler VanderWeele for helpful comments in the preparation of this manuscript and the reviewers and editor for their extensive helpful suggestions.

Conflict of interest: none declared.

REFERENCES

1. Cole SR, Stuart EA. Generalizing evidence from randomized clinical trials to target populations: the ACTG 320 trial. Am J Epidemiol. 2010;172(1):107–115.
2. VanderWeele TJ, Staudt NC. Causal diagrams for empirical legal research: methodology for identifying causation, avoiding bias, and interpreting results. Law Probab Risk. 2011;10(4):329–354.
3. Robins JM, Greenland S. Identifiability and exchangeability for direct and indirect effects. Epidemiology. 1992;3(2):143–155.
4. Cole SR, Hernán MA. Fallibility in estimating direct effects. Int J Epidemiol. 2002;31(1):163–165.
5. Glymour MM, Greenland S. Causal diagrams. In: Rothman KJ, Greenland S, Lash TL, eds. Modern Epidemiology. 3rd ed. Philadelphia, PA: Lippincott Williams & Wilkins; 2008;183–209.
6. Pearl J. Causality. 2nd ed. New York, NY: Cambridge University Press; 2009.
7. Pearl J. Causal diagrams for empirical research. Biometrika. 1995;82(4):669–688.
8. Greenland S, Pearl J, Robins JM. Causal diagrams for epidemiologic research. Epidemiology. 1999;10(1):37–48.
9. Greenland S, Robins JM, Pearl J. Confounding and collapsibility in causal inference. Stat Sci. 1999;14(1):29–46.
10. Greenland S, Pearl J. Adjustments and their consequences—collapsibility analysis using graphical models. Int Stat Rev. 2011;79(3):401–426.
11. Madsen IE, Burr H, Diderichsen F, et al. Work-related violence and incident use of psychotropics. Am J Epidemiol. 2011;174(12):1354–1362.
12. VanderWeele TJ. Marginal structural models for the estimation of direct and indirect effects. Epidemiology. 2009;20(1):18–26.
13. VanderWeele TJ, Knol MJ. Interpretation of subgroup analyses in randomized trials: heterogeneity versus secondary interventions. Ann Intern Med. 2011;154(10):680–683.
14. VanderWeele TJ, Vansteelandt S. Odds ratios for mediation analysis for a dichotomous outcome. Am J Epidemiol. 2010;172(12):1339–1348.
15. VanderWeele TJ. On the distinction between interaction and effect modification. Epidemiology. 2009;20(6):863–871.
16. Greenland S, Maldonado G. The interpretation of multiplicative-model parameters as standardized parameters. Stat Med. 1994;13(10):989–999.


17. Greenland S. Quantifying biases in causal models: classical confounding vs collider-stratification bias. Epidemiology. 2003;14(3):300–306.
18. Greenland S. Dose-response and trend analysis in epidemiology: alternatives to categorical analysis. Epidemiology. 1995;6(4):356–365.
19. Orsini N, Greenland S. A procedure to tabulate and plot results after flexible modeling of a quantitative covariate. Stata J. 2011;11(1):1–29.
20. Howe CJ, Cole SR, Westreich DJ, et al. Splines for trend analysis and continuous confounder control. Epidemiology. 2011;22(6):874–875.
21. Breiman L. Statistical modeling: the two cultures. Stat Sci. 2001;16(3):199–231.
22. Hastie T, Tibshirani R, Friedman J. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. 2nd ed. New York, NY: Springer; 2009.
23. Hernán MA, Hernández-Diaz S, Werler MM, et al. Causal knowledge as a prerequisite for confounding evaluation: an application to birth defects epidemiology. Am J Epidemiol. 2002;155(2):176–184.
24. Robins JM. Data, design, and background knowledge in etiologic inference. Epidemiology. 2000;11(3):313–320.
25. Robins JM, Richardson TS. Alternative graphical causal models and the identification of direct effects. In: Shrout P, ed. Causality and Psychopathology: Finding the Determinants of Disorders and Their Cures. New York, NY: Oxford University Press; 2010.
26. Kaufman JS. Commentary: gilding the black box. Int J Epidemiol. 2009;38(3):845–847.
27. Kaufman JS. Invited commentary: decomposing with a lot of supposing. Am J Epidemiol. 2010;172(12):1349–1351.
28. Westreich D, Cole SR. Invited commentary: positivity in practice. Am J Epidemiol. 2010;171(6):674–677.

