Permalink: https://doi.org/1866/23892
Impact of discretization of the timeline for longitudinal causal inference methods
Article [Accepted Manuscript]
Abstract
In longitudinal settings, causal inference methods usually rely on a discretization of the patient timeline that may not reflect the underlying data generation process. This paper investigates the estimation of causal parameters under discretized data. It presents the implicit assumptions that practitioners make, but do not acknowledge, when discretizing data to assess longitudinal causal parameters. We illustrate that differences in point estimates under different discretizations are due to data coarsening, which results in both a modified definition of the parameter of interest and a loss of information about time-dependent confounders. We further investigate several tools to advise analysts in selecting a timeline discretization for use with pooled Longitudinal Targeted Maximum Likelihood Estimation for the estimation of the parameters of a marginal structural model. We use a simulation study to empirically evaluate bias at different discretizations and to assess the use of the cross-validated variance as a measure of data support for selecting a discretization under a chosen data coarsening mechanism. We then apply our approach to a study on the relative effect of alternative asthma treatments during pregnancy on pregnancy duration. The results of the simulation study illustrate how coarsening changes the target parameter of interest and how it may create bias due to a lack of appropriate control for time-dependent confounders. We also observe evidence that the cross-validated variance performs well as a measure of support in the data, in that it is minimized at finer discretizations as the sample size increases.