Permalink: http://hdl.handle.net/1866/23110
Data-adaptive longitudinal model selection in causal inference with collaborative targeted minimum loss-based estimation
Article [Accepted Manuscript]
Abstract(s)
Causal inference methods have been developed for longitudinal observational study designs where confounding is thought to occur over time. In particular, one may estimate and contrast the population mean counterfactual outcome under specific exposure patterns. In such contexts, confounders of the longitudinal treatment‐outcome association are generally identified using domain‐specific knowledge. However, this may leave an analyst with a large set of potential confounders that may hinder estimation. Previous approaches to data‐adaptive model selection for this type of causal parameter were limited to the single time‐point setting. We develop a longitudinal extension of a collaborative targeted minimum loss‐based estimation (C‐TMLE) algorithm that can be applied to perform variable selection in the models for the probability of treatment with the goal of improving the estimation of the population mean counterfactual outcome under a fixed exposure pattern. We investigate the properties of this method through a simulation study, comparing it to G‐Computation and inverse probability of treatment weighting. We then apply the method in a real‐data example to evaluate the safety of trimester‐specific exposure to inhaled corticosteroids during pregnancy in women with mild asthma. The data for this study were obtained from the linkage of electronic health databases in the province of Quebec, Canada. The C‐TMLE covariate selection approach allowed for a reduction of the set of potential confounders, which included baseline and longitudinal variables.
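To make the target parameter concrete, the following is a minimal sketch of inverse probability of treatment weighting (one of the comparison methods named in the abstract) for the mean counterfactual outcome under the fixed pattern "exposed at both time points". It is not the C‐TMLE algorithm and performs no variable selection. The two‐time‐point data‐generating process, its coefficients, and all function names (`simulate`, `stratified_prob`, `iptw_always_treated`) are assumptions chosen for illustration and do not come from the article.

```python
import numpy as np


def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + np.exp(-x))


def simulate(n, rng):
    """Hypothetical two-time-point study with time-varying confounding.

    L0, L1 are binary confounders; A0, A1 are binary exposures; Y is the outcome.
    L1 is affected by A0 and in turn affects A1 and Y.
    """
    L0 = rng.binomial(1, 0.5, n)
    A0 = rng.binomial(1, expit(-0.5 + L0))
    L1 = rng.binomial(1, expit(-0.5 + 0.8 * A0 + L0))
    A1 = rng.binomial(1, expit(-1.0 + L1 + 0.5 * A0))
    Y = 1.0 + A0 + A1 + L0 + L1 + rng.normal(0.0, 1.0, n)
    return L0, A0, L1, A1, Y


def stratified_prob(treat, strata):
    """Empirical P(treat = 1 | stratum), returned per observation."""
    probs = np.empty(len(treat), dtype=float)
    for s in np.unique(strata):
        mask = strata == s
        probs[mask] = treat[mask].mean()
    return probs


def iptw_always_treated(L0, A0, L1, A1, Y):
    """Stabilized (Hajek) IPTW estimate of E[Y] under exposure pattern (1, 1)."""
    g0 = stratified_prob(A0, L0)                     # P(A0 = 1 | L0)
    g1 = stratified_prob(A1, 4 * L0 + 2 * A0 + L1)   # P(A1 = 1 | L0, A0, L1)
    followed = (A0 == 1) & (A1 == 1)                 # observed pattern matches (1, 1)
    weights = followed / (g0 * g1)
    return np.sum(weights * Y) / np.sum(weights)


rng = np.random.default_rng(0)
L0, A0, L1, A1, Y = simulate(200_000, rng)

estimate = iptw_always_treated(L0, A0, L1, A1, Y)
naive = Y[(A0 == 1) & (A1 == 1)].mean()              # ignores confounding

# True value under this simulated process: 1 + 1 + 1 + E[L0] + E[L1 | A0 set to 1]
truth = 3.5 + 0.5 * (expit(0.3) + expit(1.3))

print(f"truth={truth:.3f}  iptw={estimate:.3f}  naive={naive:.3f}")
```

Because every covariate here is binary, the treatment probabilities can be estimated by simple stratum frequencies. With a large set of potential confounders that is no longer feasible, which is the setting where selecting which variables enter the treatment models becomes relevant.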