Introduction

The intention-to-treat (ITT) analysis estimates the effects of initiating one treatment versus another on an outcome of interest regardless of the subsequent adherence to the assigned treatment. In contrast, the per-protocol (PP) analysis estimates the effects of initiating and continuously adhering to the initial treatment, which is discussed in the Per-protocol Analysis tutorial.

Modern causal inference

Counterfactual framework is the foundation of modern causal inference and it defines causal effects as the contrast in potential outcomes. Let us use the research question from Hernán and Robins (2016) as an example:

What is the effect of hormone therapy on the 5-year risk of breast cancer among postmenopausal women?
Note that this is a causal question rather than an associational question since we are interested in contrasting the outcomes under two hypothetical scenarios
At the individual level, the potential outcomes \(Y(1)\) and \(Y(0)\) under two treatment are the outcomes that would be observed if the subject had taken hormone therapy and had never taken the hormone therapy, respectively. The contrast in the potential outcomes, either on the absolute difference scale or relative scale, is referred to as individual causal effect (ITE)
In comparison, at the population level, the potential mean outcomes \(E[Y(1)]\) and \(E[Y(0)]\) under two treatment are the average of the outcomes that would be observed if all subjects had taken hormone therapy and had never taken the hormone therapy, respectively. The contrast in these potential outcomes is referred to as average causal effect (ATE)
For both ITE and ATE, if the potential outcomes under the two treatment are the same, then there is no causal effect; otherwise, this is a causal effect

The fundamental problem of counterfactual framework

For an individual, we can only observe the outcome under one scenario, either under hormone therapy or no hormone therapy, but not both. Although this is an obstacle at the individual level, we can resolve this barrier at the population level:

Consider an ideal randomized control trial (RCT) where patients are perfectly randomized to hormone therapy versus usual care treatment
Under this set up, it is reasonable to say that patients from either treatment group can represent all the patients in the study since there is no difference in patient characteristics between treatment groups, thanks to randomization
Given this, the mean outcome in the hormone therapy group equals to the potential outcome had everyone in the study taken hormone therapy. Similarly, the mean outcome in the usual care group equals to the potential outcome had everyone in the study never taken usual care
Thus, we can observe both potential outcomes at the population level under an RCT design and estimate causal effects. This is a main reason why RCT is the gold standard for comparative effectiveness research (CER)

From randomized control trials to observational studies

Many researchers think observational data is messy and a more challenging place to find reliable evidence, so let us summarize the advantages and disadvantages of RCTs and observational studies on a list of key considerations of a study below:

	Randomized Control Trial	Observational Study
Time Zero	Clear and straightforward: Randomization	Not straightforward
Selection Bias	Some is prevented, e.g., prevalent user bias	Exist if eligibility, treatment initiation, and start of follow up are misaligned
Immortal-time bias	No	Yes if eligibility, treatment initiation, and start of follow up are misaligned
Confounding Bias	No	Yes
Missing Data	Less	More
Measurement Error	Less	More
Statistical Power	Limited	Rich
Patient Information	Limited	Rich
Generalizability	Limited	More generalizable
Cost	Millions of dollars	Existing data
Follow-up Time	Limited: 0.5-5 years	Long-term: 10-20 years
Ethical Concerns	More	Much Less

According to the summary above, I argue that high-quality observational studies can complement findings from RCTs and provide actionable evidence when conducted robustly. Here are the reasons:

We explained in our previous tutorials that by emulating a target trial, either an existing or hypothetical trial, we are able to clearly define time zero of the study and align patient eligibility, treatment initiation, and start of follow up, leading to minimized selection bias and immortal time bias
We have established data cleaning pipelines and statistical methods to handle missing data and measurement error
Large-scale electronic health records (EHRs), as one of the main resource of observational data, provides much richer power than RCTs
RCTs are typically powered for primary estimand, e.g., average treatment effects, so study treatment effect heterogeneity using RCT data is challenging. This is where observational data can complement
EHR data captures rich patient information on demographics, disease history, lab results, comorbidities, and so on
EHR captures a much wider study population, from patients with minor health issues to patients in severe disease conditions, while RCTs often have strict eligibility criteria and exclude patients who are too sick
RCTs cost several millions of dollars to run while EHR data already exists
Primarily due to the high cost of RCTs, participants are often follow up for a limited time, while EHR data includes patient’s records for decades
In many situation, RCTs are not feasible due to ethical concerns, while EHR data captures a wide range of exposures and health events that occur naturally over a patient’s life course
Wang et al. (2023) carefully emulated 32 RCTs using observational data
- Overall observed agreement between the RCT and the database emulation results was a Pearson correlation of 0.82 (95%CI, 0.64-0.91), with 75% meeting statistical significance, 66% estimate agreement, and 75% standardized difference agreement

Conduct high-quality observational studies

To leverage observational data for estimating causal treatment effects, we need to make the following identifibility assumptions:

Stable Unit Treatment Value Assumption (SUTVA)
- Treatment variation irrelevance and no interference, i.e., \(Y_i(a_1, m_1, ..., a_n, m_n) = Y_i(a_i)\)
- One individual’s outcomes \(Y_i\) are not affected by others’ treatment assignment \(a_j\) and the mode treatment is delivered \(m_j\), with \(j \neq i\).
Consistency
- The potential outcome under the actual treatment is the same as the observed outcome, i.e., \(Y_i(a_i) = Y_i\) if \(A_i = a_i\)
Positivity
- There are individuals in the treatment and control arms for every covariate space exist in the study population, i.e., \(0<\pi<1\), where \(\pi\) is the probability of receving the treatment
Conditional Exchangeability
- Treatment and potential outcomes are independent conditional on measured confounders \(X\), under the assumption that all variables affecting both treatment assignment and outcome are included in \(X\), i.e., \(A \perp \{Y(1), Y(0)\}|X\)
- Patients are exchangeable in each stratum defined by \(X\)
- Failing to statisfy conditional exchangeability is commonly referred to as confounding by indication or endogeneity.

Causal inference methods for restoring conditional exchangeability

Various propensity score methods, including matching, weighting, stratification, regression, doubly robust methods Wager and Athey (2018), marginal structural models, are widely used to estimate ATEs and they well explained by Miguel; Robins (2020). We focus on the inverse probability (IP) of treatment weighting approach (Horvitz and Thompson 1952) here and defer other methods to future tutorials:

Propensity score (PS) is defined as the probability of assigning the treatment given covariates, i.e., \(\pi = P(A=1|X)\)
The main idea is using the PSs to weigh the observed outcomes for each treatment group
IP weights: \(\frac{1}{\pi}\) for treated units and \(\frac{1}{1-\pi}\) for control units
The output of this weighting process is a pseudo-population where treatment assignment is random given \(X=x\)
Within this pseudo-population, we estimate the ATE as \(\hat{\tau}_{\mathrm{ATE}} = \frac{1}{N}\sum_i^N \left(\frac{Y_iA_i}{\pi_i}-\frac{Y_i (1-A_i)}{1-\pi_i}\right)\)

Implementation of intention-to-treat analysis

Recall our previous example that aims to compare the effectiveness of ARB versus ACEI anti-hypertensive medications on reducing the risk of cardiovascular disease (CVD).

We emulated the target trial specified in Table 1 and prepared the dataset in the single-trial emulation session. Thus, we directly load the pre-processed data.

obsdata5 <- readRDS("obsdata5.rds")

get_label <- function(x) {
  lbl <- attr(x, "label", exact = TRUE)
  if (is.null(lbl)) "" else as.character(lbl)
}

dict <- data.frame(
  Variable = names(obsdata5),
  Meaning  = vapply(obsdata5, get_label, character(1)),
  check.names = FALSE
)

knitr::kable(dict, caption = "Data Dictionary", row.names = FALSE)

Data Dictionary
Variable	Meaning
id	Patient ID
time	Time index for longitudinal records (months)
X1	Non-ACEI or ARB antihypertensive medication use over time
X2	Standardized systolic blood pressure over time
X3	Biological sex (M=1, F=0)
X4	Standardized diastolic blood pressure at baseline
age	Age over time (years)
A	Treatment indicator over time (ARB = 1, ACEI = 0; NA = neither)
Y	Event indicator of cardiovascular disease
C	Indicator of early dropout / censoring
follow_up
assigned_treatment	Treatment indicator over time (ARB = 1, ACEI = 0; NA = neither)
X1_0	Baseline value of X1
X2_0	Baseline value of X2
X3_0	Biological sex (M=1, F=0)
X4_0	Standardized diastolic blood pressure at baseline
age_0	Baseline value of age

We now convert the long format data (multiple rows per subject) to wide format (one row per subject), retaining baseline covariates, treatment variable, follow-up time, and outcome status

# Find event/censoring time (min time of the two) for subjects who experience it 
# Find follow-up time (max time) for subjects who never experience event/censoring 
obsdata6 <- obsdata5 %>%
  group_by(id) %>%
  summarise(
    event_or_cens = any(Y == 1 | C == 1, na.rm = TRUE),
    follow_up = ifelse(event_or_cens,
                       min(follow_up[Y == 1 | C == 1], na.rm = TRUE),
                       max(follow_up, na.rm = TRUE)),
    outcome = ifelse(any(Y == 1 & C == 0, na.rm = TRUE), 1, 0), 
    .groups = "drop") %>%
  left_join(obsdata5, by = c("id", "follow_up")) %>%
  select(id, assigned_treatment, X1_0, X2_0, X3, X4, age_0, follow_up, outcome) %>% 
  arrange(id)

# Check if all subjects are included
setequal(unique(obsdata5$id), unique(obsdata6$id))
#> [1] TRUE

In a time-to-event outcome analysis, the censoring mechanism for individuals who dropped out early plays an important role in deciding the appropriate statistical analysis. We consider the two scenarios below:

Non-informative Censoring: Early dropout is completely at random
Informative Censoring: Early dropout is at random conditional on patients’ pre- and post-treatment characteristics

Below, we demonstrate the analysis procedures step-by-step for each scenario.

Scenario 1: Non-informative Censoring

Step 1: Estimate the propensity scores, i.e., the probabilities of initiating ARB, using logistic regression with baseline covariates as predictors. Compute unstabilized IP of treatment weights according to the IP weights formula above
Machine learning (ML) models are alternatives, e.g., gradient boosting machine, adaptive lasso; however, doubly robust methods should be used when PS is estimated using ML models (Chernozhukov et al. 2018)

# Fit a logistic regression
ps_model <- glm(assigned_treatment ~ X1_0 + X2_0 + X3 + X4 + age_0, 
                 data = obsdata6, 
                 family = binomial)

# Calculate propensity scores
p <- predict(ps_model, type = "response")

# Calculate inverse probability of treatment weights (IPTW)
obsdata6 <- obsdata6 %>%
  mutate(
    p = p,
    trt_weight = if_else(
      assigned_treatment == 1, 
      1/p, # 1/P(treatment) for treated subjects
      1/(1-p) # 1/(1-P(treatment)) for control subjects
  ))

Step 2: Examine overlap of treatment by visualizing the distributions of propensity score between two treatment groups

# Visualize overlap
ps_plot <- obsdata6 %>%
  mutate(treatment_group = factor(assigned_treatment, 
                                  levels = c(0, 1), 
                                  labels = c("Control", "Treated"))) %>%
  ggplot(aes(x = p, fill = treatment_group, group = treatment_group)) +
  theme_classic() +
  geom_density(data = . %>% filter(treatment_group == "Treated"),
               aes(y = after_stat(density)), alpha = 0.6) +
  geom_density(data = . %>% filter(treatment_group == "Control"), 
               aes(y = -after_stat(density)), alpha = 0.6) +
  xlab("Propensity Score") +
  ylab("Density") +
  labs(fill = "Treatment Group",
       title = "Distribution of Propensity Scores") +
  theme(plot.title = element_text(size = 10, face = "bold"))
print(ps_plot)

Step 3: IP weightng approach is unstable when there exists extreme weights, so we truncate the weights above 99th percentile to improve stability

# Save untruncated weights for later use
obsdata6 <- obsdata6 %>%
  mutate(trt_weight_untrunc = trt_weight)

trt_weight_cutoff <- quantile(obsdata6$trt_weight, 0.99)
obsdata6 <- obsdata6 %>%
  mutate(trt_weight = ifelse(trt_weight > trt_weight_cutoff, 
                             trt_weight_cutoff, trt_weight))
saveRDS(obsdata6, "obsdata6.rds")

Step 4: Evaluate covariate balance using the absolute standardized mean difference (ASMD) metric and a threshold of 0.1 is used to determine an acceptable balance

bal_s2 <- bal.tab(assigned_treatment ~ X1_0 + X2_0 + X3 + X4 + age_0, 
        data = obsdata6, estimand = "ATE", un = TRUE,  # Include unadjusted balance
        weights = obsdata6$trt_weight,
        disp.means = TRUE, disp.sds = TRUE)

love.plot(bal_s2, var.order = "unadjusted", abs = TRUE, thresholds = c(m = 0.1),
          stats = "mean.diffs", sample.names = c("Unweighted", "Weighted"),
          title = "Covariate Balance Before and After Weighting", 
          stars = "raw",
          var.names = c(X1_0 = "anti-HTN meds_0", 
                       X2_0 = "SBP_0", 
                       X3 = "Sex", 
                       X4 = "DBP",
                       age_0 = "Age_0")) +
  theme(plot.title = element_text(size = 10), plot.caption = element_text(hjust = 0, size = 8)) +
  labs(caption = "Abbreviations: anti-HTW meds_0 = antihypertensive medication use at baseline;\nSBP_0 = Standardized systolic blood pressure at baseline;\nDBP = Standardized diastolic blood pressure at baseline")

Step 5: Fit a IP weighted Cox Proportional Hazards (CPH) model

# Fit weighted CPH model
weighted_cox_model <- coxph(Surv(follow_up, outcome) ~ assigned_treatment, 
                             data = obsdata6, weights = trt_weight)
summary(weighted_cox_model)
#> Call:
#> coxph(formula = Surv(follow_up, outcome) ~ assigned_treatment, 
#>     data = obsdata6, weights = trt_weight)
#> 
#>   n= 245, number of events= 29 
#> 
#>                       coef exp(coef) se(coef) robust se      z Pr(>|z|)   
#> assigned_treatment -1.1732    0.3094   0.3194    0.4369 -2.685  0.00725 **
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#>                    exp(coef) exp(-coef) lower .95 upper .95
#> assigned_treatment    0.3094      3.232    0.1314    0.7284
#> 
#> Concordance= 0.615  (se = 0.052 )
#> Likelihood ratio test= 15.02  on 1 df,   p=1e-04
#> Wald test            = 7.21  on 1 df,   p=0.007
#> Score (logrank) test = 14.92  on 1 df,   p=1e-04,   Robust = 6.18  p=0.01
#> 
#>   (Note: the likelihood ratio and score tests assume independence of
#>      observations within a cluster, the Wald and robust score tests do not).

##### Interpretation:
# Hazard ratio (HR) = exp(coef)
# coef < 0 & HR < 1: treatment is associated with lower risk.
# coef > 0 & HR > 1: treatment is associated with higher risk.
# P-value <= alpha = 0.05 & 95% confidence interval excluding 1 suggest a statistically significant effect (not in this case).

The hazard ratio (HR) is the estimand.

Step 6 Conduct visual checks for proportional hazard (PH) assumption in the Cox model.

# Schoenfeld Residuals plot: A flat line indicates that the proportional hazards assumption holds.
plot(cox.zph(weighted_cox_model))


# Log-log plot: Parallel lines indicate proportional hazards assumption is met.
obsdata6.nonzero <- obsdata6 %>% filter(follow_up > 0)
surv_fit <- survfit(Surv(follow_up, outcome) ~ assigned_treatment, data = obsdata6.nonzero)
ggsurvplot(surv_fit, data = obsdata6, fun = "cloglog", 
           title = "Log-log Plot", xlab = "log(Time)",
           legend.title = "Treatment Group", 
           legend.labs = c("Control", "Treatment"),
           ggtheme = theme(legend.text = element_text(size = 9),      
              legend.title = element_text(size = 10)))

Scenario 2: Informative Censoring

We directly work with the long format data in this scenario since we need to explicitly model the censoring mechanism.

Step 1: Estimate propensity scores. We skip this step as it is the same as in Scenario 1
Step 2: Estimate censoring probabilities and censoring weights at each follow-up time using logistic regression

# Fit logistic regression to predict censoring based on treatment and covariates
# Skip baseline observations since baseline censoring weight is always 1
# Note C=1: censored, but we want to know the probability of not being censored (C_neg=0)
obsdata5 <- obsdata5 %>% mutate(C_neg = 1-C)
censor_model <- glm(C_neg ~ A + X1_0 + X2_0 + X3 + X4 + age_0, 
data <- obsdata5 %>% filter(follow_up > 0), family = binomial)

obsdata5 <- obsdata5 %>%
  mutate(
    censor_probs = predict(censor_model, newdata = ., type = "response"), # Calculate censoring probabilities
    
    censor_weight = ifelse(C_neg == 1, # Calculate censoring weights (unstablized)
                           1 / censor_probs,  # Not censored
                           1 / (1 - censor_probs))  # Censored
  )

Step 3: Compute cumulative censoring weights up to current time

obsdata5 <- obsdata5 %>%
  arrange(id, follow_up) %>%
  group_by(id) %>%
  mutate(cumulative_censor_weight = cumprod(censor_weight)) %>% 
  ungroup()

# Calculate summary statistics for cumulative censoring weights by treatment group
censor_weight_summary <- obsdata5 %>%
  mutate(treatment_group = factor(assigned_treatment, 
                                  levels = c(0, 1), 
                                  labels = c("Control", "Treated"))) %>%
  group_by(treatment_group) %>%
  summarise(
    min = min(cumulative_censor_weight, na.rm = TRUE),
    max = max(cumulative_censor_weight, na.rm = TRUE),
    mean = mean(cumulative_censor_weight, na.rm = TRUE),
    sd = sd(cumulative_censor_weight, na.rm = TRUE),
    .groups = 'drop')%>%
  mutate(across(c(min, max, mean, sd), ~ round(.x, 2)))

print(censor_weight_summary)
#> # A tibble: 2 × 5
#>   treatment_group   min   max  mean    sd
#>   <fct>           <dbl> <dbl> <dbl> <dbl>
#> 1 Control             1     1     1     0
#> 2 Treated             1     1     1     0

Step 4: Combine censoring weights with baseline treatment weights. The censoring weights are used to adjust for the selection bias induced by informative censoring.

# Combined weight
obsdata5 <- obsdata5 %>%
  left_join(obsdata6 %>% select(id, trt_weight_untrunc), by = "id") %>%
  mutate(combi_weight = trt_weight_untrunc * cumulative_censor_weight)

Step 5: Truncate the final, combined weights at 99th percentile.

comb_weight_cutoff <- quantile(obsdata5$combi_weight, 0.99, na.rm=T)
obsdata5 <- obsdata5 %>%
  mutate(combi_weight = ifelse(combi_weight > comb_weight_cutoff, comb_weight_cutoff, combi_weight))

Step 6: Fit a weighted pooled logistic regression to estimate the hazard of CVD at each follow-up time

# Fit weighted pooled logistic regression (start from baseline time)
# Do not need to add X1_0 + X2_0 + X3 + X4 + age_0 as predictors for the unstablized weights case
weighted_iptw_ipcw_model <- glm(Y ~ assigned_treatment + follow_up + I(follow_up^2), 
                                data = obsdata5 %>% filter(C != 1), # Exclude censored observations (C=1) because their outcomes are unknown
                                weights = combi_weight, 
                                family = binomial)

summary(weighted_iptw_ipcw_model)
#> 
#> Call:
#> glm(formula = Y ~ assigned_treatment + follow_up + I(follow_up^2), 
#>     family = binomial, data = obsdata5 %>% filter(C != 1), weights = combi_weight)
#> 
#> Coefficients:
#>                      Estimate Std. Error z value Pr(>|z|)    
#> (Intercept)        -5.102e+00  5.478e-01  -9.313   <2e-16 ***
#> assigned_treatment -1.077e+00  5.194e-01  -2.073   0.0382 *  
#> follow_up           1.721e-02  8.089e-03   2.127   0.0334 *  
#> I(follow_up^2)     -3.791e-05  2.676e-05  -1.417   0.1566    
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> (Dispersion parameter for binomial family taken to be 1)
#> 
#>     Null deviance: 189.84  on 860  degrees of freedom
#> Residual deviance: 178.86  on 857  degrees of freedom
#>   (1040 observations deleted due to missingness)
#> AIC: 178.81
#> 
#> Number of Fisher Scoring iterations: 8

Step 7: Create two counterfactual datasets such that all individuals are assigned to treatment/control

gen_cf_data <- function(data, treatment_value) {
  data_cf <- data %>% mutate(assigned_treatment = treatment_value)
  return(data_cf)
}

data_all_treated <- gen_cf_data(obsdata5 %>% filter(C != 1), 1)
data_all_untreated <- gen_cf_data(obsdata5 %>% filter(C != 1), 0)

Step 8: Calculate cumulative incidences with each treatment group at specific time points

# Calculate conditional cumulative incidences (combined weights)
calc_condit_cum_inc <- function(outcome_model, data) {
  
  pred_data <- data %>% select(id, follow_up, assigned_treatment)

  # Predict probabilities
  pred_data$prob_outcome <- predict(outcome_model, newdata = pred_data, type = "response") 
  
  # Calculate cumulative incidence for each subject
  cum_inc <- pred_data %>%
    arrange(id, follow_up) %>%
    group_by(id) %>%
    mutate(
      survival_prob = cumprod(1 - prob_outcome), # P(alive up to current time)
      survival_prob_lag = lag(survival_prob, default = 1), # P(alive up to last time)
      death_prob = survival_prob_lag * prob_outcome, # P(dead at current time)
      cumulative_incidence = cumsum(death_prob) 
    ) %>% ungroup()
  
  return(cum_inc)
}

condit_cum_inc_treated <- calc_condit_cum_inc(weighted_iptw_ipcw_model, data_all_treated)
condit_cum_inc_untreated <- calc_condit_cum_inc(weighted_iptw_ipcw_model, data_all_untreated)

Step 9: Calculate marginal cumulative incidences: Average conditional cumulative incidence at each visit across all individuals in two datasets

marginal_res <- bind_rows(
  condit_cum_inc_treated %>% 
    group_by(follow_up) %>%
    summarise(mean_cum_inc = mean(cumulative_incidence, na.rm = TRUE), .groups = 'drop') %>%
    mutate(treatment = "Treated"),
  
  condit_cum_inc_untreated %>% 
    group_by(follow_up) %>%
    summarise(mean_cum_inc = mean(cumulative_incidence, na.rm = TRUE), .groups = 'drop') %>%
    mutate(treatment = "Untreated")
)

Step 10: Compute risk differences and risk ratios between treatment groups.

# Calculate incidence differences and ratios
inc_comp <- marginal_res %>%
  select(follow_up, treatment, mean_cum_inc) %>%
  pivot_wider(names_from = treatment, values_from = mean_cum_inc) %>%
  mutate(incidence_diff = Treated - Untreated, incidence_ratio = Treated / Untreated)

Step 11: Display risk differences and risk ratios at times of interest.

# Marginal Cumulative Incidences at different Time Points
key_times <- c(182, 365) # 6 months and 1 year
#find the closest time points in the data stratified by treatment 
results <- marginal_res %>%
  tidyr::crossing(assess_time = key_times) %>%
  filter(follow_up >= assess_time) %>%
  group_by(treatment, assess_time) %>%
  slice_min(follow_up, n = 1, with_ties = FALSE) %>%
  ungroup() %>%
  select(treatment, assess_time, mean_cum_inc)%>%
  arrange(treatment, assess_time)%>%
  pivot_wider(names_from = treatment, values_from = mean_cum_inc)
results
#> # A tibble: 2 × 3
#>   assess_time Treated Untreated
#>         <dbl>   <dbl>     <dbl>
#> 1         182  0.0512     0.142
#> 2         365  0.116      0.300
 

# Risk Differences and Ratios at different Time Points
results1 <- inc_comp %>%
  crossing(assess_time = key_times) %>%
  filter(follow_up >= assess_time) %>%
  group_by(assess_time) %>%
  slice_min(follow_up, n = 1, with_ties = FALSE) %>%select(assess_time,Treated, Untreated, incidence_diff, incidence_ratio )%>%
  ungroup()

results1
#> # A tibble: 2 × 5
#>   assess_time Treated Untreated incidence_diff incidence_ratio
#>         <dbl>   <dbl>     <dbl>          <dbl>           <dbl>
#> 1         182  0.0512     0.142        -0.0906           0.361
#> 2         365  0.116      0.300        -0.184            0.385

Funding

This work was supported by Utah Clinical & Translational Science Institute (CTSI) Translational Innovation Pilot (TIP) Program Award (NCATS UM1TR004409).

References

Chernozhukov, Victor, Denis Chetverikov, Mert Demirer, Esther Duflo, Christian Hansen, Whitney Newey, and James Robins. 2018. “Double/Debiased Machine Learning for Treatment and Structural Parameters.” Oxford University Press Oxford, UK.

Hernán, Miguel A, and James M Robins. 2016. “Using Big Data to Emulate a Target Trial When a Randomized Trial Is Not Available.” American Journal of Epidemiology 183: 758–64.

Horvitz, Daniel G, and Donovan J Thompson. 1952. “A Generalization of Sampling Without Replacement from a Finite Universe.” Journal of the American Statistical Association 47: 663–85.

Laan, Mark J Van Der, and Daniel Rubin. 2006. “Targeted Maximum Likelihood Learning.” The International Journal of Biostatistics 2.

Miguel; Robins, James Hernan. 2020. Causal Inference: What If. Boca Ration: Chapman & Hall/CRC.

Wager, Stefan, and Susan Athey. 2018. “Estimation and Inference of Heterogeneous Treatment Effects Using Random Forests.” Journal of the American Statistical Association 113: 1228–42.

Wang, Shirley V, Sebastian Schneeweiss, Jessica M Franklin, Rishi J Desai, William Feldman, Elizabeth M Garry, Robert J Glynn, et al. 2023. “Emulation of Randomized Clinical Trials with Nonrandomized Database Analyses: Results of 32 Clinical Trials.” Jama 329: 1376–85.

A new-user, active comparator design; Intention-to-treat analysis

Peirong Hao, Kevin Ying, Adam Bress, Tom Greene, Yizhe Xu*

2026-01-29