Distributed Kaplan-Meier Curves via the Influence Function

Large, multi-center observational data are required to study rare events and exposures. However, sharing sensitive individual-level survival data, such as event times and patient characteristics, can require a lengthy approval process. Existing work on distributed survival analysis focuses on parametric and semi-parametric models rather than on non-parametric Kaplan-Meier (KM) curves. We develop a privacy-preserving, sequential, distributed method that approximates KM curves by splines updated via the influence function, with confounder adjustment via inverse probability weighting and inference using the weighted log-rank test. Our method requires sharing only summary-level data (spline coefficients and knot locations), and in simulations its inferential performance is equivalent to that of KM analysis on pooled data. We use our method to examine the incidence of blood clots after COVID-19 infection and COVID-19 vaccination using electronic health record data from Corewell Health and Michigan Medicine.
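
For orientation, the pooled-data target that the method approximates, a KM curve with inverse probability weights, can be sketched as below. This is a minimal illustration only and not the distributed spline and influence-function procedure described in the abstract; the function name `weighted_km` and its arguments are assumptions made for this example, and the weights are assumed to have been estimated beforehand (for instance from a propensity score model).

```python
import numpy as np

def weighted_km(times, events, weights=None):
    """Weighted Kaplan-Meier estimate on pooled data (illustrative sketch).

    times   : observed follow-up times
    events  : 1 if the event was observed, 0 if censored
    weights : per-subject weights, e.g. inverse probability weights
              (assumed precomputed); defaults to 1 for every subject
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)
    if weights is None:
        weights = np.ones_like(times)
    weights = np.asarray(weights, dtype=float)

    # The curve only steps down at distinct observed event times.
    event_times = np.unique(times[events == 1])
    surv = np.empty(len(event_times))
    s = 1.0
    for k, t in enumerate(event_times):
        # Weighted number still at risk just before t.
        at_risk = weights[times >= t].sum()
        # Weighted number of events occurring exactly at t.
        d = weights[(times == t) & (events == 1)].sum()
        s *= 1.0 - d / at_risk
        surv[k] = s
    return event_times, surv

# Toy data for illustration only (not from the study).
t, s = weighted_km([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
```

With all weights equal, the sketch reduces to the ordinary KM estimator; a distributed version would have to reproduce this curve without access to the individual `times` and `events` arrays.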

Additional Authors and Speakers

Xu Shi

University of Michigan

Lili Zhao

University of Michigan

Session

Biostatistics Student Research Session #2

Date and Time

Mon, 06/03/2024, 10:20 - 10:35

Language of Oral Presentation

English

Language of Visual Aids

English