Inference from Synthetic Datasets: Methods, Pitfalls, and Best Practices
Synthetic datasets are increasingly used in biostatistics and health research to enable data sharing and protect patient privacy. While these datasets often resemble the original data closely, analyzing them requires care: standard inferential procedures may no longer apply, and ignoring the variability introduced during synthesis can lead to misleading conclusions. This half-day workshop offers a practical introduction to analyzing synthetic data with appropriate methodology, such as combining rules developed specifically for synthetic data, and includes hands-on implementation in R. Examples of existing synthetic datasets from the biostatistics domain will be provided.
Date and Time
-