Balancing Inferential Integrity and Disclosure Risk: A Mixture Modeling Approach
In the context of survey sampling, Rubin (1993) proposed releasing multiply imputed synthetic datasets in which the sensitive target values are replaced by values drawn from posterior predictive distributions under proper imputation models. However, information loss due to incorrect model specification can weaken or invalidate inferences obtained from the synthetic data. We discuss a new masking framework through data augmentation as a potential remedy. The new framework guarantees valid inferences from the synthetic datasets and allows data users to obtain their desired data utility while satisfying disclosure requirements. It can be extended through mixture modelling and combined with existing methods to accommodate different levels of disclosure protection. We demonstrate through simulations and an illustrative example that the new framework outperforms the classical multiple imputation approach in preserving data utility while providing good disclosure protection.
Additional Authors and Speakers
Adrian E. Raftery
University of Washington
Russell J. Steele
McGill University
Naisyin Wang
University of Michigan
Language of Oral Presentation
English
Language of Visual Aids
English

Speaker

Bei Jiang
University of Alberta