
In this workshop we will learn about the science of collecting, analyzing and sharing confidential data without disclosing personal information. We will first provide an overview of the different goals and approaches in this vast field of study, making connections with work from the computer science community, which is often published under different terminology. We will then consider a specific approach known as differential privacy, which is the focus of much research and is used in practice by certain statistical agencies and private companies. We will explain the origin of this formal privacy measure, look in detail at its mathematical definition and its meaning, and show how to implement it for simple tasks. The rest of the workshop will focus on the use of synthetic datasets for privacy purposes, looking at how to generate such datasets and how to evaluate their quality in terms of risk and utility. All the content will be illustrated with R code, and some time will be reserved for participants to experiment with the methods on real datasets.

Outline:
1. Statistical data privacy
2. Differential privacy
3. Creating synthetic datasets
4. Evaluating and using synthetic datasets

 

Room: 3017
Presenter(s): Anne-Sophie Charest, Université Laval
Date and Time: -