# Track: Differential privacy

What: Talk
When: 9:50 AM, Monday 13 Jul 2020 EDT (2 hours 5 minutes)
Breaks: 11:30 AM to 12:30 PM (1 hour)
Where: Virtual session

Pre-recorded presentation

Summary:

This work is a systematic taxonomy of differential privacy variants and extensions. We categorize them along 7 "dimensions", depending on which aspect of the original definition is modified: variants within the same dimension cannot be combined, but variants from different dimensions can be combined to form new definitions. The talk will give an overview of these dimensions with a few examples, but it will not dive deeply into the technical details of each. Rather, we will explain how and why this project came to be, and discuss the kinds of challenges we encountered while working on this systematization-of-knowledge paper. The presentation should be accessible even to people with only a superficial knowledge of differential privacy.

Pre-recorded presentation

Summary:

The meaning of differential privacy (DP) is tightly bound to the notion of distance on databases, typically defined as the number of changed rows. Given the semantics of the data, this metric may not be the most suitable one, particularly when a distance comes out larger than the data owner desired (which would undermine privacy). In this paper, we give a mechanism for specifying continuous metrics that depend on the locations and magnitudes of changes in a much more nuanced manner. Our metrics turn the set of databases into a Banach space.

In order to construct DP information release mechanisms based on our metrics, we introduce derivative sensitivity, an analogue of local sensitivity for continuous functions. We use this notion in an analysis that determines the amount of noise to add to the result of a database query in order to obtain a certain level of differential privacy, and we demonstrate that derivative sensitivity allows us to employ powerful tools from calculus to perform the analysis for a variety of queries. We have implemented the analyzer and evaluated its efficiency and precision.
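As a rough sketch of the core idea (function names are hypothetical; this is not the paper's analyzer): once a bound on a query's derivative with respect to the continuous database metric is known, it plays the role of sensitivity when calibrating Laplace noise.

```python
import math
import random

def derivative_sensitivity_scale(derivative_bound, epsilon):
    """Laplace scale for a query whose derivative, taken with respect
    to the continuous database metric, is bounded by derivative_bound."""
    return derivative_bound / epsilon

def laplace_sample(scale, rng=random):
    # Sample Laplace(0, scale) by inverting the CDF.
    u = rng.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def release(true_value, derivative_bound, epsilon, rng=random):
    """Noisy release of a real-valued query result, with noise
    calibrated to the derivative bound instead of global sensitivity."""
    scale = derivative_sensitivity_scale(derivative_bound, epsilon)
    return true_value + laplace_sample(scale, rng)
```

In this simplified picture a tighter derivative bound (a smoother query) directly yields less noise for the same epsilon, which is the payoff of the calculus-based analysis described above.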

Parameswaran Kamalaruban (EPFL), Victor Perrier (ISAE-SUPAERO & Data61, CSIRO), Hassan Jameel Asghar (Macquarie University & Data61, CSIRO), and Mohamed Ali Kaafar (Macquarie University & Data61, CSIRO)

Pre-recorded presentation

Summary:

Differential privacy provides strong privacy guarantees while simultaneously enabling useful insights from sensitive datasets. However, it provides the same level of protection for all elements (individuals and attributes) in the data. There are practical scenarios where some data attributes need more or less protection than others. In this paper, we consider $d_{\mathcal{X}}$-privacy, an instantiation of the privacy notion introduced by Chatzikokolakis et al. (2013), which allows this flexibility by specifying a separate privacy budget for each pair of elements in the data domain. We describe a systematic procedure to adapt any existing differentially private mechanism that takes a query set and a sensitivity vector as input into its $d_{\mathcal{X}}$-private variant, focusing specifically on linear queries. Our proposed meta-procedure has broad applications, as linear queries form the basis of a range of data analysis and machine learning algorithms, and the ability to define a more flexible privacy budget across the data domain leads to an improved privacy/utility trade-off in these applications. We propose several $d_{\mathcal{X}}$-private mechanisms and provide theoretical guarantees on the trade-off between utility and privacy. We also experimentally demonstrate the effectiveness of our procedure by evaluating our proposed $d_{\mathcal{X}}$-private Laplace mechanism on both synthetic and real datasets using a set of randomly generated linear queries.
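A minimal sketch, under illustrative assumptions, of a $d_{\mathcal{X}}$-style Laplace release for a single linear query: each data element carries its own budget, and the element demanding the strongest protection relative to its query weight dictates the noise scale. Names are hypothetical and this is a simplification, not the paper's exact construction.

```python
import math
import random

def dx_scale(weights, budgets):
    """Laplace scale for the linear query sum_i w_i * x_i under
    per-element budgets eps_i: the worst-case ratio |w_i| / eps_i
    (the most protected element) determines the noise level."""
    return max(abs(w) / eps for w, eps in zip(weights, budgets))

def dx_laplace_linear(data, weights, budgets, rng=None):
    """Noisy answer to a linear query with per-element privacy budgets."""
    rng = rng or random.Random()
    answer = sum(w * x for w, x in zip(weights, data))
    scale = dx_scale(weights, budgets)
    u = rng.random() - 0.5
    noise = -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))
    return answer + noise
```

The point of the flexibility is visible in `dx_scale`: raising the budget of elements that need less protection can shrink the dominating ratio and hence the noise, improving utility for the same query.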

Pre-recorded presentation

Summary:

We explore the power of the hybrid model of differential privacy (DP), in which some users desire the guarantees of the local model of DP and others are content with receiving the trusted-curator model guarantees. In particular, we study the utility of hybrid model estimators that compute the mean of arbitrary real-valued distributions with bounded support. When the curator knows the distribution's variance, we design a hybrid estimator that, for realistic datasets and parameter settings, achieves a constant factor improvement over natural baselines. We then analytically characterize how the estimator's utility is parameterized by the problem setting and parameter choices. When the distribution's variance is unknown, we design a heuristic hybrid estimator and analyze how it compares to the baselines. We find that it often performs better than the baselines, and sometimes almost as well as the known-variance estimator. We then answer the question of how our estimator's utility is affected when users' data are not drawn from the same distribution, but rather from distributions dependent on their trust model preference. Concretely, we examine the implications of the two groups' distributions diverging and show that in some cases, our estimators maintain fairly high utility. We then demonstrate how our hybrid estimator can be incorporated as a sub-component in more complex, higher-dimensional applications. Finally, we propose a new privacy amplification notion for the hybrid model that emerges due to interaction between the groups, and derive corresponding amplification results for our hybrid estimators.
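To illustrate the setting (a simplified sketch with hypothetical names, not the paper's estimator): curator-group values can be aggregated first and noised once in the trusted-curator model, while each local-group user noises their own value before aggregation; the two estimates are then combined. The size-weighted combination below stands in for the more careful variance-aware weighting the work analyzes.

```python
import math
import random

def laplace(scale, rng):
    # Sample Laplace(0, scale) via inverse-CDF transform.
    u = rng.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def hybrid_mean(curator_data, local_data, epsilon, bound=1.0, rng=None):
    """Hybrid-model DP estimate of the mean of values in [0, bound].

    Curator group: one Laplace draw on the group mean (central model,
    sensitivity bound / n_c). Local group: per-user Laplace noise with
    sensitivity bound (local model). Combined by group size.
    """
    rng = rng or random.Random()
    n_c, n_l = len(curator_data), len(local_data)
    central = sum(curator_data) / n_c + laplace(bound / (n_c * epsilon), rng)
    local = sum(x + laplace(bound / epsilon, rng) for x in local_data) / n_l
    return (n_c * central + n_l * local) / (n_c + n_l)
```

Even this crude combination shows the hybrid model's appeal: the central component contributes low-noise information from the few users who trust the curator, while the local component lets everyone else participate under the stronger local guarantee.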

Pre-recorded presentation

Summary:

We present DPareto: a practical approach to the automatic discovery of privacy-utility Pareto fronts, with a focus on machine learning applications.
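As a toy illustration of what a privacy-utility Pareto front is (DPareto's search strategy is not shown; this hypothetical helper only filters dominated points among already-evaluated candidates, assuming both privacy cost and utility loss are to be minimized):

```python
def pareto_front(points):
    """Return the non-dominated points among (privacy_cost, utility_loss)
    pairs, where smaller is better in both coordinates."""
    def dominates(q, p):
        # q dominates p: no worse in both coordinates, strictly better in one.
        return q[0] <= p[0] and q[1] <= p[1] and (q[0] < p[0] or q[1] < p[1])
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

For example, among the candidates `[(1, 5), (2, 3), (3, 4), (4, 1)]` the point `(3, 4)` is dominated by `(2, 3)` and drops out, leaving the front `[(1, 5), (2, 3), (4, 1)]`.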
