Enhancing Privacy-Preserving Data Science: Advances in Foundation Models and the US Census
Speaker: Chendi Wang, Assistant Professor, Xiamen University    Time: 13:30, October 28, 2024

Venue: Room 665, Academic Activity Room, Xingjian Building

Host: Professor Hailin Sun

Abstract:

In this talk, we will introduce several recent advances aimed at improving the performance of differential privacy (DP) mechanisms in data science.

First, we will explore the accuracy of computer vision models when privately fine-tuning foundation models such as the Vision Transformer. Our findings indicate that when features are well extracted, specifically when approximate neural collapse (a phenomenon in representation learning) occurs during fine-tuning, the misclassification error decays exponentially, resulting in nearly no privacy-utility tradeoff. Moreover, based on our theoretical insights, we will demonstrate that DP learning is inherently less robust to perturbations than its non-private counterpart. To address this, we reveal that dimension reduction methods, such as PCA, can enhance feature quality and, consequently, improve the robustness of DP fine-tuning.
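To make the setting of the first part concrete, here is a minimal sketch, not the implementation from the referenced paper, of noisy gradient descent (NoisyGD) on a linear head trained over PCA-reduced features. The synthetic data and every hyperparameter below (PCA dimension k, clipping norm C, noise multiplier sigma, step size, number of steps) are illustrative assumptions.

```python
# Sketch: NoisyGD on PCA-reduced features (illustrative only; synthetic data).
import numpy as np

rng = np.random.default_rng(0)

# Placeholder "extracted features": n samples, d dims, binary labels in {-1, +1}.
n, d, k = 200, 128, 16                     # k = assumed PCA target dimension
X = rng.normal(size=(n, d))
y = rng.integers(0, 2, size=n) * 2 - 1

# PCA via SVD: project features onto the top-k principal directions.
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
Z = Xc @ Vt[:k].T                          # reduced features, shape (n, k)

# NoisyGD on a linear classifier with logistic loss:
# per-example gradients are clipped to norm C, summed, and Gaussian noise is added.
C, sigma, lr, T = 1.0, 2.0, 0.5, 100       # clipping norm, noise multiplier, step size, steps
w = np.zeros(k)
for _ in range(T):
    margins = y * (Z @ w)
    per_ex_grad = (-y / (1 + np.exp(margins)))[:, None] * Z     # gradient of log(1+exp(-y w.z))
    norms = np.linalg.norm(per_ex_grad, axis=1, keepdims=True)
    clipped = per_ex_grad / np.maximum(1.0, norms / C)          # per-example clipping
    noisy_sum = clipped.sum(axis=0) + rng.normal(scale=sigma * C, size=k)
    w -= lr * noisy_sum / n

print("training accuracy:", np.mean(np.sign(Z @ w) == y))
```

In practice the features would come from a frozen foundation model rather than random data; the only point of the sketch is that clipping and noise injection act in the k-dimensional PCA space instead of the full feature space.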

Second, we will present a state-of-the-art privacy analysis of US Census data, one of the most significant applications of DP implemented by the US government. Specifically, by leveraging the recently proposed f-differential privacy (f-DP) framework, we developed a method to measure the privacy level of the US Census, which outperforms the current method employed by the US Census Bureau. Our approach improves the accuracy of the decennial census data by nearly 10%.
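As a small illustration of the kind of accounting involved, the sketch below considers a single Gaussian mechanism with sensitivity 1 and noise scale sigma; this is an assumed toy setting, not the census analysis itself. Under f-DP, such a mechanism satisfies mu-Gaussian DP with mu = 1/sigma, and the exact (epsilon, delta) guarantees it implies follow from the conversion in Dong, Roth, and Su's Gaussian DP framework.

```python
# Sketch: f-DP bookkeeping for one Gaussian mechanism (illustrative; requires scipy).
import numpy as np
from scipy.stats import norm

def gdp_tradeoff(alpha, mu):
    """Tradeoff function of mu-GDP: smallest type-II error at type-I error alpha."""
    return norm.cdf(norm.ppf(1 - alpha) - mu)

def gdp_to_delta(eps, mu):
    """Exact delta(eps) implied by mu-GDP (Dong, Roth & Su conversion)."""
    return norm.cdf(-eps / mu + mu / 2) - np.exp(eps) * norm.cdf(-eps / mu - mu / 2)

sigma = 2.0          # illustrative noise scale
mu = 1.0 / sigma     # Gaussian DP parameter for sensitivity-1 queries

alphas = np.linspace(0.0, 1.0, 6)
print("tradeoff curve:", [round(float(gdp_tradeoff(a, mu)), 3) for a in alphas])

for eps in (0.5, 1.0, 2.0):
    print(f"eps={eps}: delta={gdp_to_delta(eps, mu):.2e}")
```

Working with the full tradeoff curve, rather than a single (epsilon, delta) pair, is what allows tighter privacy accounting than conventional conversions; the talk applies this idea to the mechanisms used for the 2020 Decennial Census.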

References:

1. Neural Collapse Meets Differential Privacy: Curious Behaviors of NoisyGD with Near-perfect Representation Learning. Wang et al., ICML'24 (oral presentation).

2. The 2020 United States Decennial Census Is More Private Than You (Might) Think. Su et al., Submitted.

