Machine learning algorithms increasingly work with sensitive information
about individuals, and hence the problem of privacy-preserving data analysis --
how to design algorithms that operate on the sensitive data
of individuals while still guaranteeing the privacy of those individuals --
has gained great practical importance. In this
talk, we address two problems in differentially private data analysis.
First, we address the problem of privacy-preserving classification, and
present an efficient classifier which is private in the
differential privacy model of Dwork et al. Our classifier works in the
empirical risk minimization (ERM) framework and includes
privacy-preserving logistic regression and privacy-preserving support
vector machines. We show that our classifier is private, provide
analytical bounds on its sample requirement, and evaluate it on real data.
Next, we address the question of differentially private statistical
estimation. We draw a concrete connection between differential privacy and
gross error sensitivity, a measure of the robustness of a statistical
estimator, and show how these two notions are quantitatively related.
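For reference, the two notions being related have standard textbook definitions, sketched below. The notation is an assumption of this note and not necessarily that of the talk: A is a randomized algorithm, D and D' are data sets differing in a single individual's record, epsilon is the privacy parameter, T is an estimator viewed as a functional of a distribution F, and delta_x is a point mass at x.

```latex
% Requires amsmath for \text. Notation is illustrative, not taken from the talk.
% epsilon-differential privacy: changing one record changes the probability
% of any output event S by at most a factor of e^epsilon.
\[
  \Pr[\mathcal{A}(D) \in S] \;\le\; e^{\epsilon}\, \Pr[\mathcal{A}(D') \in S]
  \quad \text{for all measurable } S \text{ and all neighboring } D, D'.
\]
% Influence function of T at F, and gross error sensitivity (GES) as its supremum.
\[
  \mathrm{IF}(x; T, F)
    = \lim_{\rho \to 0^{+}}
      \frac{T\bigl((1-\rho)F + \rho\,\delta_x\bigr) - T(F)}{\rho},
  \qquad
  \mathrm{GES}(T, F) = \sup_{x}\, \bigl|\mathrm{IF}(x; T, F)\bigr| .
\]
```

A smaller epsilon corresponds to a stronger privacy guarantee, and a smaller gross error sensitivity means that no single contaminating point can shift the estimate by much.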
Based on joint work with Claire Monteleoni (George Washington
University), Anand Sarwate (Rutgers University), and Daniel Hsu (Columbia University).