Recent Advances in Privacy-Preserving Data Analytics
For many years, big companies have been collecting and storing your personal data. These data include your location traces, online browsing histories, purchasing histories, daily activities and social relationships, health status, and even voice recordings. They are collected via a variety of platforms, including computers, smartphones, and Internet-of-Things (IoT) devices, and are used for many data analytics purposes, such as targeted advertising, location-based services, and the gathering of statistical information. While these data help enhance user experience and improve companies’ products and services, they also raise severe privacy concerns. It is therefore critical to adopt technical measures that protect users’ private data without hampering data utility. One of the most promising approaches is to obfuscate or perturb users’ data before they are uploaded to a server, with formal privacy metrics such as local differential privacy (LDP) used to measure the privacy leakage. The main challenge, however, is to guarantee a small amount of leakage while still enabling data queries and analytics with reasonable accuracy. In this talk, I will present our recent research advances in privacy-preserving data analytics in the local setting. We first focus on enhancing the utility of complex query types under LDP guarantees: we design efficient data perturbation mechanisms for correlated multi-dimensional data such as key-value data, and we support hybrid queries (both individual and statistical) on location data. A second effort exploits context-aware privacy definitions, where incorporating prior knowledge about the data allows us to design adaptive data perturbation mechanisms that improve the utility/privacy tradeoff. Finally, I will discuss some future research challenges and open problems in this area.
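
For readers new to LDP, the perturb-before-upload idea in the abstract can be illustrated with binary randomized response, a textbook LDP mechanism. This is only a minimal sketch and not one of the mechanisms presented in the talk; the function names, the privacy budget of 1.0, and the synthetic population are all assumptions made for illustration. Each user reports their true bit with probability e^ε / (e^ε + 1) and the flipped bit otherwise, and the server corrects for the known flipping rate to estimate the population fraction.

```python
import math
import random


def perturb_bit(bit, epsilon, rng):
    """Client side: keep the true bit with probability e^eps / (e^eps + 1),
    otherwise report the flipped bit. The ratio of the two report
    probabilities is e^eps, which is the epsilon-LDP bound."""
    p_truth = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    return bit if rng.random() < p_truth else 1 - bit


def estimate_fraction(reports, epsilon):
    """Server side: unbiased estimate of the fraction of users whose true
    bit is 1, using E[observed] = (1 - p) + f * (2p - 1)."""
    p = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    observed = sum(reports) / len(reports)
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)


# Hypothetical population: 30% of 10,000 users hold the sensitive bit.
rng = random.Random(0)
true_bits = [1] * 3000 + [0] * 7000
epsilon = 1.0  # assumed privacy budget, chosen only for this example
reports = [perturb_bit(b, epsilon, rng) for b in true_bits]
estimate = estimate_fraction(reports, epsilon)
print(f"estimated fraction: {estimate:.3f}")
```

The estimate is noisy for any single run, and the noise grows as ε shrinks. That is the utility/privacy tradeoff the abstract refers to.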