
PDA: Semantically Secure Time-Series Data Analytics with Dynamic User Groups

Third-party analysis of private records is becoming increasingly important as data is collected at scale for diverse analysis purposes. However, data in its original form often contains sensitive information about individuals, and publishing it would severely breach their privacy. In this paper, we present PDA, a novel Privacy-preserving Data Analytics framework that allows a third-party aggregator to obliviously conduct many different types of polynomial-based analysis on private data records provided by a dynamic subgroup of users. Notably, each user needs to keep only O(n) keys to join data analysis among O(2^n) different groups of users, and any data analysis that can be represented by polynomials is supported by our framework. A real implementation shows that the performance of our framework is comparable to that of peer works presenting ad-hoc solutions for specific data analysis applications. Moreover, PDA is provably secure against a very powerful attacker (chosen-plaintext attack), even in the Dolev-Yao network model, where all communication channels are insecure.

Existing System:

Driven by such needs, sharing and analysis of user-generated data has become common. For example, the President's Information Technology Advisory Committee (PITAC) released a report calling for a nation-wide electronic medical records sharing system to share medical knowledge for various clinical decision-making tasks; third-party data analytics and data mining services are widely available; and it is now commonplace for companies to monetize users' data. User-generated data is released for third-party analysis in such cases, but this data in its original form contains much sensitive information about individuals, and its publication has sparked active discussions in both industry and academia.

Current primary practice relies on (1) privacy-preserving data publishing (with sanitization or perturbation) to remove sensitive information from the publication, or (2) regulation enforcement to restrict the usage of the published data. The former approach protects privacy by removing personally identifiable information; recent privacy-preserving data publication techniques enable data sanitization even without a trusted central authority.
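To make the perturbation approach concrete, a common technique (illustrative only, and not the mechanism PDA itself uses) is to add calibrated Laplace noise to the answer of an aggregate query before release. The function names below are hypothetical; a minimal Python sketch:

```python
import math
import random

def laplace_noise(scale):
    """Sample from Laplace(0, scale) via inverse-CDF sampling."""
    u = random.random() - 0.5            # uniform on [-0.5, 0.5)
    sign = 1 if u >= 0 else -1
    return -scale * sign * math.log(1 - 2 * abs(u))

def perturbed_count(records, predicate, scale=1.0):
    """Answer a counting query with additive Laplace noise,
    so the exact count of matching records is never published."""
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(scale)
```

The published value is the true count plus zero-mean noise, so individual contributions are masked while aggregate statistics remain approximately accurate.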

Proposed System:

1) Instead of presenting ad-hoc solutions for individual applications, PDA enables any type of data analysis that is based on polynomials to be performed on a private dataset distributed among individuals.

2) After initial setup, PDA supports time-series data analysis among any subgroup of the n users, and the whole group is scalable as well. Notably, the key storage complexity is O(n), while similar schemes require O(2^n).

3) Our proposed scheme is provably secure against chosen-plaintext attacks. Specifically, any probabilistic polynomial-time (PPT) adversary's advantage is polynomially bounded by the advantage of solving widely believed hard problems such as the Decisional Diffie-Hellman (DDH) and Decisional Composite Residuosity problems.
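PDA's concrete construction is not spelled out in this summary, but the flavor of a DDH-based scheme with the properties above can be illustrated with "lifted" ElGamal, an additively homomorphic encryption whose semantic (CPA) security rests on DDH: an aggregator can multiply ciphertexts to obtain an encryption of the sum of the users' values without learning any individual value. The parameters below (a toy prime, generator, and plaintext bound) are assumptions chosen for illustration only and offer no real security:

```python
import random

# Toy "lifted" ElGamal over Z_p*. CPA security of the real scheme
# rests on DDH; this tiny prime is for demonstration only.
p = 1000003          # small illustrative prime (assumption)
g = 2                # illustrative base element (assumption)

def keygen():
    x = random.randrange(2, p - 1)            # secret key
    return x, pow(g, x, p)                    # (sk, pk = g^x)

def encrypt(pk, m):
    r = random.randrange(2, p - 1)            # fresh randomness per ciphertext
    return (pow(g, r, p), (pow(g, m, p) * pow(pk, r, p)) % p)

def add_ciphertexts(c1, c2):
    # Componentwise product encrypts the SUM of the two plaintexts.
    return ((c1[0] * c2[0]) % p, (c1[1] * c2[1]) % p)

def decrypt_small(sk, c, bound=1000):
    # Strip the mask to recover g^m, then brute-force the small
    # discrete log (feasible only for small aggregate values).
    gm = (c[1] * pow(c[0], p - 1 - sk, p)) % p
    for m in range(bound):
        if pow(g, m, p) == gm:
            return m
    raise ValueError("plaintext out of range")
```

For example, two users encrypt 5 and 7 under the same public key; the aggregator multiplies the ciphertexts, and decryption yields 12 without either plaintext ever being exposed. Richer polynomial statistics (e.g., variance) follow by additionally aggregating encryptions of squared values.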
