Towards Privacy Preserving Publishing of Set-valued Data on Hybrid Cloud

Abstract:

Storage as a service has become an important paradigm in cloud computing because of its flexibility and economic savings. However, its adoption is hampered by data privacy concerns: data owners no longer physically possess the storage that holds their data. In this work, we study the problem of privacy-preserving set-valued data publishing. Existing privacy-preserving techniques (such as encryption, suppression, and generalization) are not applicable in many real scenarios, since they incur either large query overhead or high information loss. Motivated by this observation, we present a suite of new techniques that make privacy-aware set-valued data publishing feasible on a hybrid cloud. In the data publishing phase, we propose a data partitioning technique, named extended quasi-identifier-partitioning (EQI-partitioning), which disassociates record terms that participate in identifying combinations. As a result, the cloud server cannot, with high probability, associate a record with rare term combinations. We prove the privacy guarantee of our mechanism. In the data querying phase, we adopt an interactive differential privacy strategy to resist privacy breaches from statistical queries. Finally, we evaluate the scheme's performance using real-life data sets on our cloud test-bed. Our extensive experiments demonstrate the validity and practicality of the proposed scheme.
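To make the querying phase more concrete, the following is a minimal sketch of how an interactive differentially private counting query could work. It is not the mechanism from the paper: it assumes the standard Laplace mechanism with a fixed total privacy budget, and the class name `PrivateCounter` and its parameters are hypothetical.

```python
import random


class PrivateCounter:
    """Hypothetical helper: answers counting queries with Laplace noise."""

    def __init__(self, records, total_epsilon, seed=None):
        self.records = [frozenset(r) for r in records]
        self.remaining = total_epsilon  # privacy budget left for future queries
        self.rng = random.Random(seed)

    def _laplace(self, scale):
        # The difference of two i.i.d. exponentials with mean `scale`
        # follows a Laplace(0, scale) distribution.
        rate = 1.0 / scale
        return self.rng.expovariate(rate) - self.rng.expovariate(rate)

    def count(self, terms, epsilon):
        """Return a noisy count of records containing every term in `terms`."""
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        if epsilon > self.remaining:
            raise RuntimeError("privacy budget exhausted")
        self.remaining -= epsilon
        wanted = set(terms)
        true_count = sum(1 for r in self.records if wanted <= r)
        # A counting query has sensitivity 1, so the noise scale is 1/epsilon.
        return true_count + self._laplace(1.0 / epsilon)
```

Each call spends part of the budget, so an analyst who keeps asking overlapping statistical queries is eventually refused rather than allowed to average the noise away.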

Existing System:

Data encryption with fine-grained access control is a natural solution. However, data processing based on cryptographic operations (such as homomorphic encryption and fine-grained cloud data access control) is still not up to the challenge posed by large-scale data operations, and it causes large overhead for publishing, querying, and other data operations. For example, homomorphic encryption has been found to be prohibitively expensive for big data queries, while the secret-sharing techniques underlying most outsourcing proposals lead to intensive data exchanges among the shareholders on the cloud whenever a computation involves an enormous amount of data, and are therefore hard to scale.

 

Proposed System:

To the best of our knowledge, this is the first study that formalizes the problem of privacy-preserving set-valued data publication over a hybrid cloud and provides a complete system framework. Our design proposes a novel data partitioning mechanism that splits the EQI into different chunks and ensures that private information is not exposed to the public cloud. Moreover, we incorporate interactive differential privacy into the proposed framework, which provides strong privacy guarantees.
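The idea of splitting identifying term combinations across chunks can be illustrated with a small sketch. This is a simplified stand-in, not the EQI-partitioning algorithm itself: it assumes that an "identifying combination" is a pair of terms co-occurring in fewer than `k` records, and it uses a greedy assignment. The function names and the threshold `k` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def rare_pairs(records, k):
    """Term pairs that co-occur in at least one but fewer than k records."""
    support = Counter()
    for rec in records:
        support.update(combinations(sorted(rec), 2))
    return {pair for pair, n in support.items() if n < k}


def assign_chunks(records, k):
    """Greedily place terms in chunks so that no rare pair shares a chunk."""
    conflicts = rare_pairs(records, k)
    chunks = []  # each chunk is a set of terms
    for term in sorted({t for rec in records for t in rec}):
        for chunk in chunks:
            clash = any(tuple(sorted((term, other))) in conflicts for other in chunk)
            if not clash:
                chunk.add(term)
                break
        else:
            # No existing chunk is safe for this term, so open a new one.
            chunks.append({term})
    return chunks


def project(records, chunks):
    """Project each record onto each chunk; sorting the rows drops record order."""
    return [
        sorted(sorted(set(rec) & chunk) for rec in records if set(rec) & chunk)
        for chunk in chunks
    ]
```

Because each chunk is published as an independent list of sub-records, a reader of the output cannot directly tell which sub-record in one chunk belongs with which sub-record in another. This sketch checks only pairwise combinations and carries no formal guarantee.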

We built a new query analysis tool that automatically transforms the structure of a query to optimize data querying on the hybrid cloud. The tool not only breaks down a query into sanitized sub-queries that can run on the public and private clouds respectively, but also helps control the amount of intermediate results delivered back to the private cloud, which ensures data confidentiality.
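The query decomposition described above might look roughly like the following sketch. It is illustrative only and does not reproduce the actual tool: it assumes conjunctive term queries, a known set of sensitive terms, record IDs shared by both stores, and a hard cap on intermediate results. All names (`split_query`, `run_query`, `max_intermediate`) are hypothetical.

```python
def split_query(terms, sensitive_terms):
    """Split a conjunctive query into a public part and a private part."""
    public = sorted(t for t in terms if t not in sensitive_terms)
    private = sorted(t for t in terms if t in sensitive_terms)
    return public, private


def run_query(terms, sensitive_terms, public_store, private_store, max_intermediate):
    """Evaluate a query in two stages and return the matching record IDs.

    public_store and private_store map a record ID to a set of terms.
    """
    public_q, private_q = split_query(set(terms), sensitive_terms)

    # Stage 1 (public cloud): only non-sensitive terms are ever sent here.
    candidates = [rid for rid, rec in public_store.items() if set(public_q) <= rec]

    # Limit how many intermediate results flow back to the private cloud.
    if len(candidates) > max_intermediate:
        raise ValueError("public sub-query too broad; refine the query")

    # Stage 2 (private cloud): apply the sensitive terms to the candidates.
    return sorted(
        rid for rid in candidates if set(private_q) <= private_store.get(rid, set())
    )
```

In this sketch the public cloud never sees the sensitive part of the query, and the cap keeps the private cloud from having to post-process an unbounded candidate set.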
