Privacy-preserving search for a similar genomic makeup in the cloud

Date
2021-04-20
Source Title
IEEE Transactions on Dependable and Secure Computing
Print ISSN
1545-5971
Electronic ISSN
1941-0018
Publisher
Institute of Electrical and Electronics Engineers Inc.
Volume
19
Issue
4
Pages
2771 - 2788
Language
English
Abstract

The increasing affordability of genome sequencing and the resulting widespread availability of genomic data open up new opportunities for medicine, as is also evident from the emergence of popular cloud-based offerings in this area, such as Google Genomics [1]. To use this data more efficiently, it is crucial that different entities share their data with each other. However, such data sharing is risky, mainly because of privacy concerns. In this article, we aim to provide a privacy-preserving and efficient solution to the “similar patient search” problem among several parties (e.g., hospitals) by addressing the shortcomings of previous attempts. We consider a scenario in which each hospital has its own genomic dataset and the goal of a physician (or researcher) is to search, among all the hospitals in the system, for a patient similar to a given one (based on genomic makeup). To enable this search, we propose a hierarchical index structure that indexes each hospital’s dataset with a low memory requirement. Furthermore, we develop a novel privacy-preserving index merging mechanism that generates a common search index from the individual indices of the hospitals to significantly improve search efficiency. We also consider the storage of medical information associated with a patient’s genomic data (e.g., diagnosis and treatment). We allow access to this information via a fine-grained access control policy that we develop by combining standard symmetric encryption with ciphertext-policy attribute-based encryption. Using this mechanism, a physician can search for similar patients and obtain medical information about the matching records if the access policy holds. We conduct experiments on large-scale genomic data and show the high efficiency of the proposed scheme.
