Browsing by Subject "Genome privacy"
Now showing 1 - 3 of 3
Item Unknown
Analyzing the effect of kinship for re-identification attacks in genomic data sharing beacons (2019-08) Ayşen, Miray
Genomic data contains sensitive information about an individual. If a person's genome sequence is obtained, family members' genome sequences can be reconstructed with high confidence, or the individual may face discrimination because of a predisposition to a disease. To protect genomic information and provide a standardized and secure way of using this data, the "Beacon project" was initiated. Studies show that genomic data sharing beacons are vulnerable to re-identification attacks. Since beacons are generally constructed based on types of diseases, re-identification creates a significant risk for individuals. On the other hand, genomic data enables researchers to find the causes of diseases and improves personalized medicine. Earlier research shows that previously proposed countermeasures against re-identification attacks are not effective. In this thesis, we analyze the effect of kin relationships on genomic data sharing beacons. Our study is based on the fact that kinship may be misleading for re-identification attacks, since the same SNPs can appear in multiple family members. We show that adding at least one of the parents to the beacon causes (i) a significant decrease in the power of attacks and (ii) an increase in the number of queries needed to confirm an individual's beacon membership. To investigate the suitability of using kinship as a countermeasure for beacons, we also calculate the decrease in utility. We further show the effects of adding more distant relatives, such as grandparents, to the beacon.

Item Unknown
Privacy-preserving search for a similar genomic makeup in the cloud (Institute of Electrical and Electronics Engineers Inc., 2021-04-20) Zhu, X.; Vitenberg, R.; Veeraragavan, N. R.; Ayday, Erman
The increasing affordability of genome sequencing and, as a consequence, the widespread availability of genomic data open up new opportunities for the field of medicine, as is also evident from the emergence of popular cloud-based offerings in this area, such as Google Genomics [1]. To utilize this data more efficiently, it is crucial that different entities share their data with each other. However, such data sharing is risky, mainly due to privacy concerns. In this article, we attempt to provide a privacy-preserving and efficient solution for the "similar patient search" problem among several parties (e.g., hospitals) by addressing the shortcomings of previous attempts. We consider a scenario in which each hospital has its own genomic dataset and the goal of a physician (or researcher) is to search for a patient similar to a given one (based on genomic makeup) among all the hospitals in the system. To enable this search, we propose a hierarchical index structure that indexes each hospital's dataset with a low memory requirement. Furthermore, we develop a novel privacy-preserving index merging mechanism that generates a common search index from the individual indices of each hospital to significantly improve search efficiency. We also consider the storage of medical information associated with the genomic data of a patient (e.g., diagnosis and treatment). We allow access to this information via a fine-grained access control policy that we develop by combining standard symmetric encryption and ciphertext-policy attribute-based encryption. Using this mechanism, a physician can search for similar patients and obtain medical information about the matching records if the access policy holds. We conduct experiments on large-scale genomic data and show the high efficiency of the proposed scheme.

Item Open Access
Re-identification of individuals in genomic data-sharing beacons via allele inference (2017-10) Thenen, Nora von
Genomic datasets are often associated with sensitive phenotypes. Therefore, the leak of membership information is a major privacy risk. Genomic beacons aim to provide a secure, easy-to-implement, and standardized interface for data sharing by only allowing yes/no queries on the presence of specific alleles in the dataset. Previously deemed secure against re-identification attacks, beacons were shown to be vulnerable despite their stringent policy. Recent studies have demonstrated that it is possible to determine whether the victim is in the dataset by repeatedly querying the beacon for his/her single nucleotide polymorphisms (SNPs). In this thesis, we propose a novel re-identification attack and show that the privacy risk is more serious than previously thought. Using the proposed attack, even if the victim systematically hides informative SNPs (i.e., SNPs with very low minor allele frequency, MAF), it is possible to infer the alleles at positions of interest, as well as the beacon query results, with very high confidence. Our method is based on the fact that alleles at different loci are not necessarily independent. We use linkage disequilibrium and a high-order Markov chain-based algorithm for the inference. We show that in a simulated beacon with 65 individuals from the CEU population, we can infer membership of individuals with 95% confidence with only 5 queries, even when SNPs with MAF less than 0.05 are hidden. This means that we need less than 0.5% of the number of queries that existing works require to determine beacon membership under the same conditions. We further show that countermeasures such as hiding certain parts of the genome or setting a query budget for the user would fail to protect the privacy of the participants under our adversary model.
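
The yes/no query interface described in the beacon abstracts above can be illustrated with a small sketch. This is a toy model for orientation only, not the attack or the countermeasure from either thesis: the `ToyBeacon` class, the `yes_fraction` helper, and the allele labels are all hypothetical, and real attacks use statistical tests over allele frequencies rather than a raw count of "yes" answers. It shows why repeated queries for a victim's SNPs leak membership, and why a relative in the beacon can produce "yes" answers even when the victim is absent.

```python
from typing import Dict, Iterable, Set

# Hypothetical encoding: each genome is the set of minor alleles a person carries.
Genome = Set[str]


class ToyBeacon:
    """Toy beacon: answers only yes/no on whether any member carries an allele."""

    def __init__(self, members: Dict[str, Genome]) -> None:
        # Pool all members' alleles; the beacon never reveals who carries what.
        self._alleles: Set[str] = set().union(*members.values())

    def query(self, allele: str) -> bool:
        return allele in self._alleles


def yes_fraction(beacon: ToyBeacon, victim_alleles: Iterable[str]) -> float:
    """Fraction of the victim's alleles for which the beacon answers 'yes'."""
    alleles = list(victim_alleles)
    if not alleles:
        return 0.0
    return sum(beacon.query(a) for a in alleles) / len(alleles)


# Made-up individuals: the parent shares two of the victim's four alleles.
victim: Genome = {"snp1", "snp2", "snp3", "snp4"}
parent: Genome = {"snp1", "snp2", "snp9"}
stranger: Genome = {"snp7", "snp8"}

with_victim = ToyBeacon({"victim": victim, "stranger": stranger})
without_victim = ToyBeacon({"stranger": stranger})
with_parent_only = ToyBeacon({"parent": parent, "stranger": stranger})

print(yes_fraction(with_victim, victim))
print(yes_fraction(without_victim, victim))
print(yes_fraction(with_parent_only, victim))
```

In this toy setup the beacon containing the victim answers "yes" to every query, the beacon with only an unrelated member answers "no" to every query, and the beacon containing only the parent answers "yes" to half of them, which is the kind of ambiguity the kinship abstract refers to.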