Inference attacks against differentially private query results from genomic datasets including dependent tuples

buir.contributor.author: Almadhoun, Nour
buir.contributor.author: Ayday, Erman
buir.contributor.author: Ulusoy, Özgür
dc.citation.epage: i145 [en_US]
dc.citation.issueNumber: 1 [en_US]
dc.citation.spage: i136 [en_US]
dc.citation.volumeNumber: 36 [en_US]
dc.contributor.author: Almadhoun, Nour [en_US]
dc.contributor.author: Ayday, Erman [en_US]
dc.contributor.author: Ulusoy, Özgür [en_US]
dc.date.accessioned: 2021-02-25T10:48:41Z
dc.date.available: 2021-02-25T10:48:41Z
dc.date.issued: 2020
dc.department: Department of Computer Engineering [en_US]
dc.description.abstract: Motivation: The rapid decrease in sequencing costs has led to a revolution in medical research and clinical care. Today, researchers have access to large genomic datasets to study associations between variants and complex traits. However, the availability of such genomic datasets also raises new privacy concerns about the personal information of study participants. Differential privacy (DP) is a rigorous privacy concept that has received widespread interest for sharing summary statistics from genomic datasets while protecting the privacy of participants against inference attacks. However, DP has a known drawback: it does not consider the correlation between dataset tuples. Therefore, the privacy guarantees of DP-based mechanisms may degrade if the dataset includes dependent tuples, which is a common situation for genomic datasets due to the inherent correlations between the genomes of family members.
Results: In this article, using two real-life genomic datasets, we show that exploiting the correlation between dataset participants results in significant information leakage from differentially private results of complex queries. We formulate this as an attribute inference attack and show the privacy loss in minor allele frequency (MAF) and chi-square queries. Our results show that, using the results of differentially private MAF queries and utilizing the dependency between tuples, an adversary can reveal up to 50% more sensitive information about the genome of a target (compared to the original privacy guarantees of standard DP-based mechanisms), while differentially private chi-square queries can reveal up to 40% more sensitive information. Furthermore, we show that the adversary can use the inferred genomic data obtained from the attribute inference attack to infer the membership of a target in another genomic dataset (e.g. associated with a sensitive trait). Using a log-likelihood-ratio test, our results also show that the inference power of the adversary can be significantly high in such an attack even using inferred (and hence partially incorrect) genomes. [en_US]
dc.identifier.doi: 10.1093/bioinformatics/btaa475 [en_US]
dc.identifier.issn: 1367-4811
dc.identifier.uri: http://hdl.handle.net/11693/75586
dc.language.iso: English [en_US]
dc.publisher: NLM (Medline) [en_US]
dc.relation.isversionof: https://dx.doi.org/10.1093/bioinformatics/btaa475 [en_US]
dc.source.title: Bioinformatics (Oxford, England) [en_US]
dc.title: Inference attacks against differentially private query results from genomic datasets including dependent tuples [en_US]
dc.type: Article [en_US]
Files
Original bundle
Name: Inference_attacks_against_differentially_private_query_results_from_genomic_datasets_including_dependent_tuples.pdf
Size: 918.49 KB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission