Genome reconstruction in beacons using summary statistics

Date

2026-01

Advisor

Sav, Sinem

Co-Advisor

Çiçek, Abdullah Ercüment

Abstract

Genomic data-sharing beacons are designed to safeguard individual privacy while promoting scientific discovery, yet they remain vulnerable to genome reconstruction attacks that leverage publicly released summary statistics. This thesis advances the understanding and effectiveness of these attacks and challenges the assumption that releasing simple allele frequencies (AFs) is a secure protocol. The fundamental flaw lies in the beacon protocol's failure to account for linkage disequilibrium (LD), which allows a malicious party to infer individual data from combined summary statistics. Our foundational contribution established the feasibility of this threat with a two-stage optimization-based algorithm that used public LD and AFs, achieving an F1-score of 70% and confirming the inherent privacy risk. Building on this, the research introduces a more powerful methodology: a single-stage joint optimization framework that unifies the objectives of SNP correlation and allele frequency alignment. This formulation increases reconstruction performance to an average F1-score of 71.4% and also yields substantial computational savings: reconstructing 2,000 SNPs across 100 individuals now requires 7.4 hours instead of 10 hours, a 26% reduction in runtime. Together, these results show that genome reconstruction attacks against beacon protocols are becoming more practical and sophisticated, underscoring the urgent need for robust, adaptive, and correlation-aware defense mechanisms to protect the integrity and privacy of genomic data infrastructure.
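
To make the idea of a single-stage joint objective concrete, the following is a minimal illustrative sketch and not the thesis's actual formulation or solver. It assumes a binary genotype matrix (1 if an individual carries the alternate allele at a SNP), uses the Pearson correlation between SNP columns as a stand-in for LD, and minimizes one combined loss with a simple greedy bit-flip search. The function names, the `weight` parameter, and the search strategy are all hypothetical choices made for illustration.

```python
import numpy as np

def allele_frequencies(genotypes):
    """Per-SNP fraction of individuals carrying the alternate allele."""
    return genotypes.mean(axis=0)

def snp_correlation(genotypes):
    """Pairwise Pearson correlation between SNP columns (assumed LD proxy).
    Columns with no variation contribute zero correlation."""
    centered = genotypes - genotypes.mean(axis=0)
    std = centered.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant columns
    z = centered / std
    return (z.T @ z) / genotypes.shape[0]

def joint_loss(genotypes, target_af, target_corr, weight=1.0):
    """Single objective: AF mismatch plus weighted correlation mismatch."""
    af_term = np.sum((allele_frequencies(genotypes) - target_af) ** 2)
    corr_term = np.sum((snp_correlation(genotypes) - target_corr) ** 2)
    return af_term + weight * corr_term

def reconstruct(target_af, target_corr, n_individuals, n_sweeps=20, seed=0):
    """Greedy bit-flip search (an assumed, simplistic optimizer):
    keep a flip only if it lowers the joint loss."""
    rng = np.random.default_rng(seed)
    n_snps = target_af.shape[0]
    # Start from a random matrix whose expected AFs match the targets.
    x = (rng.random((n_individuals, n_snps)) < target_af).astype(float)
    best = joint_loss(x, target_af, target_corr)
    for _ in range(n_sweeps):
        for i in range(n_individuals):
            for j in range(n_snps):
                x[i, j] = 1.0 - x[i, j]          # tentative flip
                trial = joint_loss(x, target_af, target_corr)
                if trial < best:
                    best = trial                  # accept the flip
                else:
                    x[i, j] = 1.0 - x[i, j]      # revert the flip
    return x, best

# Toy usage: derive target statistics from a small hidden matrix.
hidden = (np.random.default_rng(1).random((6, 4)) < 0.4).astype(float)
recon, final_loss = reconstruct(
    allele_frequencies(hidden), snp_correlation(hidden), n_individuals=6, n_sweeps=3
)
```

Because both terms sit in one loss, each candidate change is judged against allele frequencies and correlations at the same time, which is the property the abstract attributes to the single-stage framework. A practical attack at the scale reported in the abstract would need a far more efficient optimizer than this exhaustive flip search.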

Degree Discipline

Computer Engineering

Degree Level

Master's

Degree Name

MS (Master of Science)

Language

English

Type