The effect of kinship in re-identification attacks against genomic data sharing beacons
i903 - i910
Item Usage Stats
Motivation: Big data era in genomics promises a breakthrough in medicine, but sharing data in a private manner limit the pace of field. Widely accepted ‘genomic data sharing beacon’ protocol provides a standardized and secure interface for querying the genomic datasets. The data are only shared if the desired information (e.g. a certain variant) exists in the dataset. Various studies showed that beacons are vulnerable to re-identification (or membership inference) attacks. As beacons are generally associated with sensitive phenotype information, re-identification creates a significant risk for the participants. Unfortunately, proposed countermeasures against such attacks have failed to be effective, as they do not consider the utility of beacon protocol. Results: In this study, for the first time, we analyze the mitigation effect of the kinship relationships among beacon participants against re-identification attacks. We argue that having multiple family members in a beacon can garble the information for attacks since a substantial number of variants are shared among kin-related people. Using family genomes from HapMap and synthetically generated datasets, we show that having one of the parents of a victim in the beacon causes (i) significant decrease in the power of attacks and (ii) substantial increase in the number of queries needed to confirm an individual’s beacon membership. We also show how the protection effect attenuates when more distant relatives, such as grandparents are included alongside the victim. Furthermore, we quantify the utility loss due adding relatives and show that it is smaller compared with flipping based techniques.