Preserving Genomic Privacy via Selective Sharing

Proc ACM Workshop Priv Electron Soc. 2020 Nov:2020:163-179. doi: 10.1145/3411497.3420214. Epub 2020 Nov 9.

Abstract

Although genomic data is widely used in medical research and has a significant impact on it, sharing such data endangers individuals' privacy, even when they share it anonymously or only partially. To address this problem, we present a framework, inspired by differential privacy, for sharing individuals' genomic data while preserving their privacy. We assume an individual who does not want to share a sensitive portion of her genome (e.g., mutations or single nucleotide polymorphisms (SNPs) that reveal sensitive information about her). The goals of the individual are to (i) preserve the privacy of her sensitive data (considering the correlations between the sensitive and non-sensitive parts), (ii) preserve the privacy of interdependent data (data that belongs to other individuals and is correlated with her data), and (iii) share as much non-sensitive data as possible to maximize the utility of data sharing. As opposed to traditional differential privacy-based data sharing schemes, the proposed scheme does not intentionally add noise to the data; it is based on selective sharing of data points. We observe that the traditional differential privacy concept does not capture data sharing in such a setting, and hence we first introduce a privacy notion, ϵ-indirect privacy, that addresses data sharing in such settings. We show that the proposed framework does not leak sensitive information to the attacker while providing high data sharing utility. We also compare the proposed technique with previous ones and show its advantage in terms of both privacy and data sharing utility.

Keywords: Data Sharing; Differential Privacy; Genomics; Privacy.