Privacy preserving protocol for detecting genetic relatives using rare variants

Farhad Hormozdiari; Jong Wha J Joo; Akshay Wadia; Feng Guan; Rafail Ostrosky; Amit Sahai; Eleazar Eskin

doi:10.1093/bioinformatics/btu294

Bioinformatics. 2014 Jun 15;30(12):i204-11. doi: 10.1093/bioinformatics/btu294.

Authors

Farhad Hormozdiari¹, Jong Wha J Joo¹, Akshay Wadia¹, Feng Guan¹, Rafail Ostrosky¹, Amit Sahai¹, Eleazar Eskin²

Affiliations

¹ Department of Computer Science, Bioinformatics IDP, Department of Mathematics and Department of Human Genetics, University of California, Los Angeles, CA 90095, USA.
² Department of Computer Science, Bioinformatics IDP, Department of Mathematics and Department of Human Genetics, University of California, Los Angeles, CA 90095, USA.

Abstract

Motivation: High-throughput sequencing technologies have impacted many areas of genetic research. One such area is the identification of relatives from genetic data. The standard approach collects the genomic data of all individuals and stores it in a database. Each pair of individuals is then compared to detect genetic relatives, and the matched individuals are informed. The main drawback of this approach is that individuals must share their genetic data with a trusted third party, which performs the relatedness test.

Results: In this work, we propose a secure protocol that detects genetic relatives from sequencing data without exposing any information about their genomes. We assume that individuals have access to their own genome sequences but do not want to share them with anyone else. Unlike previous approaches, our approach uses both common and rare variants, which makes it possible to securely detect much more distant relationships. Using simulated data generated from the 1000 Genomes data, we show that the protocol detects relatives up to fifth-degree cousins, which was not possible with existing methods. We also show that our method detects cryptically related individuals in the 1000 Genomes data.
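
The idea of comparing rare variants without exchanging raw genomes can be illustrated with a toy sketch. This is not the protocol from the paper, whose construction the abstract does not describe. It assumes, purely for illustration, that two individuals already share a secret key, that each genome is reduced to a set of rare-variant identifier strings, and that keyed digests (HMAC-SHA256) of those identifiers are compared. The function names, the identifier format and the 0.2 threshold are all hypothetical, and keyed hashing alone does not give the security guarantee claimed for the real protocol.

```python
import hashlib
import hmac


def blind_variants(variants, shared_key):
    """Replace each variant identifier with a keyed digest (HMAC-SHA256)."""
    return {
        hmac.new(shared_key, v.encode("utf-8"), hashlib.sha256).hexdigest()
        for v in variants
    }


def shared_fraction(blinded_a, blinded_b):
    """Fraction of the smaller blinded set that also occurs in the other set."""
    if not blinded_a or not blinded_b:
        return 0.0
    overlap = len(blinded_a & blinded_b)
    return overlap / min(len(blinded_a), len(blinded_b))


def looks_related(variants_a, variants_b, shared_key, threshold=0.2):
    """Flag a pair as possibly related when enough blinded variants match.

    The threshold is an arbitrary illustrative value, not one from the paper.
    """
    blinded_a = blind_variants(variants_a, shared_key)
    blinded_b = blind_variants(variants_b, shared_key)
    return shared_fraction(blinded_a, blinded_b) >= threshold


# Hypothetical rare-variant identifiers (chromosome:position:change).
key = b"illustrative-shared-secret"
alice = {"chr1:1000:A>G", "chr2:2000:C>T", "chr3:3000:G>A", "chr4:4000:T>C"}
bob = {"chr3:3000:G>A", "chr4:4000:T>C", "chr5:5000:A>T", "chr6:6000:C>G"}
stranger = {"chr7:7000:G>T", "chr8:8000:A>C", "chr9:9000:T>G", "chr10:100:C>A"}

alice_bob_related = looks_related(alice, bob, key)
alice_stranger_related = looks_related(alice, stranger, key)
```

Rare variants are the natural unit for such a comparison because unrelated individuals are unlikely to share many of them by chance, so even a small overlap is informative about distant relatedness.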

Availability: The software is freely available for download at http://genetics.cs.ucla.edu/crypto/.

Publication types

Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't
Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

Genetic Privacy*
Genetic Variation*
Genome, Human*
Genomics / methods*
Haplotypes
High-Throughput Nucleotide Sequencing
Humans
Pedigree*
