Privacy preserving protocol for detecting genetic relatives using rare variants.
- Author(s): Hormozdiari, Farhad
- Joo, Jong Wha J
- Wadia, Akshay
- Guan, Feng
- Ostrovsky, Rafail
- Sahai, Amit
- Eskin, Eleazar
- et al.
Published Web Location: https://doi.org/10.1093/bioinformatics/btu294
High-throughput sequencing technologies have impacted many areas of genetic research. One such area is the identification of relatives from genetic data. The standard approach for identifying genetic relatives collects the genomic data of all individuals and stores it in a database. Each pair of individuals is then compared to detect genetic relatives, and the matched individuals are informed. The main drawback of this approach is that individuals must share their genetic data with a trusted third party, which performs the relatedness test.

In this work, we propose a secure protocol to detect genetic relatives from sequencing data without exposing any information about their genomes. We assume that individuals have access to their genome sequences but do not want to share their genomes with anyone else. Unlike previous approaches, our approach uses both common and rare variants, which makes it possible to detect much more distant relationships securely. Using simulated data generated from the 1000 Genomes data, we show that we can easily detect relatives up to fifth-degree cousins, which was not possible using existing methods. We also show that our method can detect the individuals with cryptic relationships in the 1000 Genomes data.

The software is freely available for download at http://genetics.cs.ucla.edu/crypto/.
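
To make the standard, non-private approach described in the abstract concrete, here is a minimal sketch of a pairwise comparison over a central database. It is not the secure protocol proposed in the paper and is not taken from the released software. The function names, the Jaccard overlap score, the 0.3 threshold, and the toy variant identifiers are all assumptions chosen for illustration only.

```python
from itertools import combinations


def shared_rare_fraction(a, b):
    """Jaccard overlap of two sets of rare-variant identifiers (illustrative score)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def find_candidate_relatives(database, threshold):
    """Compare every pair of individuals and return pairs scoring at or above threshold.

    `database` maps an individual's name to the set of rare variants they carry.
    This mirrors the centralized approach: whoever runs it sees every genome.
    """
    matches = []
    for name_a, name_b in combinations(sorted(database), 2):
        score = shared_rare_fraction(database[name_a], database[name_b])
        if score >= threshold:
            matches.append((name_a, name_b, score))
    return matches


# Hypothetical toy data: variant identifiers are made up for this example.
database = {
    "alice": {"chr1:1001:A>G", "chr2:2002:C>T", "chr7:7007:G>A"},
    "bob": {"chr1:1001:A>G", "chr2:2002:C>T", "chr9:9009:T>C"},
    "carol": {"chr3:3003:G>C", "chr5:5005:A>T"},
}

# The 0.3 threshold is an arbitrary assumption, not a value from the paper.
matches = find_candidate_relatives(database, threshold=0.3)
print(matches)
```

The sketch shows the privacy cost the abstract points to: the party running `find_candidate_relatives` needs every individual's variant set in the clear, which is the requirement the proposed protocol is designed to avoid.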