Causal mutation discovery using next generation sequencing data: development and application of a pipeline to reduce false positive calls and to map regions of shared homozygosity and IBD

Date

2012-11

Source Title

62nd Annual Meeting of the American Society of Human Genetics, ASHG 2012

Publisher

American Society of Human Genetics

Language

English

Abstract

Next-generation sequencing technologies have brought enormous successes for disease gene discovery but also challenges for data analysis, particularly in genomic regions with low or low-quality sequence coverage. Errors in variant calling may cause true variants to be missed or many false positives to be called. The false discovery rate can be reduced by optimizing variant calling thresholds, such as those on the quality of base calling, mapping, and alignment. However, such optimization strategies often come at the cost of losing true variants. We present and apply a pipeline for variant identification and verification that uses aligned sequences of related individuals. It comprises three modules: (1) an identification pipeline for de novo variants, in which data from parents and siblings are aligned in order to rule out false positive calls in children, false negative calls in parents, and indel artifacts; (2) a homozygosity mapping and IBD analysis module; and (3) a variant read depth module that reveals variants that may have been missed due to sequence coverage and quality issues. We applied module (1) to a large trio-based gene discovery project and reduced the number of variant calling errors by 74%, thereby significantly streamlining the experimental validation protocol for potential de novo variants. We also applied the pipeline to the discovery of the gene responsible for mega corpus callosum and microcephaly with developmental delay and epilepsy in a brother and sister whose unaffected parents were first cousins. Our error correction pipeline significantly improved homozygosity mapping and IBD analysis and facilitated the rapid identification of the causal allele in this family.
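
The abstract gives no implementation details, so the following is only a minimal sketch of the kind of checks that modules (1) and (2) describe, not the authors' pipeline. All names (`SiteCounts`, `classify_de_novo`, `shared_homozygous_runs`), the genotype string encoding, and every threshold value are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteCounts:
    """Reference and alternate read counts for one sample at one site (hypothetical type)."""
    ref: int
    alt: int

    @property
    def depth(self) -> int:
        return self.ref + self.alt

    @property
    def alt_fraction(self) -> float:
        return self.alt / self.depth if self.depth else 0.0


def classify_de_novo(child, mother, father,
                     min_depth=10,                  # assumed threshold
                     min_child_alt_fraction=0.25,   # assumed threshold
                     max_parent_alt_fraction=0.02): # assumed threshold
    """Re-examine a putative de novo call using read counts from the whole trio."""
    # A child call with little read support is a likely false positive.
    if child.depth < min_depth or child.alt_fraction < min_child_alt_fraction:
        return "child_unsupported"
    for parent in (mother, father):
        # Without enough parental coverage, inheritance cannot be excluded.
        if parent.depth < min_depth:
            return "parent_low_coverage"
        # Alternate reads in a parent suggest a missed (false negative) parental call.
        if parent.alt_fraction > max_parent_alt_fraction:
            return "inherited_or_parent_false_negative"
    return "candidate_de_novo"


def shared_homozygous_runs(genotypes_a, genotypes_b, min_markers=3):
    """Return (start, end) marker index pairs, end exclusive, where both
    individuals carry the same homozygous genotype for at least min_markers
    consecutive markers. Genotypes are assumed to be '0/0', '0/1' or '1/1'."""
    n = min(len(genotypes_a), len(genotypes_b))
    runs, start = [], None
    for i in range(n):
        a, b = genotypes_a[i], genotypes_b[i]
        shared = a == b and a in ("0/0", "1/1")
        if shared and start is None:
            start = i
        elif not shared and start is not None:
            if i - start >= min_markers:
                runs.append((start, i))
            start = None
    # Close a run that extends to the last marker.
    if start is not None and n - start >= min_markers:
        runs.append((start, n))
    return runs
```

In this sketch, a site is kept as a de novo candidate only when the child has adequate support and both parents are well covered with essentially no alternate reads. The run finder works on marker indices rather than genomic coordinates; a real analysis would also need physical or genetic distances and some tolerance for genotyping errors.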
