Polishing copy number variant calls on exome sequencing data via deep learning

Özden, Furkan

Files

10411614.pdf (1.46 MB)

Date

2021-07

Authors

Özden, Furkan

Advisor

Çiçek, A. Ercüment

Abstract

Accurate and efficient detection of copy number variants (CNVs) is of critical importance due to their significant association with complex genetic diseases. Although algorithms that use whole genome sequencing (WGS) data provide stable results with mostly valid statistical assumptions, copy number detection on whole exome sequencing (WES) data shows comparatively lower accuracy. This is unfortunate, as WES data is cost-efficient, compact, and relatively ubiquitous. The bottleneck is primarily due to the non-contiguous nature of the targeted capture: biases in targeted genomic hybridization, GC content, targeting probes, and sample batching during sequencing. Here, we present a novel deep learning model, DECoNT, which uses matched WES and WGS data and learns to correct the copy number variations reported by any off-the-shelf WES-based germline CNV caller. We train DECoNT on the 1000 Genomes Project data, and we show that we can efficiently triple the duplication call precision and double the deletion call precision of the state-of-the-art algorithms. We also show that our model consistently improves performance independent of (i) sequencing technology, (ii) exome capture kit, and (iii) CNV caller. Using DECoNT as a universal exome CNV call polisher has the potential to improve the reliability of germline CNV detection on WES data sets.

Keywords

Copy number variation, Whole exome sequencing, Deep learning

Degree Discipline

Computer Engineering

Degree Level

Master's

Degree Name

MS (Master of Science)

Permalink

http://hdl.handle.net/11693/76442

Collections

Graduate School of Engineering and Science

Language

English

Type

Thesis
