ECOLE: Learning to call copy number variants on whole exome sequencing data
buir.contributor.author | Kaynar, Gün | |
buir.contributor.author | Yılmaz, Mehmet Alper | |
buir.contributor.author | Alkan, Can | |
buir.contributor.author | Çiçek, A.Ercüment | |
buir.contributor.orcid | Kaynar, Gün|0009-0006-6764-7716 | |
buir.contributor.orcid | Yılmaz, Mehmet Alper|0009-0001-8933-823X | |
buir.contributor.orcid | Alkan, Can|0000-0002-5443-0706 | |
buir.contributor.orcid | Çiçek, A.Ercüment|0000-0001-8613-6619 | |
dc.citation.epage | 132-13 | |
dc.citation.issueNumber | 1 | |
dc.citation.spage | 132-1 | |
dc.citation.volumeNumber | 15 | |
dc.contributor.author | Mandıracıoğlu, Berke | |
dc.contributor.author | Özden, Furkan | |
dc.contributor.author | Kaynar, Gün | |
dc.contributor.author | Yılmaz, Mehmet Alper | |
dc.contributor.author | Alkan, Can | |
dc.contributor.author | Çiçek, A.Ercüment | |
dc.date.accessioned | 2025-02-22T16:11:20Z | |
dc.date.available | 2025-02-22T16:11:20Z | |
dc.date.issued | 2024-01-02 | |
dc.department | Department of Computer Engineering | |
dc.description.abstract | Copy number variants (CNV) are shown to contribute to the etiology of several genetic disorders. Accurate detection of CNVs on whole exome sequencing (WES) data has been a long sought-after goal for use in clinics. This was not possible despite recent improvements in performance because algorithms mostly suffer from low precision and even lower recall on expert-curated gold standard call sets. Here, we present a deep learning-based somatic and germline CNV caller for WES data, named ECOLE. Based on a variant of the transformer architecture, the model learns to call CNVs per exon, using high-confidence calls made on matched WGS samples. We further train and fine-tune the model with a small set of expert calls via transfer learning. We show that ECOLE achieves high performance on human expert labelled data for the first time with 68.7% precision and 49.6% recall. This corresponds to precision and recall improvements of 18.7% and 30.8% over the next best-performing methods, respectively. We also show that the same fine-tuning strategy using tumor samples enables ECOLE to detect RT-qPCR-validated variations in bladder cancer samples without the need for a control sample. ECOLE is available at https://github.com/ciceklab/ECOLE. | |
dc.description.provenance | Submitted by Muhammed Murat Uçar (murat.ucar@bilkent.edu.tr) on 2025-02-22T16:11:20Z No. of bitstreams: 1 ECOLE_Learning_to_call_copy_number_variants_on_whole_exome_sequencing_data.pdf: 2322162 bytes, checksum: b4b13ca9787b3340cf94e5a1ab7a1bbb (MD5) | en |
dc.description.provenance | Made available in DSpace on 2025-02-22T16:11:20Z (GMT). No. of bitstreams: 1 ECOLE_Learning_to_call_copy_number_variants_on_whole_exome_sequencing_data.pdf: 2322162 bytes, checksum: b4b13ca9787b3340cf94e5a1ab7a1bbb (MD5) Previous issue date: 2024-01-02 | en |
dc.identifier.doi | 10.1038/s41467-023-44116-y | |
dc.identifier.eissn | 2041-1723 | |
dc.identifier.uri | https://hdl.handle.net/11693/116652 | |
dc.language.iso | English | |
dc.publisher | NATURE PORTFOLIO | |
dc.relation.isversionof | https://dx.doi.org/10.1038/s41467-023-44116-y | |
dc.rights | CC BY 4.0 Deed (Attribution 4.0 International) | |
dc.rights.uri | https://creativecommons.org/licenses/by/4.0/ | |
dc.source.title | Nature Communications | |
dc.title | ECOLE: Learning to call copy number variants on whole exome sequencing data | |
dc.type | Article |
Files
Original bundle
- Name: ECOLE_Learning_to_call_copy_number_variants_on_whole_exome_sequencing_data.pdf
- Size: 2.21 MB
- Format: Adobe Portable Document Format
License bundle
- Name: license.txt
- Size: 1.71 KB
- Format: Item-specific license agreed upon to submission