Paralog-specific gene copy number discovery within segmental duplications
Abstract
With advances in genome sequencing and analysis technology, it has become evident that structural variations are the main source of alteration in the human genome. Despite their significance in understanding disease susceptibility, there is as yet no algorithm that finds all types and sizes of structural variations at once. Structural variation discovery remains problematic because structural variations often overlap with segmental duplications, nearly identical segments of DNA that appear more than once in the genome. Researchers often exclude these regions, which make up 5% of the genome, because of the complexity they bring to their studies. Only a few work in these regions, and their methods require a special sequence alignment file in which reads are mapped to multiple locations. Here, we present ParaCoND, which discovers paralog-specific gene copy number within segmental duplications using a sequence alignment file with unique mapping. We utilize singly unique nucleotides (SUNs), which distinguish paralogs from each other in the sequence alignment of the duplicated regions. Our method is based on read depth and is limited to detecting duplications and deletions. We computed the absolute copy numbers of genes using only the read depth of SUNs. Furthermore, we computed the paralog-specific absolute copy numbers for genes residing in the same segmental duplication.
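To make the read-depth idea concrete, the following is a minimal sketch of how a paralog-specific copy number could be estimated from depth at SUN positions. It is not ParaCoND's actual implementation, which the abstract does not describe in detail. The function name `paralog_copy_number`, the use of a median, the diploid baseline depth, and all gene names and depth values are assumptions made for illustration only.

```python
from statistics import median


def paralog_copy_number(sun_depths, baseline_depth, baseline_ploidy=2):
    """Estimate the absolute copy number of one paralog.

    sun_depths      -- read depths observed at that paralog's SUN positions
    baseline_depth  -- typical depth over regions assumed to be diploid
                       (an assumption of this sketch, not from the source)
    baseline_ploidy -- copy number assumed for the baseline regions
    """
    if not sun_depths:
        raise ValueError("no SUN positions available for this paralog")
    if baseline_depth <= 0:
        raise ValueError("baseline depth must be positive")
    # The median is used here to damp outlier positions; this is a choice
    # of the sketch, not a documented step of the method.
    ratio = median(sun_depths) / baseline_depth
    return round(baseline_ploidy * ratio)


# Hypothetical depths at SUN positions for three paralogs that share
# one segmental duplication (illustrative values, not real data).
sun_depths = {
    "GENE_A": [29, 31, 30, 28, 32],
    "GENE_B": [44, 46, 45, 47, 43],
    "GENE_C": [14, 16, 15, 15, 13],
}
baseline = 30.0  # hypothetical depth over diploid, non-duplicated sequence

copy_numbers = {
    gene: paralog_copy_number(depths, baseline)
    for gene, depths in sun_depths.items()
}
print(copy_numbers)
```

Because reads covering a SUN can be assigned to a single paralog, depth at those positions reflects that paralog alone. In this sketch, a depth near the baseline suggests two copies, a depth near half the baseline suggests a deletion, and a depth near 1.5 times the baseline suggests a duplication.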