Browsing by Subject "Inversion"
Now showing 1 - 3 of 3
- Results Per Page
- Sort Options
Item Open Access Characterization of large structural variation using linked-reads(2018-08) Karaoğlanoğlu, FatihMany algorithms aimed at characterizing genomic structural variation (SV) have been developed since the inception of high-throughput sequencing. However, the full spectrum of SVs in the human genome is not yet assessed. Most of the existing methods focus on discovery and genotyping of deletions, insertions, and mobile elements. Detection of balanced SVs with no gain or loss of genomic segments (e.g. inversions) is particularly a challenging task. Long read sequencing has been leveraged to find short inversions but there is still a need to develop methods to detect large genomic inversions. Furthermore, currently there are no algorithms to predict the insertion locus of large interspersed segmental duplications. Here we propose novel algorithms to characterize large (>40Kbp) interspersed segmental duplications and (>80Kbp) inversions using Linked-Read sequencing data. Linked-Read sequencing provides long range information, where Illumina reads are tagged with barcodes that can be used to assign short reads to pools of larger (30-50 Kbp) molecules. Our methods rely on split molecule sequence signature that we have previously described. Similar to the split read, split molecules refer to large segments of DNA that span an SV breakpoint. Therefore, when mapped to the reference genome, the mapping of these segments would be discontinuous. We redesign our earlier algorithm, VALOR, to specifically leverage Linked-Read sequencing data to discover large inversions and characterize interspersed segmental duplications. We implement our new algorithms in a new software package, called VALOR2.Item Open Access Discovery of large genomic inversions using long range information(BioMed Central Ltd., 2017) Rasekh, M. E.; Chiatante, G.; Miroballo, M.; Tang, J.; Ventura M.; Amemiya, C. T.; Eichler, E. E.; Antonacci, F.; Alkan C.Although many algorithms are now available that aim to characterize different classes of structural variation, discovery of balanced rearrangements such as inversions remains an open problem. This is mainly due to the fact that breakpoints of such events typically lie within segmental duplications or common repeats, which reduces the mappability of short reads. The algorithms developed within the 1000 Genomes Project to identify inversions are limited to relatively short inversions, and there are currently no available algorithms to discover large inversions using high throughput sequencing technologies. Results: Here we propose a novel algorithm, Valor, to discover large inversions using new sequencing methods that provide long range information such as 10X Genomics linked-read sequencing, pooled clone sequencing, or other similar technologies that we commonly refer to as long range sequencing. We demonstrate the utility of Valor using both pooled clone sequencing and 10X Genomics linked-read sequencing generated from the genome of an individual from the HapMap project (NA12878). We also provide a comprehensive comparison of Valor against several state-of-the-art structural variation discovery algorithms that use whole genome shotgun sequencing data. Conclusions: In this paper, we show that Valor is able to accurately discover all previously identified and experimentally validated large inversions in the same genome with a low false discovery rate. Using Valor, we also predicted a novel inversion, which we validated using fluorescent in situ hybridization. Valor is available at https://github.com/BilkentCompGen/Valor. © 2017 The Author(s).Item Open Access Large structural variation discovery using long reads with several degrees of error(2020-12) Ebren, EzgiGenomic structural variations (SVs) are briefly defined as large-scale alterations of DNA content, copy, and organization. Although significant progress has been made since the introduction of high throughput sequencing (HTS) in character-izing SVs, accurate detection of complex SVs and balanced rearrangements still remains elusive due to the sequence complexity at the breakpoints. Until very recently, the difficulty of read mapping in such regions when the reads were short and the high error rates of long read platforms kept the problem challenging. However, with the introduction of the Pacific Biosciences’ High Fidelity (HiFi) sequencing methodology, powerful SV detection and breakpoint resolution be-came possible as a result of its capability to produce highly accurate (> 99%) long reads (10 − 20 kbps). Here, we introduce DALEK, a novel algorithm that aims to use long-read tech-nologies to discover large structural variations with high break-point resolution. DALEK uses split read and read depth signatures from long read data to dis-cover large (≥ 10 kbps) deletions, inversions and segmental duplications. We also develop methods to detect large SVs in existing high-error Oxford Nanopore Technologies data.