Genome scaffolding using pooled clone sequencing
Embargo Lift Date: 2016-12-29
DNA sequencing technologies hold great promise for generating information that will help scientists learn more about how the genome affects human health, organismal evolution, and genetic relationships between individuals of the same species. Generating raw genome sequence data is becoming cheaper and faster, but also more error-prone. Assembling such data into high-quality, finished genome sequences remains challenging. Many genome assembly tools are available, but they differ in their performance and in their final output. More importantly, it remains largely unclear how best to assess the quality of assembled genome sequences. In this thesis, we evaluated the accuracy of several genome scaffolding algorithms using two different types of data generated from the genome of the same human individual: i) whole genome shotgun sequencing (WGS), and ii) pooled clone sequencing (PCS). We observed that it is possible to obtain fewer scaffolds with a longer total assembled length when PCS data is used than when only WGS data is used. However, current scaffolding algorithms were developed only for WGS data, and designing PCS-aware scaffolding algorithms remains an open problem.