Massively parallel mapping of next generation sequence reads using GPU

Korkmaz, Mustafa

buir.advisor	Aykanat, Cevdet
dc.contributor.author	Korkmaz, Mustafa
dc.date.accessioned	2016-01-08T18:25:03Z
dc.date.available	2016-01-08T18:25:03Z
dc.date.issued	2012
dc.description	Cataloged from PDF version of article.	en_US
dc.description	Includes bibliographical references.	en_US
dc.description.abstract	High-throughput sequencing (HTS) methods have already started to fundamentally revolutionize the area of genome research through low-cost and high-throughput genome sequencing. However, the sheer size of the data imposes various computational challenges. For example, in the Illumina HiSeq2000, each run produces over 7-8 billion short reads and over 600 Gb of base pairs of sequence data within less than 10 days. For most applications, analysis of HTS data starts with read mapping, i.e., finding the locations of these short sequence reads in a reference genome assembly. The similarities between two sequences can be determined by computing their optimal global alignments using a dynamic programming method called the Needleman-Wunsch algorithm. The Needleman-Wunsch algorithm is widely used in hash-based DNA read mapping algorithms because of its guaranteed sensitivity. However, the quadratic time complexity of this algorithm makes it highly time-consuming and the main bottleneck in analysis. In addition to this drawback, the short length of reads (~100 base pairs) and the large size of mammalian genomes (3.1 Gbp for human) worsen the situation by requiring several hundreds to tens of thousands of Needleman-Wunsch calculations per read. The fastest approach proposed so far avoids Needleman-Wunsch and maps the data described above in 70 CPU days with lower sensitivity. More sensitive mapping approaches are even slower. We propose that efficient parallel implementations of string comparison will dramatically improve the running time of this process. With this motivation, we propose to develop enhanced algorithms to exploit the parallel architecture of GPUs.	en_US
dc.description.statementofresponsibility	Korkmaz, Mustafa	en_US
dc.format.extent	xi, 52 leaves	en_US
dc.identifier.itemid	B134551
dc.identifier.uri	http://hdl.handle.net/11693/15818
dc.language.iso	English	en_US
dc.rights	info:eu-repo/semantics/openAccess	en_US
dc.subject	Semi-global alignment	en_US
dc.subject	Needleman-Wunsch	en_US
dc.subject	CUDA	en_US
dc.subject.lcc	QH447 .K67 2012	en_US
dc.subject.lcsh	Human gene mapping--Data processing.	en_US
dc.subject.lcsh	Genomics.	en_US
dc.subject.lcsh	Gene mapping.	en_US
dc.subject.lcsh	Sequence analysis.	en_US
dc.subject.lcsh	Parallel programming (Computer science)	en_US
dc.title	Massively parallel mapping of next generation sequence reads using GPU	en_US
dc.type	Thesis	en_US
thesis.degree.discipline	Computer Engineering
thesis.degree.grantor	Bilkent University
thesis.degree.level	Master's
thesis.degree.name	MS (Master of Science)

Files

Original bundle

Name: 0006522.pdf
Size: 913.33 KB
Format: Adobe Portable Document Format

Collections

Graduate School of Engineering and Science