GateKeeper-GPU: accelerated pre-alignment filtering in short read mapping

Limited Access
This item is unavailable until:
2021-03-03
Date
2020-08
Editor(s)
Advisor
Alkan, Can
Supervisor
Co-Advisor
Co-Supervisor
Instructor
Source Title
Print ISSN
Electronic ISSN
Publisher
Bilkent University
Volume
Issue
Pages
Language
English
Journal Title
Journal ISSN
Volume Title
Series
Abstract

Recent advances in high throughput sequencing (HTS) facilitate fast production of short DNA fragments (reads) in numerous amounts. Although the production is becoming inexpensive everyday, processing the present data for sequence alignment as a whole procedure is still computationally expensive. As the last step of alignment, the candidate locations of short reads on the reference genome are verified in accordance with their difference from the corresponding reference segment with the least possible error. In this sense, comparison of reads and reference segments requires approximate string matching techniques which traditionally inherit dynamic programming algorithms. Performing dynamic programming for each of the read and reference segment pair makes alignment, a computationally-costly stage for mapping process. So, accelerating this stage is expected to improve alignment performance in terms execution time. Here, we propose, GateKeeper-GPU, a fast pre-alignment filter to be performed before verification to get rid of the sequence pairs, which exceed a predefined error threshold, for reducing the computational load on the dynamic programming. We choose GateKeeper as the filtration algorithm, we improve and implement it on a GPGPU platform with CUDA framework to obtain benefit from performing compute-intensive work with highly parallel and independent millions of threads for boosting performance. GateKeeper-GPU can accelerate verification stage by up to 2.9× and provide up to 1.4× speedup for overall read alignment procedure when integrated with mrFAST, while producing up to 52× less number of false accept pairs than original GateKeeper work.

Course
Other identifiers
Book Title
Citation
Published Version (Please cite this version)