Shifted Hamming distance: a fast and accurate SIMD-friendly filter to accelerate alignment verification in read mapping
dc.citation.epage | 1560 | en_US |
dc.citation.issueNumber | 10 | en_US |
dc.citation.spage | 1553 | en_US |
dc.citation.volumeNumber | 31 | en_US |
dc.contributor.author | Xin, H. | en_US |
dc.contributor.author | Greth, J. | en_US |
dc.contributor.author | Emmons, J. | en_US |
dc.contributor.author | Pekhimenko, G. | en_US |
dc.contributor.author | Kingsford, C. | en_US |
dc.contributor.author | Alkan, C. | en_US |
dc.contributor.author | Mutlu, O. | en_US |
dc.date.accessioned | 2016-02-08T10:43:14Z | |
dc.date.available | 2016-02-08T10:43:14Z | |
dc.date.issued | 2015 | en_US |
dc.department | Department of Computer Engineering | en_US |
dc.description.abstract | Motivation: Calculating the edit-distance (i.e. minimum number of insertions, deletions and substitutions) between short DNA sequences is the primary task performed by seed-and-extend based mappers, which compare billions of sequences. In practice, only sequence pairs with a small edit-distance provide useful scientific data. However, the majority of sequence pairs analyzed by seed-and-extend based mappers differ by significantly more errors than what is typically allowed. Such error-abundant sequence pairs needlessly waste resources and severely hinder the performance of read mappers. Therefore, it is crucial to develop a fast and accurate filter that can rapidly and efficiently detect error-abundant string pairs and remove them from consideration before more computationally expensive methods are used. Results: We present a simple and efficient algorithm, Shifted Hamming Distance (SHD), which accelerates the alignment verification procedure in read mapping, by quickly filtering out error-abundant sequence pairs using bit-parallel and SIMD-parallel operations. SHD only filters string pairs that contain more errors than a user-defined threshold, making it fully comprehensive. It also maintains high accuracy with moderate error threshold (up to 5% of the string length) while achieving a 3-fold speedup over the best previous algorithm (Gene Myers's bit-vector algorithm). SHD is compatible with all mappers that perform sequence alignment for verification. | en_US |
dc.identifier.doi | 10.1093/bioinformatics/btu856 | en_US |
dc.identifier.issn | 1367-4803 | |
dc.identifier.uri | http://hdl.handle.net/11693/25347 | |
dc.language.iso | English | en_US |
dc.publisher | Oxford University Press | en_US |
dc.relation.isversionof | http://dx.doi.org/10.1093/bioinformatics/btu856 | en_US |
dc.source.title | Bioinformatics | en_US |
dc.title | Shifted Hamming distance: a fast and accurate SIMD-friendly filter to accelerate alignment verification in read mapping | en_US |
dc.type | Article | en_US |
Files
Original bundle
- Name: Shifted Hamming distance A fast and accurate SIMD-friendly filter to accelerate alignment verification in read mapping.pdf
- Size: 1.91 MB
- Format: Adobe Portable Document Format
- Description: Full Printable Version