SeGraM: A universal hardware accelerator for genomic sequence-to-graph and sequence-to-sequence mapping
buir.contributor.author | Bingöl, Zülal | |
buir.contributor.author | Fırtına, Can | |
buir.contributor.author | Alkan, Can | |
buir.contributor.orcid | Bingöl, Zülal|0000-0002-2828-9665 | |
buir.contributor.orcid | Fırtına, Can|0000-0002-6548-7863 | |
buir.contributor.orcid | Alkan, Can|0000-0002-5443-0706 | |
dc.citation.epage | 655 | en_US |
dc.citation.spage | 638 | en_US |
dc.contributor.author | Cali, D.Ş. | |
dc.contributor.author | Kanellopoulos, K. | |
dc.contributor.author | Lindegger, J. | |
dc.contributor.author | Bingöl, Zülal | |
dc.contributor.author | Kalsi, G.S. | |
dc.contributor.author | Zuo, Z. | |
dc.contributor.author | Fırtına, Can | |
dc.contributor.author | Cavlak, M.B. | |
dc.contributor.author | Kim, J. | |
dc.contributor.author | Ghiasi, N.M. | |
dc.contributor.author | Singh, G. | |
dc.contributor.author | Gómez-Luna, J. | |
dc.contributor.author | Almadhoun Alserr, N. | |
dc.contributor.author | Alser, M. | |
dc.contributor.author | Subramoney, S. | |
dc.contributor.author | Alkan, Can | |
dc.contributor.author | Ghose, S. | |
dc.contributor.author | Mutlu, O. | |
dc.date.accessioned | 2023-02-27T06:06:25Z | |
dc.date.available | 2023-02-27T06:06:25Z | |
dc.date.issued | 2022-06-11 | |
dc.department | Department of Computer Engineering | en_US |
dc.description.abstract | A critical step of genome sequence analysis is the mapping of sequenced DNA fragments (i.e., reads) collected from an individual to a known linear reference genome sequence (i.e., sequence-to-sequence mapping). Recent works replace the linear reference sequence with a graph-based representation of the reference genome, which captures the genetic variations and diversity across many individuals in a population. Mapping reads to the graph-based reference genome (i.e., sequence-to-graph mapping) results in notable quality improvements in genome analysis. Unfortunately, while sequence-to-sequence mapping is well studied with many available tools and accelerators, sequence-to-graph mapping is a more difficult computational problem, with a much smaller number of practical software tools currently available. We analyze two state-of-the-art sequence-to-graph mapping tools and reveal four key issues. We find that there is a pressing need to have a specialized, high-performance, scalable, and low-cost algorithm/hardware co-design that alleviates bottlenecks in both the seeding and alignment steps of sequence-to-graph mapping. Since sequence-to-sequence mapping can be treated as a special case of sequence-to-graph mapping, we aim to design an accelerator that is efficient for both linear and graph-based read mapping. To this end, we propose SeGraM, a universal algorithm/hardware co-designed genomic mapping accelerator that can effectively and efficiently support both <u>se</u>quence-to-<u>gra</u>ph <u>m</u>apping and sequence-to-sequence mapping, for both short and long reads. To our knowledge, SeGraM is the first algorithm/hardware co-design for accelerating sequence-to-graph mapping. 
SeGraM consists of two main components: (1) MinSeed, the first <u>min</u>imizer-based <u>seed</u>ing accelerator, which finds the candidate locations in a given genome graph; and (2) BitAlign, the first <u>bit</u>vector-based sequence-to-graph <u>align</u>ment accelerator, which performs alignment between a given read and the subgraph identified by MinSeed. We couple SeGraM with high-bandwidth memory to exploit low-latency and highly parallel memory access, which alleviates the memory bottleneck. We demonstrate that SeGraM provides significant improvements for multiple steps of the sequence-to-graph (i.e., S2G) and sequence-to-sequence (i.e., S2S) mapping pipelines. First, SeGraM outperforms state-of-the-art S2G mapping tools by 5.9×/3.9× and 106×/742× for long and short reads, respectively, while reducing power consumption by 4.1×/4.4× and 3.0×/3.2×. Second, BitAlign outperforms a state-of-the-art S2G alignment tool by 41×-539× and three S2S alignment accelerators by 1.2×-4.8×. We conclude that SeGraM is a high-performance and low-cost universal genomics mapping accelerator that efficiently supports both sequence-to-graph and sequence-to-sequence mapping pipelines. | en_US |
dc.description.provenance | Submitted by Samet Emre (samet.emre@bilkent.edu.tr) on 2023-02-27T06:06:25Z No. of bitstreams: 1 SeGraM _a _universal _hardware _accelerator _for _genomic _sequence-to-graph _and _sequence-to-sequence _mapping.pdf: 8433280 bytes, checksum: d355126d9a3a768fca5601327506556c (MD5) | en |
dc.description.provenance | Made available in DSpace on 2023-02-27T06:06:25Z (GMT). No. of bitstreams: 1 SeGraM _a _universal _hardware _accelerator _for _genomic _sequence-to-graph _and _sequence-to-sequence _mapping.pdf: 8433280 bytes, checksum: d355126d9a3a768fca5601327506556c (MD5) Previous issue date: 2020-06-11 | en |
dc.identifier.doi | 10.1145/3470496.3527436 | en_US |
dc.identifier.eissn | 1557-735X | en_US |
dc.identifier.uri | http://hdl.handle.net/11693/111783 | en_US |
dc.language.iso | English | en_US |
dc.publisher | Association for Computing Machinery | en_US |
dc.relation.isversionof | https://doi.org/10.1145/3470496.3527436 | en_US |
dc.source.title | ISCA '22: Proceedings of the 49th Annual International Symposium on Computer Architecture | en_US |
dc.subject | Genomics | en_US |
dc.subject | Genome analysis | en_US |
dc.subject | Genome graphs | en_US |
dc.subject | Read mapping | en_US |
dc.subject | Algorithm/hardware co-design | en_US |
dc.subject | Hardware accelerator | en_US |
dc.subject | Read alignment | en_US |
dc.subject | Seeding | en_US |
dc.subject | Minimizer | en_US |
dc.subject | Bitvector | en_US |
dc.title | SeGraM: A universal hardware accelerator for genomic sequence-to-graph and sequence-to-sequence mapping | en_US |
dc.type | Conference Paper | en_US |
Files
Original bundle
- Name: SeGraM _a _universal _hardware _accelerator _for _genomic _sequence-to-graph _and _sequence-to-sequence _mapping.pdf
- Size: 8.04 MB
- Format: Adobe Portable Document Format
- Description: Article File
License bundle
- Name: license.txt
- Size: 1.69 KB
- Format: Item-specific license agreed upon to submission