      SeGraM: A universal hardware accelerator for genomic sequence-to-graph and sequence-to-sequence mapping

      Author(s)
      Cali, D.Ş.
      Kanellopoulos, K.
      Lindegger, J.
      Bingöl, Zülal
      Kalsi, G.S.
      Zuo, Z.
      Fırtına, Can
      Cavlak, M.B.
      Kim, J.
      Ghiasi, N.M.
      Singh, G.
      Gómez-Luna, J.
      Almadhoun Alserr, N.
      Alser, M.
      Subramoney, S.
      Alkan, Can
      Ghose, S.
      Mutlu, O.
      Date
      2022-06-11
      Source Title
      ISCA '22: Proceedings of the 49th Annual International Symposium on Computer Architecture
      Electronic ISSN
      1557-735X
      Publisher
      Association for Computing Machinery
      Pages
      638 - 655
      Language
      English
      Type
      Conference Paper
      Abstract
      A critical step of genome sequence analysis is the mapping of sequenced DNA fragments (i.e., reads) collected from an individual to a known linear reference genome sequence (i.e., sequence-to-sequence mapping). Recent works replace the linear reference sequence with a graph-based representation of the reference genome, which captures the genetic variations and diversity across many individuals in a population. Mapping reads to the graph-based reference genome (i.e., sequence-to-graph mapping) results in notable quality improvements in genome analysis. Unfortunately, while sequence-to-sequence mapping is well studied with many available tools and accelerators, sequence-to-graph mapping is a more difficult computational problem, with a much smaller number of practical software tools currently available. We analyze two state-of-the-art sequence-to-graph mapping tools and reveal four key issues. We find that there is a pressing need to have a specialized, high-performance, scalable, and low-cost algorithm/hardware co-design that alleviates bottlenecks in both the seeding and alignment steps of sequence-to-graph mapping. Since sequence-to-sequence mapping can be treated as a special case of sequence-to-graph mapping, we aim to design an accelerator that is efficient for both linear and graph-based read mapping. To this end, we propose SeGraM, a universal algorithm/hardware co-designed genomic mapping accelerator that can effectively and efficiently support both <u>se</u>quence-to-<u>gra</u>ph <u>m</u>apping and sequence-to-sequence mapping, for both short and long reads. To our knowledge, SeGraM is the first algorithm/hardware co-design for accelerating sequence-to-graph mapping. 
SeGraM consists of two main components: (1) MinSeed, the first <u>min</u>imizer-based <u>seed</u>ing accelerator, which finds the candidate locations in a given genome graph; and (2) BitAlign, the first <u>bit</u>vector-based sequence-to-graph <u>align</u>ment accelerator, which performs alignment between a given read and the subgraph identified by MinSeed. We couple SeGraM with high-bandwidth memory to exploit low latency and highly-parallel memory access, which alleviates the memory bottleneck. We demonstrate that SeGraM provides significant improvements for multiple steps of the sequence-to-graph (i.e., S2G) and sequence-to-sequence (i.e., S2S) mapping pipelines. First, SeGraM outperforms state-of-the-art S2G mapping tools by 5.9×/3.9× and 106×/742× for long and short reads, respectively, while reducing power consumption by 4.1×/4.4× and 3.0×/3.2×. Second, BitAlign outperforms a state-of-the-art S2G alignment tool by 41×-539× and three S2S alignment accelerators by 1.2×-4.8×. We conclude that SeGraM is a high-performance and low-cost universal genomics mapping accelerator that efficiently supports both sequence-to-graph and sequence-to-sequence mapping pipelines.
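For readers unfamiliar with the two ideas named in the abstract, the following is a minimal software sketch of minimizer-based seeding and bitvector-based approximate matching. It is not SeGraM's MinSeed or BitAlign: it assumes a linear reference (the sequence-to-sequence special case), lexicographic ordering of k-mers, and the textbook Wu-Manber (Bitap) recurrence. All function names and parameters (`minimizers`, `build_index`, `seed_hits`, `approx_match_ends`, `k`, `w`, `max_errors`) are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def minimizers(seq, k, w):
    """Pick the smallest k-mer in every window of w consecutive k-mers.

    Assumption: "smallest" means lexicographically smallest, leftmost on ties.
    Returns a list of (position, k-mer) pairs without consecutive duplicates.
    """
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    picked = []
    for start in range(len(kmers) - w + 1):
        # min() returns the first minimal element, i.e. the leftmost one.
        pos = min(range(start, start + w), key=lambda j: kmers[j])
        if not picked or picked[-1][0] != pos:
            picked.append((pos, kmers[pos]))
    return picked


def build_index(reference, k, w):
    """Map each reference minimizer to the positions where it occurs."""
    index = defaultdict(list)
    for pos, kmer in minimizers(reference, k, w):
        index[kmer].append(pos)
    return index


def seed_hits(read, index, k, w):
    """Candidate reference start offsets implied by shared minimizers."""
    hits = set()
    for read_pos, kmer in minimizers(read, k, w):
        for ref_pos in index.get(kmer, []):
            hits.add(ref_pos - read_pos)
    return sorted(hits)


def approx_match_ends(pattern, text, max_errors):
    """Wu-Manber bitvector matching under edit distance.

    Bit j of state[d] is set when pattern[0..j] matches some suffix of the
    text read so far with at most d edits. Returns the text indices where a
    full match of the pattern ends.
    """
    m = len(pattern)
    full = (1 << m) - 1
    masks = defaultdict(int)
    for j, ch in enumerate(pattern):
        masks[ch] |= 1 << j
    # With d allowed edits, the first d pattern characters may be deleted.
    state = [(1 << d) - 1 for d in range(max_errors + 1)]
    ends = []
    for i, ch in enumerate(text):
        char_mask = masks.get(ch, 0)
        prev_lower = state[0]  # state[d-1] before this step's update
        state[0] = ((state[0] << 1) | 1) & char_mask
        for d in range(1, max_errors + 1):
            saved = state[d]
            match = ((state[d] << 1) | 1) & char_mask
            insertion = prev_lower                      # extra text character
            subst_or_del = (prev_lower | state[d - 1]) << 1
            state[d] = (match | insertion | subst_or_del | 1) & full
            prev_lower = saved
        if state[max_errors] & (1 << (m - 1)):
            ends.append(i)
    return ends
```

In this sketch, `seed_hits("GTAC", build_index("ACGTACGT", 3, 2), 3, 2)` yields the single candidate offset 2, and `approx_match_ends("ACGT", "TTACCTGG", 1)` reports a match ending at index 5 (one substitution), while allowing zero errors reports none. A graph-based mapper would additionally have to follow multiple predecessor nodes at each step, which this linear sketch does not attempt.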
      Keywords
      Genomics
      Genome analysis
      Genome graphs
      Read mapping
      Algorithm/hardware co-design
      Hardware accelerator
      Read alignment
      Seeding
      Minimizer
      Bitvector
      Permalink
      http://hdl.handle.net/11693/111783
      Published Version (Please cite this version)
      https://doi.org/10.1145/3470496.3527436
      Collections
      • Department of Computer Engineering 1561