• About
  • Policies
  • What is open access
  • Library
  • Contact
Advanced search
      View Item 
      •   BUIR Home
      • Scholarly Publications
      • Faculty of Engineering
      • Department of Computer Engineering
      • View Item
      •   BUIR Home
      • Scholarly Publications
      • Faculty of Engineering
      • Department of Computer Engineering
      • View Item
      JavaScript is disabled for your browser. Some features of this site may not work without it.

      Hercules: a profile HMM-based hybrid error correction algorithm for long reads

      Thumbnail
      View / Download
      1.1 Mb
      Author(s)
      Fırtına, Can
      Bar-Joseph, Z.
      Alkan, Can
      Çicek, A. Ercüment
      Date
      2018
      Source Title
      Nucleic Acids Research
      Print ISSN
      0305-1048
      Electronic ISSN
      1362-4962
      Publisher
      Oxford University Press
      Volume
      46
      Issue
      21
      Language
      English
      Type
      Article
      Item Usage Stats
      119
      views
      88
      downloads
      Abstract
      Choosing whether to use second or third generation sequencing platforms can lead to trade-offs between accuracy and read length. Several types of studies require long and accurate reads. In such cases researchers often combine both technologies and the erroneous long reads are corrected using the short reads. Current approaches rely on various graph or alignment based techniques and do not take the error profile of the underlying technology into account. Efficient machine learning algorithms that address these shortcomings have the potential to achieve more accurate integration of these two technologies.We proposeHercules, the first machine learning-based long read error correction algorithm. Hercules models every long read as a profile Hidden Markov Model with respect to the underlying platform’s error profile. The algorithm learns a posterior transition/emission probability distribution for each long read to correct errors in these reads. We show on two DNA-seq BAC clones (CH17-157L1 and CH17- 227A2) that Hercules-corrected reads have the highest mapping rate among all competing algorithms and have the highest accuracy when the breadth of coverage is high. On a large human CHM1 cell line WGS data set, Hercules is one of the few scalable algorithms; and among those, it achieves the highest accuracy.
      Permalink
      http://hdl.handle.net/11693/50559
      Published Version (Please cite this version)
      https://doi.org/10.1093/nar/gky724
      Collections
      • Department of Computer Engineering 1435
      Show full item record

      Browse

      All of BUIRCommunities & CollectionsTitlesAuthorsAdvisorsBy Issue DateKeywordsTypeDepartmentsThis CollectionTitlesAuthorsAdvisorsBy Issue DateKeywordsTypeDepartments

      My Account

      LoginRegister

      Statistics

      View Usage StatisticsView Google Analytics Statistics

      Bilkent University

      If you have trouble accessing this page and need to request an alternate format, contact the site administrator. Phone: (312) 290 1771
      © Bilkent University - Library IT

      Contact Us | Send Feedback | Off-Campus Access | Admin | Privacy