Show simple item record

dc.contributor.author: Hach, F. [en_US]
dc.contributor.author: Sarrafi, I. [en_US]
dc.contributor.author: Hormozdiari, F. [en_US]
dc.contributor.author: Alkan, C. [en_US]
dc.contributor.author: Eichler, E. E. [en_US]
dc.contributor.author: Sahinalp, S. C. [en_US]
dc.date.accessioned: 2016-02-08T10:49:45Z
dc.date.available: 2016-02-08T10:49:45Z
dc.date.issued: 2014 [en_US]
dc.identifier.issn: 0305-1048
dc.identifier.uri: http://hdl.handle.net/11693/25736
dc.description.abstract: High-throughput sequencing (HTS) platforms generate unprecedented amounts of data that introduce challenges for processing and downstream analysis. While tools that report the 'best' mapping location of each read provide a fast way to process HTS data, they are not suitable for many types of downstream analysis such as structural variation detection, where it is important to report multiple mapping loci for each read. For this purpose we introduce mrsFAST-Ultra, a fast, cache-oblivious, SNP-aware aligner that can handle the multi-mapping of HTS reads very efficiently. mrsFAST-Ultra improves on mrsFAST, our first cache-oblivious read aligner capable of handling multi-mapping reads, through new and compact index structures that reduce not only the overall memory usage but also the number of CPU operations per alignment. In fact, the index generated by mrsFAST-Ultra is 10 times smaller than that of mrsFAST. Equally important, mrsFAST-Ultra introduces new features such as being able to (i) obtain the best mapping loci for each read, and (ii) return all reads that have at most n mapping loci (within an error threshold), together with these loci, for any user-specified n. Furthermore, mrsFAST-Ultra is SNP-aware, i.e. it can map reads to the reference genome while discounting the mismatches that occur at common SNP locations provided by dbSNP; this significantly increases the number of reads that can be mapped to the reference genome. Note that all of the above features are implemented within the index structure rather than as simple post-processing steps, and are therefore performed highly efficiently. Finally, mrsFAST-Ultra utilizes multiple available cores and processors and can be tuned for various memory settings. Our results show that mrsFAST-Ultra is roughly five times faster than its predecessor mrsFAST. In comparison to newly enhanced popular tools such as Bowtie2, it is more sensitive (it can report 10 times or more mappings per read) and much faster (six times or more) in the multi-mapping mode. Furthermore, mrsFAST-Ultra has an index size of 2 GB for the entire human reference genome, which is roughly half of that of Bowtie2. mrsFAST-Ultra is open source and it can be accessed at http://mrsfast.sourceforge.net. © 2014 The Author(s). [en_US]
dc.language.iso: English [en_US]
dc.source.title: Nucleic Acids Research [en_US]
dc.relation.isversionof: http://dx.doi.org/10.1093/nar/gku370 [en_US]
dc.subject: Algorithm [en_US]
dc.subject: Calculation [en_US]
dc.subject: Data analysis software [en_US]
dc.subject: Data extraction [en_US]
dc.subject: Gene location [en_US]
dc.subject: Gene locus [en_US]
dc.subject: Gene mapping [en_US]
dc.subject: Genetic database [en_US]
dc.subject: High throughput sequencing [en_US]
dc.subject: Human genome [en_US]
dc.subject: Priority journal [en_US]
dc.subject: Reference database [en_US]
dc.subject: Sensitivity analysis [en_US]
dc.subject: Single nucleotide polymorphism [en_US]
dc.subject: High-throughput nucleotide sequencing [en_US]
dc.subject: Polymorphism [en_US]
dc.subject: Sequence alignment [en_US]
dc.subject: Software [en_US]
dc.title: mrsFAST-Ultra: a compact, SNP-aware mapper for high performance sequencing applications [en_US]
dc.type: Article [en_US]
dc.department: Department of Computer Engineering
dc.citation.spage: W494 [en_US]
dc.citation.epage: W500 [en_US]
dc.citation.volumeNumber: 42 [en_US]
dc.citation.issueNumber: W1 [en_US]
dc.identifier.doi: 10.1093/nar/gku370 [en_US]
dc.publisher: Oxford University Press [en_US]
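
Illustrative note: the abstract describes two ideas that a small example can make concrete, namely discounting mismatches at known SNP positions and reporting a read only when it has at most n mapping loci within an error threshold. The Python sketch below is not the mrsFAST-Ultra implementation, which according to the abstract performs these steps inside a compact index. It is a brute-force scan over a toy reference, and every name in it (snp_aware_mismatches, map_read, snp_positions, max_errors, max_loci) is a hypothetical chosen for illustration.

```python
from typing import List, Optional, Set


def snp_aware_mismatches(read: str, ref: str, offset: int,
                         snp_positions: Set[int]) -> int:
    """Count mismatches between `read` and the reference window starting
    at `offset`, ignoring reference positions in `snp_positions`.

    `snp_positions` is a hypothetical stand-in for known SNP sites
    (0-based reference coordinates)."""
    window = ref[offset:offset + len(read)]
    return sum(
        1
        for i, (r, g) in enumerate(zip(read, window))
        if r != g and (offset + i) not in snp_positions
    )


def map_read(read: str, ref: str, snp_positions: Set[int],
             max_errors: int, max_loci: Optional[int] = None) -> List[int]:
    """Brute-force multi-mapping: return every locus where the read aligns
    with at most `max_errors` SNP-discounted mismatches.

    If `max_loci` is given and the read has more loci than that,
    return an empty list (the read is not reported)."""
    loci = [
        pos
        for pos in range(len(ref) - len(read) + 1)
        if snp_aware_mismatches(read, ref, pos, snp_positions) <= max_errors
    ]
    if max_loci is not None and len(loci) > max_loci:
        return []
    return loci


if __name__ == "__main__":
    reference = "ACGTACGTAC"  # toy reference, not real genomic data
    read = "ACGA"
    # Without SNP information, no exact match exists.
    print(map_read(read, reference, set(), max_errors=0))
    # Treating reference position 3 as a known SNP site discounts that mismatch.
    print(map_read(read, reference, {3}, max_errors=0))
    # Allowing one error reports both candidate loci.
    print(map_read(read, reference, set(), max_errors=1))
```

The scan costs time proportional to reference length times read length per read, which is exactly the cost an index-based aligner is designed to avoid; the sketch only shows what is being computed, not how to compute it efficiently.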
