Using Bloom filters to quickly and efficiently characterize genomic repeats and segmental duplications
| buir.advisor | Alkan, Can | |
| dc.contributor.author | Zambaku, Klea | |
| dc.date.accessioned | 2025-09-02T11:39:09Z | |
| dc.date.available | 2025-09-02T11:39:09Z | |
| dc.date.copyright | 2025-08 | |
| dc.date.issued | 2025-08 | |
| dc.date.submitted | 2025-08-29 | |
| dc.description | Cataloged from PDF version of article. | |
| dc.description | Includes bibliographical references (leaves 35-43). | |
| dc.description.abstract | Advances in sequencing technologies are expected to further reduce the occurrence of sequencing-related misassemblies. Nevertheless, errors caused by repetitive sequences and duplications remain a persistent challenge and are likely to continue impacting genome assemblies. This highlights the need for fast and efficient algorithms specifically designed to address repeat-induced errors. In this study, we present KonuSeg, a versatile k-mer counting tool that leverages Bloom filters and assigns copy numbers to genomic regions in a segment-based manner across the genome. KonuSeg employs a non-mapping-based approach that is computationally efficient and readily integrable into assembly graph frameworks, providing improved scalability and memory performance. We demonstrate its effectiveness through comprehensive analyses on data from multiple species under various configurations and evaluate its performance in combination with a widely used scaffolding algorithm to showcase its potential for enhancing assembly quality. | |
| dc.description.statementofresponsibility | by Klea Zambaku | |
| dc.embargo.release | 2026-02-28 | |
| dc.format.extent | xii, 44 leaves : color illustrations, charts ; 30 cm. | |
| dc.identifier.itemid | B163191 | |
| dc.identifier.uri | https://hdl.handle.net/11693/117472 | |
| dc.language.iso | English | |
| dc.rights | info:eu-repo/semantics/openAccess | |
| dc.subject | Whole genome | |
| dc.subject | Assembly | |
| dc.subject | Bloom filter | |
| dc.subject | Counting Bloom filter | |
| dc.subject | Misassembly | |
| dc.subject | k-mer | |
| dc.subject | Repeats | |
| dc.title | Using Bloom filters to quickly and efficiently characterize genomic repeats and segmental duplications | |
| dc.title.alternative | Bloom filtresi kullanarak hızlı ve verimli şekilde genomik tekrar ve segmental duplikasyonların karakterizasyonu | |
| dc.type | Thesis | |
| thesis.degree.discipline | Computer Engineering | |
| thesis.degree.grantor | Bilkent University | |
| thesis.degree.level | Master's | |
| thesis.degree.name | MS (Master of Science) |
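
The abstract describes counting k-mers with Bloom filters and assigning copy numbers per segment. The record does not show how KonuSeg implements this, so the following is only a minimal illustrative sketch of the general idea, not the thesis's method. Everything in it is an assumption made for illustration: the `CountingBloomFilter` class, the `kmers` and `segment_median_counts` helpers, the filter size and number of hashes, the use of the median as the per-segment summary, and counting k-mers from the sequence itself rather than from reads. It also ignores reverse complements.

```python
import hashlib
from statistics import median


class CountingBloomFilter:
    """Toy counting Bloom filter (hypothetical; not KonuSeg's data structure)."""

    def __init__(self, size=1 << 16, num_hashes=3):
        self.size = size
        self.num_hashes = num_hashes
        self.counts = [0] * size

    def _positions(self, item):
        # Derive num_hashes slot indices by salting BLAKE2b with the hash index.
        for seed in range(self.num_hashes):
            digest = hashlib.blake2b(
                item.encode(), digest_size=8, salt=seed.to_bytes(8, "little")
            ).digest()
            yield int.from_bytes(digest, "little") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.counts[pos] += 1

    def count(self, item):
        # The minimum over all slots is an upper bound on the true count:
        # hash collisions can inflate it but never reduce it.
        return min(self.counts[pos] for pos in self._positions(item))


def kmers(seq, k):
    """Yield every overlapping k-mer of seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def segment_median_counts(seq, k, segment_len, cbf):
    """Return (segment_start, median k-mer count) for fixed-length segments."""
    result = []
    for start in range(0, len(seq), segment_len):
        segment = seq[start:start + segment_len]
        counts = [cbf.count(km) for km in kmers(segment, k)]
        if counts:  # skip trailing segments shorter than k
            result.append((start, median(counts)))
    return result


# Toy sequence: two unique 16-bp blocks, each followed by the same 16-bp repeat.
unique1 = "TTGACCGATAGGCTTA"
repeat = "ACGTACGGTCAATGCC"
unique2 = "GGATCTCAGTTCGAAT"
genome = unique1 + repeat + unique2 + repeat

cbf = CountingBloomFilter()
for km in kmers(genome, 8):
    cbf.add(km)

estimates = segment_median_counts(genome, k=8, segment_len=16, cbf=cbf)
```

In this toy input, the segments covering the repeated block should get a median count of about 2 and the unique segments about 1, barring hash collisions. A real tool would additionally need to normalize counts by sequencing depth and handle both strands.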