Using Bloom filters to quickly and efficiently characterize genomic repeats and segmental duplications

buir.advisor: Alkan, Can
dc.contributor.author: Zambaku, Klea
dc.date.accessioned: 2025-09-02T11:39:09Z
dc.date.available: 2025-09-02T11:39:09Z
dc.date.copyright: 2025-08
dc.date.issued: 2025-08
dc.date.submitted: 2025-08-29
dc.description: Cataloged from PDF version of article.
dc.description: Includes bibliographical references (leaves 35-43).
dc.description.abstract: Advances in sequencing technologies are expected to further reduce the occurrence of sequencing-related misassemblies. Nevertheless, errors caused by repetitive sequences and duplications remain a persistent challenge and are likely to continue impacting genome assemblies. This highlights the need for fast and efficient algorithms specifically designed to address repeat-induced errors. In this study, we present KonuSeg, a versatile k-mer counting tool that leverages Bloom filters and assigns copy numbers to genomic regions in a segment-based manner across the genome. KonuSeg employs a non-mapping-based approach that is computationally efficient and readily integrable into assembly graph frameworks, providing improved scalability and memory performance. We demonstrate its effectiveness through comprehensive analyses on data from multiple species under various configurations and evaluate its performance in combination with a widely used scaffolding algorithm to showcase its potential for enhancing assembly quality.
dc.description.statementofresponsibility: by Klea Zambaku
dc.embargo.release: 2026-02-28
dc.format.extent: xii, 44 leaves : color illustrations, charts ; 30 cm.
dc.identifier.itemid: B163191
dc.identifier.uri: https://hdl.handle.net/11693/117472
dc.language.iso: English
dc.rights: info:eu-repo/semantics/openAccess
dc.subject: Whole genome
dc.subject: Assembly
dc.subject: Bloom filter
dc.subject: Counting Bloom filter
dc.subject: Misassembly
dc.subject: k-mer
dc.subject: Repeats
dc.title: Using Bloom filters to quickly and efficiently characterize genomic repeats and segmental duplications
dc.title.alternative: Bloom filtresi kullanarak hızlı ve verimli şekilde genomik tekrar ve segmental duplikasyonların karakterizasyonu
dc.type: Thesis
thesis.degree.discipline: Computer Engineering
thesis.degree.grantor: Bilkent University
thesis.degree.level: Master's
thesis.degree.name: MS (Master of Science)
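
Illustrative sketch (not taken from the thesis): the abstract describes counting k-mers with Bloom filters and assigning copy numbers per genomic segment without read mapping. The Python below shows one generic way such a scheme could look, using a counting Bloom filter and the median k-mer count per segment. The class and function names, the hashing scheme, the parameters, and the median rule are all assumptions made for illustration; they are not KonuSeg's actual implementation, and reverse-complement (canonical) k-mers are ignored for brevity.

```python
import hashlib
from statistics import median


class CountingBloomFilter:
    """Approximate counter: estimates may be too high, never too low."""

    def __init__(self, size=1 << 16, num_hashes=3):
        self.size = size
        self.num_hashes = num_hashes
        self.counts = [0] * size

    def _slots(self, item):
        # Derive num_hashes slot indices by salting a single BLAKE2b hash.
        for i in range(self.num_hashes):
            digest = hashlib.blake2b(
                item.encode(), digest_size=8, salt=i.to_bytes(8, "little")
            ).digest()
            yield int.from_bytes(digest, "little") % self.size

    def add(self, item):
        for slot in self._slots(item):
            self.counts[slot] += 1

    def count(self, item):
        # The minimum over an item's slots is its least-inflated estimate.
        return min(self.counts[slot] for slot in self._slots(item))


def kmers(seq, k):
    """Yield every overlapping substring of length k."""
    return (seq[i:i + k] for i in range(len(seq) - k + 1))


def segment_copy_numbers(assembly, reads, k=5, segment_len=10, coverage=1.0):
    """Hypothetical helper: estimate a copy number per fixed-length segment.

    Read k-mers are counted in the filter; each segment's copy number is
    the median count of its k-mers divided by the expected coverage.
    """
    cbf = CountingBloomFilter()
    for read in reads:
        for kmer in kmers(read, k):
            cbf.add(kmer)

    result = []
    for start in range(0, len(assembly) - k + 1, segment_len):
        # Extend by k - 1 so k-mers starting inside the segment are complete.
        segment = assembly[start:start + segment_len + k - 1]
        depth = median(cbf.count(km) for km in kmers(segment, k))
        result.append((start, depth / coverage))
    return result
```

Under these assumptions, a segment whose k-mers appear at roughly twice the expected depth would be reported with a copy number near 2. Hash collisions can only inflate the counts, so the filter size and number of hash functions trade memory against overestimation.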

Files

Original bundle

Name: B163191.pdf
Size: 6.08 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 2.1 KB
Format: Item-specific license agreed upon to submission