Whole genome alignment via Alternating Lyndon Factorization Tree traversal

buir.advisor Alkan, Can
dc.contributor.author Aydın, Mahmud Sami
dc.date.accessioned 2023-07-26T08:01:33Z
dc.date.available 2023-07-26T08:01:33Z
dc.date.copyright 2023-07
dc.date.issued 2023-07
dc.date.submitted 2023-07
dc.department Department of Computer Engineering
dc.description Cataloged from PDF version of article.
dc.description Thesis (Master's): Bilkent University, Department of Computer Engineering, İhsan Doğramacı Bilkent University, 2023.
dc.description Includes bibliographical references (leaves 67-74).
dc.description.abstract The Whole Genome Alignment Problem (WGA) is an important challenge in the field of genomics, especially in the context of pangenome construction. Here we propose a novel indexing structure called the Alternating Lyndon Factorization Tree (ALFTree), which incorporates both spatial and lexicographical information within its nodes. The ALFTree is a powerful tool for WGA, as it can efficiently store and retrieve information about large DNA sequences. We present an algorithm, namely Idoneous, specifically designed to construct the ALFTree from a given DNA sequence. The algorithm works by generating intervals of specific sizes, identifying matches within these intervals, and performing a sanity check through alignment procedures. The algorithm is efficient and scalable, making it a valuable tool for WGA. Some of the key features of the ALFTree are 1) compact and efficient data structure for storing large DNA sequences; 2) efficient retrieval of information about specific regions of a DNA sequence; 3) ability to handle both spatial and lexicographical information; and 4) scalability to large DNA sequences. Our experimental results on different genomes highlight the effects of parameter selections on coverage and identity. Idoneous demonstrates competitive performance in terms of coverage and provides flexibility in adjusting sensitivity and specificity for different alignment scenarios. The ALFTree has the potential to significantly improve the performance of WGA algorithms. We believe that the ALFTree is a valuable contribution to the field of genomics, and we hope that it will be used by researchers to accelerate the pace of discovery.
dc.description.degree M.S.
dc.description.statementofresponsibility by Mahmud Sami Aydın
dc.embargo.release 2024-01-18
dc.format.extent xvi, 75 leaves : charts ; 30 cm.
dc.identifier.itemid B162254
dc.identifier.uri https://hdl.handle.net/11693/112437
dc.language.iso English
dc.publisher Bilkent University
dc.rights info:eu-repo/semantics/openAccess
dc.subject Whole genome alignment
dc.subject Indexing
dc.subject Lyndon factorization
dc.title Whole genome alignment via Alternating Lyndon Factorization Tree traversal
dc.title.alternative Almaşık Lyndon Faktörizasyon Ağacında gezinerek tüm genom hizalama
dc.type Thesis

Files

Original bundle
Name: B162254.pdf
Size: 1.39 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission