Diverse sequence search and alignment

buir.advisorFerhatosmanoğlu, Hakan
dc.contributor.authorEser, Elif
dc.date.accessioned2016-01-08T18:25:37Z
dc.date.available2016-01-08T18:25:37Z
dc.date.issued2013
dc.departmentDepartment of Computer Engineeringen_US
dc.descriptionAnkara : The Department of Computer Engineering and the Graduate School of Engineering and Science of Bilkent University, 2013.en_US
dc.descriptionThesis (Master's) -- Bilkent University, 2013.en_US
dc.descriptionIncludes bibliographical references (leaves 51-54).en_US
dc.description.abstractSequence similarity tools, such as BLAST, seek sequences from a database most similar to a query. They return results significantly similar to the query sequence that are typically also highly similar to each other. Most sequence analysis tasks in bioinformatics require an exploratory approach where the initial results guide the user to new searches. However, diversity has not been considered as an integral component of sequence search tools yet. Repetitions in the result can be avoided by introducing non-redundancy during database construction; however, it is not feasible to dynamically set a level of non-redundancy tailored to a query sequence. We introduce the problem of diverse search and browsing in sequence databases that produces non-redundant results optimized for any given query. We define diversity measures for sequences, and propose methods to obtain diverse results extracted from current sequence similarity search tools. We propose a new measure to evaluate the diversity of a set of sequences that is returned as a result of a similarity query. We evaluate the effectiveness of the proposed methods in post-processing PSI-BLAST results. We also assess the functional diversity of the returned results based on available Gene Ontology annotations. Our experiments show that the proposed methods are able to achieve more diverse yet similar result sets compared to static non-redundancy approaches. In both sequence-based and functional diversity evaluation, the proposed diversification methods outperform original BLAST results significantly. We built an online diverse sequence search tool Div-BLAST that supports queries using BLAST web services. It re-ranks the results diversely according to given parameters.en_US
dc.description.degreeM.S.en_US
dc.description.provenanceMade available in DSpace on 2016-01-08T18:25:37Z (GMT). No. of bitstreams: 1 0006557.pdf: 807292 bytes, checksum: 2676f4c141cee156d61b7e841de2d44d (MD5)en
dc.description.statementofresponsibilityEser, Elifen_US
dc.format.extentx, 54 leaves, illustrations, graphicsen_US
dc.identifier.itemidB139349
dc.identifier.urihttp://hdl.handle.net/11693/15855
dc.language.isoEnglishen_US
dc.publisherBilkent Universityen_US
dc.rightsinfo:eu-repo/semantics/openAccessen_US
dc.subjectDiversity searchen_US
dc.subjectSequence alignmenten_US
dc.subjectData analysisen_US
dc.subject.lccQA76.9.D3 E74 2013en_US
dc.subject.lcshDatabase searching--Computer programs.en_US
dc.subject.lcshSequences (Mathematics)en_US
dc.subject.lcshData mining.en_US
dc.subject.lcshInformation retrieval.en_US
dc.subject.lcshSearch engines--Programming.en_US
dc.titleDiverse sequence search and alignmenten_US
dc.typeThesisen_US
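
Illustrative note: the abstract describes post-processing search hits so that the returned set stays similar to the query while avoiding near-duplicates. The record does not give the thesis's actual measures or algorithms, so the following is only a minimal sketch of one generic way to do such re-ranking, a greedy selection in the style of maximal marginal relevance. The function names `similarity` and `diverse_rerank`, the trade-off parameter `lam`, and the use of Python's `difflib` ratio as a stand-in for an alignment score are all assumptions made for illustration; they are not taken from the thesis or from Div-BLAST.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Stand-in similarity in [0, 1].

    Assumption: a real pipeline would use alignment scores (e.g. from BLAST);
    difflib's ratio is used here only to keep the sketch self-contained.
    """
    return SequenceMatcher(None, a, b).ratio()


def diverse_rerank(query: str, hits: list[str], k: int, lam: float = 0.5) -> list[str]:
    """Greedily pick up to k hits, trading query similarity against redundancy.

    lam = 1.0 ranks purely by similarity to the query; smaller values
    penalise hits that resemble ones already selected. (Hypothetical
    parameter, not one of the tool's documented options.)
    """
    remaining = list(hits)
    selected: list[str] = []
    while remaining and len(selected) < k:

        def score(seq: str) -> float:
            relevance = similarity(query, seq)
            # Redundancy is the closest match among hits chosen so far.
            redundancy = max((similarity(seq, s) for s in selected), default=0.0)
            return lam * relevance - (1.0 - lam) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


if __name__ == "__main__":
    query = "ACGTACGT"
    hits = ["ACGTACGT", "ACGTACGA", "TTTTTTTT"]
    print(diverse_rerank(query, hits, k=2, lam=1.0))  # similarity only
    print(diverse_rerank(query, hits, k=2, lam=0.3))  # favours diversity
```

With `lam` close to 1 the sketch reproduces an ordinary similarity ranking, while lower values push near-identical hits down the list; a static non-redundant database, by contrast, fixes that trade-off before any query is seen.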

Files

Original bundle
Name:
0006557.pdf
Size:
788.37 KB
Format:
Adobe Portable Document Format