Browsing by Subject "information processing"
Item Open Access
Div-blast: Diversification of sequence search results (Public Library of Science, 2014)
Eser, E.; Can, T.; Ferhatosmanoglu, H.
Sequence similarity tools, such as BLAST, seek sequences most similar to a query from a database of sequences. They return results that are significantly similar to the query sequence and typically highly similar to each other. Most sequence analysis tasks in bioinformatics require an exploratory approach, where the initial results guide the user to new searches. However, diversity has not yet been considered an integral component of sequence search tools for this discipline. Some redundancy can be avoided by introducing non-redundancy during database construction, but it is not feasible to dynamically set a level of non-redundancy tailored to a query sequence. We introduce the problem of diverse search and browsing in sequence databases that produce non-redundant results optimized for any given query. We define diversity measures for sequences and propose methods to obtain diverse results extracted from current sequence similarity search tools. We also propose a new measure to evaluate the diversity of a set of sequences that is returned as a result of a sequence similarity query. We evaluate the effectiveness of the proposed methods in post-processing BLAST and PSI-BLAST results. We also assess the functional diversity of the returned results based on available Gene Ontology annotations. Additionally, we include a comparison with a current redundancy elimination tool, CD-HIT. Our experiments show that the proposed methods are able to achieve more diverse yet significant result sets compared to static non-redundancy approaches. In both sequence-based and functional diversity evaluation, the proposed diversification methods significantly outperform original BLAST results and other baselines.
A web-based tool implementing the proposed methods, Div-BLAST, can be accessed at cedar.cs.bilkent.edu.tr/Div-BLAST. © 2014 Eser et al.

Item Open Access
Human visual cortical responses to specular and matte motion flows (Frontiers Media S.A., 2015)
Kam, T.-E.; Mannion, D. J.; Lee, S.-W.; Doerschner, K.; Kersten, D. J.
Determining the compositional properties of surfaces in the environment is an important visual capacity. One such property is specular reflectance, which encompasses the range from matte to shiny surfaces. Visual estimation of specular reflectance can be informed by characteristic motion profiles; a surface with a specular reflectance that is difficult to determine while static can be confidently disambiguated when set in motion. Here, we used fMRI to trace the sensitivity of human visual cortex to such motion cues, both with and without photometric cues to specular reflectance. Participants viewed rotating blob-like objects that were rendered as images (photometric) or dots (kinematic) with either matte-consistent or shiny-consistent specular reflectance profiles. We were unable to identify any areas in low and mid-level human visual cortex that responded preferentially to surface specular reflectance from motion. However, univariate and multivariate analyses identified several visual areas (V1, V2, V3, V3A/B, and hMT+) capable of differentiating shiny from matte surface flows. These results indicate that the machinery for extracting kinematic cues is present in human visual cortex, but the areas involved in integrating such information with the photometric cues necessary for surface specular reflectance remain unclear. © 2015 Kam, Mannion, Lee, Doerschner and Kersten.

Item Open Access
A privacy-preserving solution for compressed storage and selective retrieval of genomic data (Cold Spring Harbor Laboratory Press, 2016)
Huang, Z.; Ayday, E.; Lin, H.; Aiyar, R. S.; Molyneaux, A.; Xu, Z.; Fellay, J.; Steinmetz, L. M.; Hubaux, Jean-Pierre
In clinical genomics, the continuous evolution of bioinformatic algorithms and sequencing platforms makes it beneficial to store patients' complete aligned genomic data in addition to variant calls relative to a reference sequence. Due to the large size of human genome sequence data files (varying from 30 GB to 200 GB depending on coverage), two major challenges facing genomics laboratories are the costs of storage and the efficiency of the initial data processing. In addition, privacy of genomic data is becoming an increasingly serious concern, yet no standard data storage solutions exist that enable compression, encryption, and selective retrieval. Here we present a privacy-preserving solution named SECRAM (Selective retrieval on Encrypted and Compressed Reference-oriented Alignment Map) for the secure storage of compressed aligned genomic data. Our solution enables selective retrieval of encrypted data and improves the efficiency of downstream analysis (e.g., variant calling). Compared with BAM, the de facto standard for storing aligned genomic data, SECRAM uses 18% less storage. Compared with CRAM, one of the most compressed nonencrypted formats (using 34% less storage than BAM), SECRAM maintains efficient compression and downstream data processing, while allowing for unprecedented levels of security in genomic data storage. Compared with previous work, the distinguishing features of SECRAM are that (1) it is position-based instead of read-based, and (2) it allows random querying of a subregion from a BAM-like file in an encrypted form. Our method thus offers a space-saving, privacy-preserving, and effective solution for the storage of clinical genomic data.