Browsing by Author "Acar, Aybar C."
Now showing 1 - 4 of 4
- Results Per Page
- Sort Options
Item Open Access An integrative framework for clinical diagnosis and knowledge discovery from exome sequencing data(2024-02) Shojaei, Mona; Mohammadvand, Navid; Dogan, Tunca; Alkan, Can; Atalay, Rengül Çetin; Acar, Aybar C.Non-silent single nucleotide genetic variants, like nonsense changes and insertion-deletion variants, that affect protein function and length substantially are prevalent and are frequently misclassified. The low sensitivity and specificity of existing variant effect predictors for nonsense and indel variations restrict their use in clinical applications. We propose the Pathogenic Mutation Prediction (PMPred) method to predict the pathogenicity of single nucleotide variations, which impair protein function by prematurely terminating a protein's elongation during its synthesis. The prediction starts by monitoring functional effects (Gene Ontology annotation changes) of the change in sequence, using an existing ensemble machine learning model (UniGOPred). This, in turn, reveals the mutations that significantly deviate functionally from the wild-type sequence. We have identified novel harmful mutations in patient data and present them as motivating case studies. We also show that our method has increased sensitivity and specificity compared to state-of-the-art, especially in single nucleotide variations that produce large functional changes in the final protein. As further validation, we have done a comparative docking study on such a variation that is misclassified by existing methods and, using the altered binding affinities, show how PMPred can correctly predict the pathogenicity when other tools miss it. PMPred is freely accessible as a web service at https://pmpred.kansil.org/, and the related code is available at https://github.com/kansil/PMPredItem Open Access Efficient discovery of join plans in schemaless data(ACM, 2009-09) Acar, Aybar C.; Motro, A.We describe a method of inferring join plans for a set of relation instances, in the absence of any metadata, such as attribute domains, attribute names, or constraints (e.g., keys or foreign keys). Our method enumerates the possible join plans in order of likelihood, based on the compatibility of a pair of columns and their suitability as join attributes (i.e. their appropriateness as keys). We outline two variants of the approach. The first variant is accurate but potentially time-consuming, especially for large relations that do not fit in memory. The second variant is an approximation of the former and hence less accurate, but is considerably more efficient, allowing the method to be used online, even for large relations. We provide experimental results showing how both forms scale in terms of performance as the number of candidate join attributes and the size of the relations increase. We also characterize the accuracy of the approximate variant with respect to the exact variant. Copyright ©2009 ACM.Item Open Access mESAdb: microRNA expression and sequence analysis database(Oxford University Press, 2011) Kaya, Koray D.; Karakülah, G.; Yakıcıer, Cengiz M.; Acar, Aybar C.; Konu, ÖzlenMicroRNA expression and sequence analysis database (http://konulab.fen. bilkent.edu.tr/mirna/) (mESAdb) is a regularly updated database for the multivariate analysis of sequences and expression of microRNAs from multiple taxa. mESAdb is modular and has a user interface implemented in PHP and JavaScript and coupled with statistical analysis and visualization packages written for the R language. The database primarily comprises mature microRNA sequences and their target data, along with selected human, mouse and zebrafish expression data sets. mESAdb analysis modules allow (i) mining of microRNA expression data sets for subsets of microRNAs selected manually or by motif; (ii) pair-wise multivariate analysis of expression data sets within and between taxa; and (iii) association of microRNA subsets with annotation databases, HUGE Navigator, KEGG and GO. The use of existing and customized R packages facilitates future addition of data sets and analysis tools. Furthermore, the ability to upload and analyze user-specified data sets makes mESAdb an interactive and expandable analysis tool for microRNA sequence and expression data.Item Open Access Segmenting and labeling query sequences in a multidatabase environment(Springer, Berlin, Heidelberg, 2011) Acar, Aybar C.; Motro, A.When gathering information from multiple independent data sources, users will generally pose a sequence of queries to each source, combine (union) or cross-reference (join) the results in order to obtain the information they need. Furthermore, when gathering information, there is a fair bit of trial and error involved, where queries are recursively refined according to the results of a previous query in the sequence. From the point of view of an outside observer, the aim of such a sequence of queries may not be immediately obvious. We investigate the problem of isolating and characterizing subsequences representing coherent information retrieval goals out of a sequence of queries sent by a user to different data sources over a period of time. The problem has two sub-problems: segmenting the sequence into subsequences, each representing a discrete goal; and labeling each query in these subsequences according to how they contribute to the goal. We propose a method in which a discriminative probabilistic model (a Conditional Random Field) is trained with pre-labeled sequences. We have tested the accuracy with which such a model can infer labels and segmentation on novel sequences. Results show that the approach is very accurate (> 95% accuracy) when there are no spurious queries in the sequence and moderately accurate even in the presence of substantial noise (∼70% accuracy when 15% of queries in the sequence are spurious). © 2011 Springer-Verlag.