Browsing by Author "Atalay, V."
Item Open Access
Bi-k-bi clustering: mining large scale gene expression data using two-level biclustering
(Inderscience Enterprises Ltd., 2010) Çarkacioǧlu, L.; Atalay, R.; Konu, O.; Atalay, V.; Can, T.
Due to the increase in gene expression data sets in recent years, various data mining techniques have been proposed for mining gene expression profiles. However, most of these methods target single gene expression data sets and cannot handle all the available gene expression data in public databases in a reasonable amount of time and space. In this paper, we propose a novel framework, bi-k-bi clustering, for finding association rules of gene pairs that can easily operate on large scale and multiple heterogeneous data sets. We applied our proposed framework to the available NCBI GEO Homo sapiens data sets. Our results show consistency and relatedness with the available literature and also provide novel associations. Copyright © 2010 Inderscience Enterprises Ltd.

Item Open Access
HandVR: a hand-gesture-based interface to a video retrieval system
(Springer U K, 2015) Genç, S.; Baştan, M.; Güdükbay, Uğur; Atalay, V.; Ulusoy, Özgür
Using one’s hands in human–computer interaction increases both the effectiveness of computer usage and the speed of interaction. One way of accomplishing this goal is to utilize computer vision techniques to develop hand-gesture-based interfaces. A video database system is one application where a hand-gesture-based interface is useful, because it provides a way to specify certain queries more easily. We present a hand-gesture-based interface for a video database system to specify motion and spatiotemporal object queries. We use a regular, low-cost camera to monitor the movements and configurations of the user’s hands and translate them to video queries. We conducted a user study to compare our gesture-based interface with a mouse-based interface on various types of video queries.
The users evaluated the two interfaces in terms of different usability parameters, including the ease of learning, ease of use, ease of remembering (memory), naturalness, comfortable use, satisfaction, and enjoyment. The user study showed that querying video databases is a promising application area for hand-gesture-based interfaces, especially for queries involving motion and spatiotemporal relations.

Item Open Access
Identification of Novel Reference Genes Based on MeSH Categories
(PLoS ONE, 2014) Ersahin, T.; Carkacioglu, L.; Can, T.; Konu, O.; Atalay, V.; Cetin Atalay, R.
Transcriptome experiments are performed to assess protein abundance through mRNA expression analysis. Expression levels of genes vary depending on the experimental conditions and the cell response. Transcriptome data must be diverse and yet comparable in reference to stably expressed genes, even if they are generated from different experiments in the same biological context from various laboratories. In this study, expression patterns of 9090 microarray samples grouped into 381 NCBI-GEO datasets were investigated to identify novel candidate reference genes using randomizations and Receiver Operating Characteristic (ROC) curves. The analysis demonstrated that cell type specific reference gene sets display less variability than a united set for all tissues. Therefore, constitutively and stably expressed, origin specific novel reference gene sets were identified based on their coefficient of variation and percentage of occurrence in all GEO datasets, which were classified using Medical Subject Headings (MeSH). A large number of MeSH grouped reference gene lists are presented as novel tissue specific reference gene lists. The most commonly observed 17 genes in these sets were compared for their expression in 8 hepatocellular, 5 breast and 3 colon carcinoma cells by RT-qPCR to verify tissue specificity.
Indeed, commonly used housekeeping genes GAPDH, Actin and EEF2 had tissue specific variations, whereas several ribosomal genes were among the most stably expressed genes in vitro. Our results confirm that two or more reference genes should be used in combination for differential expression analysis of large-scale data obtained from microarray or next generation sequencing studies. Therefore, context dependent reference gene sets, as presented in this study, are required for normalization of expression data from diverse technological backgrounds. © 2014 Ersahin et al.

Item Open Access
Implicit motif distribution based hybrid computational kernel for sequence classification
(Oxford University Press, 2005) Atalay, V.; Cetin Atalay, R.
Motivation: We designed a general computational kernel for classification problems that require specific motif extraction and search from sequences. Instead of searching for explicit motifs, our approach finds the distribution of implicit motifs and uses it as a feature for classification. The implicit motif distribution approach may be used as a modus operandi for bioinformatics problems that require specific motif extraction and search, which is otherwise computationally prohibitive. Results: A system named P2SL that infers protein subcellular targeting was developed through this computational kernel. The targeting signal was modeled by the distribution of subsequence occurrences (implicit motifs) using self-organizing maps. The boundaries among the classes were then determined with a set of support vector machines. The P2SL hybrid computational system achieved a prediction accuracy rate of ∼81% over ER targeted, cytosolic, mitochondrial and nuclear protein localization classes. P2SL additionally offers the distribution potential of proteins among localization classes, which is particularly important for proteins that shuttle between nucleus and cytosol. © The Author 2004. Published by Oxford University Press.
All rights reserved.

Item Open Access
A novel model-based method for feature extraction from protein sequences for classification
(IEEE, 2006) Saraç, Ö. S.; Atalay, V.; Çetin-Atalay, Rengül
Representation of amino-acid sequences constitutes the key point in classification of proteins into functional or structural classes. The representation should contain the biologically meaningful information hidden in the primary sequence of the protein. Conserved or similar subsequences are strong indicators of functional and structural similarity. In this study we present a feature mapping that takes into account the models of the subsequences of protein sequences. An expectation-maximization algorithm along with an HMM mixture model is used to cluster and learn the models of subsequences of a given set of proteins.

Item Open Access
Prediction of protein subcellular localization based on primary sequence data
(Springer-Verlag Berlin, 2003) Özarar, M.; Atalay, V.; Atalay, R. Ç.
This paper describes a system called prediction of protein subcellular localization (P2SL) that predicts the subcellular localization of proteins in eukaryotic organisms based on the amino acid content of primary sequences using amino acid order. Our approach for prediction is to find the most frequent motifs for each protein (class) based on clustering and then to use these most frequent motifs as features for classification. This approach allows a classification independent of the length of the sequence. Another important property of the approach is to provide a means to perform reverse analysis and analysis to extract rules. In addition to these and more importantly, we describe the use of a new encoding scheme for the amino acids that conserves biological function based on the point accepted mutation (PAM) substitution matrix. We present preliminary results of our system on a two class (dichotomy) classifier. However, it can be extended to multiple classes with some modifications.
© Springer-Verlag Berlin Heidelberg 2003.

Item Open Access
Prediction of protein subcellular localization based on primary sequence data
(IEEE, 2004) Özarar, M.; Atalay, V.; Çetin-Atalay, Rengül
Subcellular localization is crucial for determining the functions of proteins. A system called prediction of protein subcellular localization (P2SL) is designed that predicts the subcellular localization of proteins in eukaryotic organisms based on the amino acid content of primary sequences using amino acid order. The approach for prediction is to find the most frequent motifs for each protein in a given class based on clustering via self organizing maps and then to use these most frequent motifs as features for classification with the help of multilayer perceptrons. This approach allows a classification independent of the length of the sequence. In addition, the use of a new encoding scheme is described for the amino acids that conserves biological function based on the point accepted mutation (PAM) substitution matrix. The statistical test results of the system are presented on a four class problem. P2SL achieves slightly higher prediction accuracy than similar studies.

Item Open Access
Short time series microarray data analysis and biological annotation
(IEEE, 2008) Sökmen, Z.; Atalay, V.; Çetin-Atalay, Rengül
The significant gene list that results from microarray data analysis should be explained in terms of biological functions. The aim of this study is to extract the biologically related gene clusters from short time series microarray gene data by applying unsupervised methods and to automatically perform biological annotation of those clusters. In the first step of the study, short time series microarray expression data is clustered according to similar expression profiles.
After that, several biological data sources are integrated to get information related to the genes in one of those clusters, and new sub-clusters are created by using this unified information. As a last step, biological annotation of gene sub-clusters is performed by using information related to those sub-clusters.

Item Open Access
A signal transduction score flow algorithm for cyclic cellular pathway analysis, which combines transcriptome and ChIP-seq data
(Royal Society of Chemistry, 2012) Isik, Z.; Ersahin, T.; Atalay, V.; Aykanat, Cevdet; Cetin Atalay, R.
Determination of cell signalling behaviour is crucial for understanding the physiological response to a specific stimulus or drug treatment. Current approaches for large-scale data analysis do not effectively incorporate critical topological information provided by the signalling network. We herein describe a novel model- and data-driven hybrid approach, or signal transduction score flow algorithm, which allows quantitative visualization of cyclic cell signalling pathways that lead to ultimate cell responses such as survival, migration or death. This score flow algorithm translates signalling pathways as a directed graph and maps experimental data, including negative and positive feedbacks, onto gene nodes as scores, which then computationally traverse the signalling pathway until a pre-defined biological target response is attained. Initially, experimental data-driven enrichment scores of the genes were computed in a pathway, then a heuristic approach was applied using the gene score partition as a solution for protein node stoichiometry during dynamic scoring of the pathway of interest. Incorporation of a score partition during the signal flow and cyclic feedback loops in the signalling pathway significantly improves the usefulness of this model, as compared to other approaches.
Evaluation of the score flow algorithm using both transcriptome and ChIP-seq data-generated signalling pathways showed good correlation with expected cellular behaviour on both KEGG and manually generated pathways. Implementation of the algorithm as a Cytoscape plug-in allows interactive visualization and analysis of KEGG pathways as well as user-generated and curated Cytoscape pathways. Moreover, the algorithm accurately predicts gene-level and global impacts of single or multiple in silico gene knockouts. This journal is © The Royal Society of Chemistry 2012.

Item Open Access
Subband domain coding of binary textual images for document archiving
(Institute of Electrical and Electronics Engineers, 1999-10) Gerek, Ö. N.; Çetin, A. Enis; Tewfik, A. H.; Atalay, V.
In this work, a subband domain textual image compression method is developed. The document image is first decomposed into subimages using binary subband decompositions. Next, the character locations in the subbands and the symbol library consisting of the character images are encoded. The method is suitable for keyword search in the compressed data. It is observed that very high compression ratios are obtained with this method. Simulation studies are presented.

Item Open Access
Subsequence-based feature map for protein function classification
(Elsevier, 2008) Sarac, O. S.; Gürsoy-Yüzügüllü, O.; Cetin Atalay, R.; Atalay, V.
Automated classification of proteins is indispensable for further in vivo investigation of the excessive number of unknown sequences generated by large scale molecular biology techniques. This study describes a discriminative system based on feature space mapping, called subsequence profile map (SPMap), for functional classification of protein sequences. SPMap takes into account the information coming from the subsequences of a protein.
A group of protein sequences that belong to the same level of classification is decomposed into fixed-length subsequences, which are clustered to obtain a representative feature space mapping. Mapping is defined as the distribution of the subsequences of a protein sequence over these clusters. The resulting feature space representation is used to train discriminative classifiers for functional families. The aim of this approach is to incorporate information coming from important subregions that are conserved over a family of proteins while avoiding the difficult task of explicit motif identification. The performance of the method was assessed through tests on various protein classification tasks. Our results showed that SPMap is capable of high accuracy classification in most of these tasks. Furthermore, SPMap is fast and scalable enough to handle large datasets. © 2007 Elsevier Ltd. All rights reserved.

Item Open Access
Vision-based continuous Graffiti™-like text entry system
(SPIE, 2004) Erdem, İ. A.; Erdem, M. E.; Atalay, V.; Çetin, A. Enis
It is now possible to design real-time, low-cost computer vision systems even in personal computers due to the recent advances in electronics and the computer industry. For this reason, it is feasible to develop computer-vision-based human-computer interaction systems. A vision-based continuous Graffiti™-like text entry system is presented. The user sketches characters in a Graffiti™-like alphabet in a continuous manner on a flat surface using a laser pointer. The beam of the laser pointer is tracked on the image sequences captured by a camera, and the corresponding written word is recognized from the extracted trace of the laser beam. © 2004 Society of Photo-Optical Instrumentation Engineers.

Item Open Access
Vision-based single-stroke character recognition for wearable computing
(IEEE, 2001) Özer, Ö. F.; Özün, O.; Tüzel, C. Ö.; Atalay, V.; Çetin, A. Enis
Particularly when compared to traditional tools such as a keyboard or mouse, wearable computing data entry tools offer increased mobility and flexibility. Such tools include touch screens, hand gesture and facial expression recognition, speech recognition, and key systems. We describe a new approach for recognizing characters drawn by hand gestures or by a pointer on a user's forearm captured by a digital camera. We draw each character as a single, isolated stroke using a Graffiti-like alphabet. Our algorithm enables effective and quick character recognition. The resulting character recognition system has potential for application in mobile communication and computing devices such as phones, laptop computers, handheld computers and personal data assistants.