Subsequence-based feature map for protein function classification

dc.citation.epage: 130 [en_US]
dc.citation.issueNumber: 2 [en_US]
dc.citation.spage: 122 [en_US]
dc.citation.volumeNumber: 32 [en_US]
dc.contributor.author: Sarac, O. S. [en_US]
dc.contributor.author: Gürsoy-Yüzügüllü, O. [en_US]
dc.contributor.author: Cetin Atalay, R. [en_US]
dc.contributor.author: Atalay, V. [en_US]
dc.date.accessioned: 2016-02-08T10:09:40Z
dc.date.available: 2016-02-08T10:09:40Z
dc.date.issued: 2008 [en_US]
dc.department: Department of Molecular Biology and Genetics [en_US]
dc.description.abstract: Automated classification of proteins is indispensable for further in vivo investigation of the excessive number of unknown sequences generated by large-scale molecular biology techniques. This study describes a discriminative system based on feature space mapping, called subsequence profile map (SPMap), for functional classification of protein sequences. SPMap takes into account the information coming from the subsequences of a protein. A group of protein sequences that belong to the same level of classification is decomposed into fixed-length subsequences, which are then clustered to obtain a representative feature space mapping. The mapping is defined as the distribution of the subsequences of a protein sequence over these clusters. The resulting feature space representation is used to train discriminative classifiers for functional families. The aim of this approach is to incorporate information coming from important subregions that are conserved over a family of proteins while avoiding the difficult task of explicit motif identification. The performance of the method was assessed through tests on various protein classification tasks. Our results showed that SPMap is capable of high-accuracy classification in most of these tasks. Furthermore, SPMap is fast and scalable enough to handle large datasets. © 2007 Elsevier Ltd. All rights reserved. [en_US]
dc.description.provenance: Made available in DSpace on 2016-02-08T10:09:40Z (GMT). No. of bitstreams: 1 bilkent-research-paper.pdf: 70227 bytes, checksum: 26e812c6f5156f83f0e77b261a471b5a (MD5) Previous issue date: 2008 [en]
dc.identifier.doi: 10.1016/j.compbiolchem.2007.11.004 [en_US]
dc.identifier.issn: 1476-9271
dc.identifier.uri: http://hdl.handle.net/11693/23154
dc.language.iso: English [en_US]
dc.publisher: Elsevier [en_US]
dc.relation.isversionof: http://dx.doi.org/10.1016/j.compbiolchem.2007.11.004 [en_US]
dc.source.title: Computational Biology and Chemistry [en_US]
dc.subject: Function classification [en_US]
dc.subject: Protein function prediction [en_US]
dc.subject: Subsequence distribution [en_US]
dc.subject: Classification [en_US]
dc.subject: Data structures [en_US]
dc.subject: Identification (control systems) [en_US]
dc.subject: Molecular biology [en_US]
dc.subject: Scalability [en_US]
dc.subject: Datasets [en_US]
dc.subject: Proteins [en_US]
dc.subject: Enzyme [en_US]
dc.subject: G protein coupled receptor [en_US]
dc.title: Subsequence-based feature map for protein function classification [en_US]
dc.type: Article [en_US]
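
The feature mapping outlined in the abstract (decompose sequences into fixed-length subsequences, cluster them, then describe each protein by the distribution of its subsequences over the clusters) can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not say how SPMap clusters subsequences or measures their similarity, so the greedy Hamming-distance clustering below is a stand-in, and every function name, parameter, and toy sequence is hypothetical.

```python
from collections import Counter


def subsequences(seq, length):
    """Return every overlapping fixed-length window of seq."""
    return [seq[i:i + length] for i in range(len(seq) - length + 1)]


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def cluster_subsequences(subseqs, max_dist):
    """Greedy leader clustering (an assumed stand-in, not the paper's method).

    A subsequence starts a new cluster when it is farther than max_dist
    from every existing cluster representative.
    """
    centers = []
    for s in subseqs:
        if all(hamming(s, c) > max_dist for c in centers):
            centers.append(s)
    return centers


def feature_vector(seq, centers, length):
    """Fraction of the sequence's windows assigned to each cluster."""
    windows = subsequences(seq, length)
    if not centers or not windows:
        return [0.0] * len(centers)
    # Assign each window to its nearest cluster representative.
    counts = Counter(
        min(range(len(centers)), key=lambda j: hamming(w, centers[j]))
        for w in windows
    )
    return [counts[j] / len(windows) for j in range(len(centers))]


# Toy sequences standing in for one functional family (made up for illustration).
family = ["MKVLAAGIV", "MKVLSAGIV", "MRVLAAGLV"]
length = 5
pool = [w for seq in family for w in subsequences(seq, length)]
centers = cluster_subsequences(pool, max_dist=1)
vec = feature_vector("MKVLAAGIV", centers, length)
```

Vectors such as `vec`, computed for sequences inside and outside a family, would then serve as training input for a discriminative classifier. The choice of classifier is left open here because this record does not name one.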

Files

Original bundle

Name: Subsequence-based feature map for protein function classification.pdf
Size: 302.73 KB
Format: Adobe Portable Document Format
Description: Full printable version