dc.contributor.advisor: Çiçek, Abdullah Ercüment
dc.contributor.author: Deznabi, Iman
dc.date.accessioned: 2018-08-29T05:58:08Z
dc.date.available: 2018-08-29T05:58:08Z
dc.date.copyright: 2018-08
dc.date.issued: 2018-08
dc.date.submitted: 2018-08-15
dc.identifier.uri: http://hdl.handle.net/11693/47751
dc.description: Cataloged from PDF version of article. [en_US]
dc.description: Thesis (M.S.): Bilkent University, Department of Computer Engineering, İhsan Doğramacı Bilkent University, 2018. [en_US]
dc.description: Includes bibliographical references (leaves 35-42). [en_US]
dc.description.abstract: Protein kinases are a large family of enzymes that catalyze the phosphorylation of other proteins. By acting as molecular switches for protein activity, phosphorylation events regulate intracellular signal transduction and thereby play a central role in a broad range of cellular activities. Aberrant kinase function, on the other hand, is implicated in many diseases. Understanding normal and malfunctioning signaling in the cell entails identifying phosphorylation sites and characterizing their interactions with kinases. Recent advances in mass spectrometry enable rapid identification of phosphosites at the proteome level. Alternatively, many computational models predict phosphosites in a given input protein sequence. Once a phosphosite is identified, either experimentally or computationally, the next question is which kinase catalyzes the phosphorylation at that particular site. Although a subset of the available computational methods provides kinase-specific predictions for phosphorylation sites, these supervised methods need training data and can therefore provide predictions only for kinases for which a substantial number of phosphosites are already known. A particular problem that has not received any attention is the prediction of new sites for kinases with few or no a priori known sites. None of the current computational methods that rely on the classical supervised learning setting can predict additional sites for these kinases. We present DeepKinZero, the first zero-shot learning approach that can predict phosphosites for kinases with no known phosphosite information.
DeepKinZero takes a peptide sequence centered at the phosphorylation site and learns embeddings of these phosphosite sequences via a bi-directional recurrent neural network, whereas kinase embeddings are based on protein sequence vector representations and on the taxonomy of kinases according to their functional properties. Through a compatibility function that associates the representations of the site sequences and the kinases, DeepKinZero transfers knowledge from kinases with many known sites to kinases with no known sites. Our computational experiments show that DeepKinZero achieves a 30-fold increase in accuracy compared to baseline models. DeepKinZero complements existing approaches by expanding the knowledge of kinases through mapping the phosphorylation sites of understudied kinases with no prior information, which are increasingly investigated as novel drug targets. [en_US]
dc.description.statementofresponsibility: by Iman Deznabi. [en_US]
dc.format.extent: xiii, 42 leaves : charts (some color) ; 30 cm. [en_US]
dc.language.iso: English [en_US]
dc.rights: info:eu-repo/semantics/openAccess [en_US]
dc.subject: Kinase Substrate Classification [en_US]
dc.subject: Zero-Shot Learning [en_US]
dc.subject: Recurrent Neural Networks [en_US]
dc.subject: RNN [en_US]
dc.subject: LSTM [en_US]
dc.title: DeepKinZero: zero-shot learning for predicting kinase phosphorylation sites [en_US]
dc.title.alternative: DeepKinZero: kinaz fosforilasyon yerlerinin sıfır-örnek öğrenim ile tahmini [en_US]
dc.type: Thesis [en_US]
dc.department: Department of Computer Engineering [en_US]
dc.publisher: Bilkent University [en_US]
dc.description.degree: M.S. [en_US]
dc.identifier.itemid: B158771
dc.embargo.release: 2020-08-13

