
      A weakly supervised clustering method for cancer subgroup identification

      Author
      Özçelik, Duygu
      Advisor
      Okan, Öznur Taştan
      Date
      2016-08
      Publisher
      Bilkent University
      Language
      English
      Type
      Thesis
      Abstract
      Each cancer type is a heterogeneous disease consisting of subtypes, which may be distinguished at the molecular, histopathological, and clinical levels. Identifying the patient subtypes of a cancer type is critically important, as the unique molecular characteristics of a particular patient subgroup reveal distinct disease states and open up possibilities for targeted therapeutic regimens. Traditionally, unsupervised clustering techniques are applied to the genomic data of the tumor samples, and the resulting patient clusters are considered of interest if they can be associated with a clinical outcome variable such as patient survival. In lieu of this unsupervised framework, we propose a weakly supervised clustering framework, WS-RFClust, in which the clustering partitions are guided by the clinical outcome of interest. In WS-RFClust, a random forest is trained to classify the patients based on a categorical clinical variable of interest. We use the partitions of patients on the tree ensemble to construct a patient similarity matrix, which is then used as input to a clustering algorithm. WS-RFClust thus inherently clusters in the nonlinear subspace of the original features that is learned in the classification step. In this study, we demonstrate the effectiveness of WS-RFClust on hand-written digit datasets, where it captures salient structural similarities of digit pairs. Finally, we employ WS-RFClust to find breast cancer subtypes using mRNA, protein and microRNA expressions as features. Our results on the breast cancer subtype identification problem show that WS-RFClust could identify patient subgroups more effectively than the commonly used unsupervised clustering methods.
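The similarity-matrix step described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: it assumes the common random-forest proximity (the fraction of trees in which two patients land in the same leaf) and a simple average-linkage merge as the downstream clustering algorithm, neither of which the abstract specifies. The leaf table `leaves` is made-up toy data, and the function names are hypothetical; in practice the leaf indices would come from a trained forest (for example, scikit-learn's `RandomForestClassifier.apply`).

```python
from itertools import combinations


def proximity_matrix(leaves):
    """leaves[i][t] is the leaf that sample i reaches in tree t.

    Returns an n x n matrix whose (i, j) entry is the fraction of trees
    in which samples i and j fall into the same leaf.
    """
    n, n_trees = len(leaves), len(leaves[0])
    prox = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            shared = sum(1 for a, b in zip(leaves[i], leaves[j]) if a == b)
            prox[i][j] = shared / n_trees
    return prox


def average_linkage(prox, n_clusters):
    """Greedy agglomerative clustering on a similarity matrix.

    Repeatedly merges the two clusters with the highest mean pairwise
    similarity until n_clusters remain. Returns one label per sample.
    """
    clusters = [[i] for i in range(len(prox))]

    def mean_sim(a, b):
        return sum(prox[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        a, b = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: mean_sim(clusters[p[0]], clusters[p[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]  # a < b, so deleting b is safe
        del clusters[b]

    labels = [0] * len(prox)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


# Toy leaf assignments (assumed data): 6 samples, 4 trees.
leaves = [
    [1, 3, 2, 5],
    [1, 3, 2, 4],
    [1, 3, 6, 5],
    [2, 7, 8, 9],
    [2, 7, 8, 4],
    [2, 6, 8, 9],
]

prox = proximity_matrix(leaves)
labels = average_linkage(prox, n_clusters=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Because the forest is trained on the clinical label, samples share leaves only when they agree on the features the classifier found useful, which is one way to read the abstract's remark that clustering happens in a learned subspace.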
      Keywords
      Clustering
      Weakly supervised clustering
      Subspace clustering
      Cancer subtype identification
      Patient subgroup identification
      Permalink
      http://hdl.handle.net/11693/32162
      Collections
      • Dept. of Computer Engineering - Master's degree 511