Survival prediction via partial ordering in feature space and sample space

Date
2016-03
Advisor
Taştan, Öznur
Publisher
Bilkent University
Language
English
Type
Thesis
Abstract

Predicting the survival of a cancer patient is critical for choosing patient-specific treatment strategies and is traditionally based on clinical or pathological factors such as patient age and tumor stage. In this thesis, we present two methodologies to build effective and interpretable survival models that utilize high-dimensional molecular profiles made available through next-generation sequencing technologies. Firstly, we present a method that focuses on partial ordering in the feature space. Existing models rely on the individual molecular quantities recorded in tumors; however, cancer is a complex disease where molecular mechanisms are dysregulated in various ways. This study, based on a system-level perspective, incorporates the partial ordering of molecules (POF) in lieu of individual quantities. This strategy not only unveils predictive features with direct relevance to the biological mechanism but also yields better performance in survival prediction compared to multivariate L1-penalized Cox proportional hazards and Random Survival Forest models. Testing the partial order representation of features in the subgroup identification task, we find that these features yield patient groups that are more quantifiably distinct in terms of their survival distributions. Secondly, we develop a survival prediction method based on ranking and support vector machines: Ranking Survival Vector Machines (RsurVM). RsurVM obtains a pairwise ranking of the patient survival times by learning to rank. It focuses on optimizing the most commonly used metric, the concordance index, and can handle censored data without making any assumptions. Our extensive tests on ovarian adenocarcinoma patient molecular data demonstrate that RsurVM achieves better survival predictions, regardless of the input molecular data (mRNA, protein, miRNA, copy number variation and DNA methylation), than the two most commonly used methods: the Cox proportional hazards model and Random Survival Forest.
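The two ideas named in the abstract can be illustrated with a small sketch. It is not the thesis implementation. The function names, the `"a>b"` feature encoding and the toy values are assumptions made for illustration; the thesis may define its partial order features and its handling of ties differently. The first function turns one patient's molecular profile into binary pairwise order features, and the second computes the concordance index on right-censored data in the usual Harrell form.

```python
from itertools import combinations


def pairwise_order_features(profile):
    """Encode a profile {molecule: value} as binary pairwise order features.

    Feature "a>b" is 1 if molecule a is measured higher than molecule b,
    else 0. This encoding is a hypothetical reading of "partial ordering
    of molecules", not the thesis's confirmed definition.
    """
    names = sorted(profile)
    return {
        f"{a}>{b}": int(profile[a] > profile[b])
        for a, b in combinations(names, 2)
    }


def concordance_index(times, events, risk_scores):
    """Harrell-style concordance index for right-censored survival data.

    times       : observed follow-up times
    events      : 1 if the event (death) was observed, 0 if censored
    risk_scores : higher score means shorter predicted survival
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable only when the patient with the strictly
            # shorter time had an observed event; pairs with tied times
            # are skipped in this simplified sketch.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data, invented for illustration only.
features = pairwise_order_features({"A": 2.0, "B": 5.0, "C": 1.0})
c_index = concordance_index(
    times=[5, 10, 15, 20],
    events=[1, 0, 1, 1],
    risk_scores=[4, 3, 2, 1],
)
```

A pair whose earlier time is censored is excluded from the comparison, because the true order of the two survival times is unknown. This is how a ranking-based objective can use censored patients without a distributional assumption about them.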

Keywords
Survival estimation, Pairwise ranking, Partial ordering, Biologically interpretable features