A web tool to explore, annotate and classify the Acıbadem Breast Cancer Cohort RNA-seq data with gene signatures and clinical/mutation data, according to molecular subtypes
Abstract
Since the early 2000s, transcriptomics-based approaches have revealed the molecular heterogeneity and distinct gene expression patterns of breast cancer subtypes. This led to the use of molecular subtypes in the clinic and in translational research for prognostic assessment, prediction of therapeutic efficacy, and retrospective analysis of cohort studies. In this thesis, breast cancer subtypes in the Acıbadem Breast Cancer Cohort (ABCC) RNA-seq data were classified using immunohistochemistry (IHC), PAM50, and SCMOD1 as molecular subtype predictors. The results revealed moderate concordance among the methods across ABCC and five other selected public datasets. In addition, the classification of ABCC and TCGA-BRCA RNA-seq data was shown to depend strongly on the choice of gene signature. Further, a machine learning model trained on TCGA-BRCA RNA-seq data with PAM50 genes as predictors showed moderate results for ABCC and MATADOR, which was attributed to the imbalanced nature of the datasets; feature importance revealed a subset of PAM50 genes as predictors. Additionally, the R-Shiny-based classABCC app was developed to facilitate clustering of ABCC with six gene signatures, molecular subtyping of ABCC, and prediction of subtypes with the machine learning model trained on TCGA-BRCA RNA-seq data.