SPADIS: selecting predictive and diverse SNPs in GWAS

Yılmaz, Serhan

Files

serhan_yilmaz_thesis.pdf (9.3 MB)

Date

2018-05

Authors

Yılmaz, Serhan

Advisor

Çiçek, A. Ercüment

Abstract

Phenotypic heritability of complex traits and diseases is seldom explained by individual genetic variants identified in genome-wide association studies (GWAS). Many methods have been developed to select a subset of variant loci that are associated with or predictive of the phenotype. Selecting connected Single Nucleotide Polymorphisms (SNPs) on SNP-SNP networks has proven successful in finding biologically interpretable and predictive SNPs. However, we argue that the connectedness constraint favors selecting redundant features that affect similar biological processes and therefore does not necessarily yield better predictive performance. To this end, we propose a novel method called SPADIS that favors the selection of remotely located SNPs in order to account for their complementary effects in explaining a phenotype. SPADIS selects a diverse set of loci on a SNP-SNP network. This is achieved by maximizing a submodular set function with a greedy algorithm that ensures a constant factor (1 − 1/e) approximation to the optimal solution. We compare SPADIS to the state-of-the-art method SConES on a dataset of Arabidopsis thaliana with continuous flowering time phenotypes. SPADIS has better average phenotype prediction performance in 15 out of 17 phenotypes when the same number of SNPs is selected, and it provides consistent improvements across multiple networks and settings on average. Moreover, it identifies more candidate genes and runs faster. We also investigate, for the first time, the use of Hi-C data to construct SNP-SNP networks in the context of the SNP selection problem, which yields improvements in regression performance across all methods.
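
As a rough illustration of the greedy scheme the abstract refers to, the sketch below selects k items by repeatedly adding the one with the largest marginal gain. For a monotone, non-negative submodular objective under a cardinality constraint, this procedure is known to reach at least a (1 − 1/e) fraction of the optimal value. The objective used here is a toy coverage function chosen only because it is submodular; it is not the SPADIS objective, which this record does not state. All names (greedy_max, coverage, covers, the SNP labels) are hypothetical.

```python
from typing import Callable, FrozenSet, Iterable, List

# Toy data (assumption): each SNP "covers" a set of abstract region ids.
covers = {
    "snp_a": {1, 2, 3},
    "snp_b": {3, 4},
    "snp_c": {1, 2},
    "snp_d": {5},
}


def coverage(selected: FrozenSet[str]) -> int:
    """Number of distinct regions covered; monotone and submodular."""
    covered = set()
    for snp in selected:
        covered |= covers[snp]
    return len(covered)


def greedy_max(candidates: Iterable[str],
               f: Callable[[FrozenSet[str]], float],
               k: int) -> List[str]:
    """Pick up to k candidates, each time adding the largest marginal gain."""
    selected: List[str] = []
    remaining = set(candidates)
    for _ in range(min(k, len(remaining))):
        current = frozenset(selected)
        base = f(current)
        best, best_gain = None, float("-inf")
        # Sorted iteration makes tie-breaking deterministic.
        for c in sorted(remaining):
            gain = f(current | {c}) - base
            if gain > best_gain:
                best, best_gain = c, gain
        selected.append(best)
        remaining.remove(best)
    return selected


picked = greedy_max(covers, coverage, 2)
picked_value = coverage(frozenset(picked))
```

In this toy case the greedy choice first takes snp_a (three new regions) and then snp_b (one new region); snp_c adds nothing once snp_a is selected, which is the kind of redundancy a diversity-favoring objective is meant to penalize.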

Keywords

GWAS, SNP Selection, SNP-SNP Networks, Hi-C, Submodularity

Degree Discipline

Computer Engineering

Degree Level

Master's

Degree Name

MS (Master of Science)

Permalink

http://hdl.handle.net/11693/47721

Collections

Graduate School of Engineering and Science

Language

English

Type

Thesis
