SPADIS: An algorithm for selecting predictive and diverse SNPs in GWAS

buir.contributor.author: Yılmaz, Serhan
buir.contributor.author: Çiçek, A. Ercüment
buir.contributor.orcid: Çiçek, A. Ercüment|0000-0001-8613-6619
dc.citation.epage: 1216 [en_US]
dc.citation.issueNumber: 3 [en_US]
dc.citation.spage: 1208 [en_US]
dc.citation.volumeNumber: 18 [en_US]
dc.contributor.author: Yılmaz, Serhan
dc.contributor.author: Taştan, Ö.
dc.contributor.author: Çiçek, A. Ercüment
dc.date.accessioned: 2022-01-27T13:30:40Z
dc.date.available: 2022-01-27T13:30:40Z
dc.date.issued: 2021
dc.department: Department of Computer Engineering [en_US]
dc.description.abstract: Phenotypic heritability of complex traits and diseases is seldom explained by individual genetic variants identified in genome-wide association studies (GWAS). Many methods have been developed to select a subset of variant loci that are associated with or predictive of the phenotype. Selecting connected SNPs on SNP-SNP networks has proven successful in finding biologically interpretable and predictive SNPs. However, we argue that the connectedness constraint favors selecting redundant features that affect similar biological processes and therefore does not necessarily yield better predictive performance. In this paper, we propose a novel method called SPADIS that favors the selection of remotely located SNPs in order to account for their complementary effects in explaining a phenotype. SPADIS selects a diverse set of loci on a SNP-SNP network. This is achieved by maximizing a submodular set function with a greedy algorithm that ensures a constant-factor approximation to the optimal solution. We compare SPADIS to the state-of-the-art method SConES on a dataset of Arabidopsis thaliana with continuous flowering time phenotypes. SPADIS has better average phenotype prediction performance in 15 out of 17 phenotypes when the same number of SNPs is selected, and on average provides consistent improvements across multiple networks and settings. Moreover, it identifies more candidate genes and runs faster. [en_US]
dc.description.provenance: Submitted by Evrim Ergin (eergin@bilkent.edu.tr) on 2022-01-27T13:30:40Z No. of bitstreams: 1 SPADIS_An_algorithm_for_selecting_predictive_and_diverse_SNPs_in_GWAS.pdf: 1851735 bytes, checksum: 1f8de869224fa04767d479908f718e2b (MD5) [en]
dc.description.provenance: Made available in DSpace on 2022-01-27T13:30:40Z (GMT). No. of bitstreams: 1 SPADIS_An_algorithm_for_selecting_predictive_and_diverse_SNPs_in_GWAS.pdf: 1851735 bytes, checksum: 1f8de869224fa04767d479908f718e2b (MD5) Previous issue date: 2021 [en]
dc.identifier.doi: 10.1109/TCBB.2019.2935437 [en_US]
dc.identifier.eissn: 1557-9964 [en_US]
dc.identifier.issn: 1545-5963 [en_US]
dc.identifier.uri: http://hdl.handle.net/11693/76848 [en_US]
dc.language.iso: English [en_US]
dc.publisher: IEEE [en_US]
dc.relation.isversionof: https://doi.org/10.1109/TCBB.2019.2935437 [en_US]
dc.source.title: IEEE/ACM Transactions on Computational Biology and Bioinformatics [en_US]
dc.subject: Phenotype prediction [en_US]
dc.subject: GWAS [en_US]
dc.subject: SNP selection [en_US]
dc.subject: SNP-SNP networks [en_US]
dc.subject: Hi-C [en_US]
dc.subject: Submodular function [en_US]
dc.title: SPADIS: An algorithm for selecting predictive and diverse SNPs in GWAS [en_US]
dc.type: Article [en_US]

Files

Original bundle

Name: SPADIS_An_algorithm_for_selecting_predictive_and_diverse_SNPs_in_GWAS.pdf
Size: 1.77 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.69 KB
Format: Item-specific license agreed upon to submission