Uncovering complementary sets of variants for predicting quantitative phenotypes

buir.contributor.author: Fakhouri, Mohamad
buir.contributor.orcid: Fakhouri, Mohamad|0000-0002-7507-0658
dc.citation.epage: 917 [en_US]
dc.citation.issueNumber: 4 [en_US]
dc.citation.spage: 908 [en_US]
dc.citation.volumeNumber: 38 [en_US]
dc.contributor.author: Yılmaz, S.
dc.contributor.author: Fakhouri, Mohamad
dc.contributor.author: Koyutürk, M.
dc.contributor.author: Çiçek, A. E.
dc.contributor.author: Taştan, Ö.
dc.date.accessioned: 2023-02-27T09:43:57Z
dc.date.available: 2023-02-27T09:43:57Z
dc.date.issued: 2021-12-02
dc.department: Department of Computer Engineering [en_US]
dc.description.abstract: Motivation: Genome-wide association studies show that variants in individual genomic loci alone are not sufficient to explain the heritability of complex, quantitative phenotypes. Many computational methods have been developed to address this issue by considering subsets of loci that can collectively predict the phenotype. This problem can be considered a challenging instance of feature selection in which the number of dimensions (loci that are screened) is much larger than the number of samples. While currently available methods can achieve decent phenotype prediction performance, they either do not scale to large datasets or have parameters that require extensive tuning. Results: We propose a fast and simple algorithm, Macarons, to select a small, complementary subset of variants by avoiding redundant pairs that are likely to be in linkage disequilibrium. Our method features two interpretable parameters that control the time/performance trade-off without requiring parameter tuning. In our computational experiments, we show that Macarons consistently achieves similar or better prediction performance than state-of-the-art selection methods while having a simpler premise and being at least two orders of magnitude faster. Overall, Macarons can seamlessly scale to the human genome with 10^7 variants in a matter of minutes while taking the dependencies between the variants into account. Availability and implementation: Macarons is available in Matlab and Python at https://github.com/serhan-yilmaz/macarons. [en_US]
dc.identifier.doi: 10.1093/bioinformatics/btab803 [en_US]
dc.identifier.eissn: 1367-4811 [en_US]
dc.identifier.issn: 1367-4803 [en_US]
dc.identifier.uri: http://hdl.handle.net/11693/111802 [en_US]
dc.language.iso: English [en_US]
dc.publisher: Oxford University Press [en_US]
dc.relation.isversionof: https://doi.org/10.1093/bioinformatics/btab803 [en_US]
dc.source.title: Bioinformatics [en_US]
dc.title: Uncovering complementary sets of variants for predicting quantitative phenotypes [en_US]
dc.type: Article [en_US]
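
The abstract describes selecting a small set of variants that are individually predictive while avoiding redundant, correlated pairs. The following is a minimal sketch of that general idea only, not the published Macarons algorithm or its API: it is a generic greedy filter, and the function name `greedy_complementary_select` and the parameters `k` and `corr_threshold` are hypothetical names introduced here for illustration. Absolute Pearson correlation is assumed as the stand-in measure for both relevance and redundancy.

```python
import numpy as np


def greedy_complementary_select(X, y, k, corr_threshold=0.8):
    """Greedily pick up to k columns of X (samples x variants).

    Illustrative sketch only (not the Macarons implementation): candidates are
    visited in order of decreasing |correlation| with y, and a candidate is
    skipped if its |correlation| with any already selected column reaches
    corr_threshold.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)

    # Center and scale columns so that dot products are Pearson correlations.
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    x_norm = np.linalg.norm(Xc, axis=0)
    x_norm[x_norm == 0] = 1.0  # constant columns get zero relevance, not NaN
    Xn = Xc / x_norm
    y_norm = np.linalg.norm(yc)
    relevance = np.abs(Xn.T @ yc) / (y_norm if y_norm > 0 else 1.0)

    selected = []
    for j in np.argsort(-relevance):
        if len(selected) == k:
            break
        # Redundancy check against the variants chosen so far.
        if selected and np.max(np.abs(Xn[:, selected].T @ Xn[:, j])) >= corr_threshold:
            continue
        selected.append(int(j))
    return selected
```

This sketch compares each candidate against every selected column, which is quadratic in the worst case; restricting the redundancy check to nearby loci would be one way to reduce that cost, but that choice is an assumption here and not taken from the record above.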

Files

Original bundle

Name: Uncovering_complementary_sets_of_variants_for_predicting_quantitative_phenotypes.pdf
Size: 3.32 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.69 KB
Format: Item-specific license agreed upon to submission