Deepside: predicting drug side effects with deep learning
Drug failures due to unforeseen adverse effects in clinical trials pose health risks for participants and cause substantial financial losses. Side effect prediction algorithms have the potential to guide the drug design process. The LINCS L1000 dataset provides a vast resource of gene expression profiles across different cell lines, induced with different dosages and taken at different time points. The state-of-the-art approach in the literature relies on the high-quality experiments in LINCS L1000 and discards a large portion of the recorded experiments. In this study, we investigate whether more information can be extracted from the discarded experiments with a deep learning-based approach. We experiment with six deep learning architectures that use (i) gene expression data from the LINCS L1000 project, (ii) chemical structure fingerprints of drugs, (iii) SMILES string representations of drug structure, and (iv) the atomic structure of the drug molecules. The multilayer perceptron (MLP)-based model that uses chemical structure and gene expression features achieves 88% micro-AUC and 79% macro-AUC, outperforming state-of-the-art studies on side effect prediction. We observe that chemical structure is more predictive than the gene expression profiles, despite the fact that the features are extracted with different deep learning models. Finally, the convolutional neural network-based model that uses only the SMILES strings of the drugs achieves 82% macro-AUC and 88% micro-AUC, outperforming the models that use gene expression and chemical structure features simultaneously.
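Because the results above are reported as both micro-AUC and macro-AUC, a minimal sketch of how the two averages differ in a multi-label setting may help. It assumes the standard definitions (macro: mean of per-side-effect AUCs; micro: one AUC over all pooled drug/side-effect pairs). The function names and the toy labels and scores are hypothetical and are not taken from the study.

```python
from typing import List, Sequence


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC is undefined unless both classes are present")
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_auc(Y: List[List[int]], S: List[List[float]]) -> float:
    """Mean of the AUCs computed separately for each side effect (column)."""
    n_labels = len(Y[0])
    per_label = [
        roc_auc([row[j] for row in Y], [row[j] for row in S])
        for j in range(n_labels)
    ]
    return sum(per_label) / n_labels


def micro_auc(Y: List[List[int]], S: List[List[float]]) -> float:
    """Single AUC over all (drug, side effect) pairs pooled together."""
    flat_true = [t for row in Y for t in row]
    flat_score = [s for row in S for s in row]
    return roc_auc(flat_true, flat_score)


# Toy data (invented): 4 drugs x 2 side effects.
Y = [[1, 0], [0, 1], [1, 0], [0, 0]]
S = [[0.9, 0.1], [0.6, 0.3], [0.8, 0.2], [0.7, 0.05]]

macro = macro_auc(Y, S)
micro = micro_auc(Y, S)
print(f"macro-AUC = {macro:.3f}, micro-AUC = {micro:.3f}")
```

In this toy case each side effect is ranked perfectly on its own, so the macro-AUC is 1.0, but the scores for the two side effects lie on different scales, so pooling them lowers the micro-AUC to 13/15. The two numbers can therefore diverge in either direction, which is why both are commonly reported.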