Title: DL-SMILES#: A Novel Encoding Scheme for Predicting Compound Protein Affinity Using Deep Learning
Volume: 25
Issue: 4
Author(s): Shudong Wang, Jiali Liu*, Mao Ding, Yijun Gao*, Dayan Liu, Qingyu Tian and Jinfu Zhu
Affiliation:
- College of Computer Science and Technology, China University of Petroleum (East China), Qingdao 266580, Shandong, China
- Department of Physiology, Shandong Provincial Key Laboratory of Pathogenesis and Prevention of Neurological Disorders and State Key Disciplines: Physiology, School of Basic Medicine, Qingdao University, Qingdao, China
Keywords:
Deep learning, drug repositioning, drug-target interactions, IC50 value, SMILES string, compound properties
Abstract: Introduction: Drug repositioning aims to screen for new drugs and therapeutic targets
among approved drugs and abandoned compounds that have already been identified as safe. This trend
is changing the landscape of drug development and establishing drug repositioning as a model for
developing new drugs. Over the past decade, machine learning methods have been applied to predict
compound-protein binding affinity, and deep learning has recently become prominent and achieved
strong performance. In these models, however, compounds are usually represented in a simple way,
typically by a molecular fingerprint or a single SMILES string.
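As an illustration of the simple representation mentioned above, the following minimal sketch encodes a SMILES string character by character as a padded integer sequence. It is not the encoding used in this work: the helper names `build_vocab` and `encode`, the padding index 0, and the two example molecules are assumptions made for illustration only.

```python
def build_vocab(smiles_list):
    """Map each distinct character to a positive integer; 0 is reserved for padding."""
    chars = sorted({ch for s in smiles_list for ch in s})
    return {ch: i + 1 for i, ch in enumerate(chars)}


def encode(smiles, vocab, max_len):
    """Convert a SMILES string to a fixed-length list of integer ids."""
    ids = [vocab[ch] for ch in smiles[:max_len]]  # truncate overly long strings
    return ids + [0] * (max_len - len(ids))       # right-pad with zeros


# Illustrative inputs (assumed, not taken from the paper): ethanol and benzene.
smiles_examples = ["CCO", "c1ccccc1"]
vocab = build_vocab(smiles_examples)
encoded = [encode(s, vocab, max_len=10) for s in smiles_examples]
```

A character-level scheme like this splits two-letter atoms such as Cl or Br into separate symbols and carries no information about compound properties, which is the kind of limitation a richer encoding is meant to address.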
Methods: In this study, we improve on previous work by proposing a novel representation,
named SMILES#, that recodes the SMILES string. This approach takes the properties of compounds
into account and achieves superior performance. We then propose a deep learning model that
combines recurrent neural networks with a convolutional neural network with an attention
mechanism, using both unlabeled and labeled data to jointly encode molecules and predict binding
affinity.
Results: Experimental results show that SMILES# with compound properties effectively
improves the accuracy of the model and reduces the root-mean-square (RMS) error on most data sets.
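For readers unfamiliar with the metric, the RMS error between measured and predicted affinities can be computed as in the sketch below. The function name `rmse` and the numeric values are illustrative assumptions, not results from this study.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length sequences of numbers."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical affinity values (e.g. on a log-scaled IC50 axis), for illustration only.
measured = [5.0, 6.0, 7.0]
predicted = [5.5, 6.0, 6.5]
error = rmse(measured, predicted)
```

A lower value indicates that predictions lie closer to the measured affinities on average, with large individual deviations penalized more heavily than small ones.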
Conclusion: We applied the method to related and unrelated compounds that share the same target, and the
experimental results demonstrate its effectiveness.