DL-SMILES#: A Novel Encoding Scheme for Predicting Compound
Protein Affinity Using Deep Learning

Shudong Wang; Jiali Liu; Mao Ding; Yijun Gao; Dayan Liu; Qingyu Tian; Jinfu Zhu

doi:10.2174/1386207324666210219102728

Abstract

Introduction: Drug repositioning aims to screen for new drugs and therapeutic targets among approved drugs and abandoned compounds that have been identified as safe. This trend is changing the landscape of drug development and establishing repositioning as a model for developing new drugs. Over the past decade, machine learning methods have been applied to predict compound-protein binding affinity, and deep learning has recently become prominent and achieved strong performance. In most of these models, however, the compound representation is simple: typically a molecular fingerprint or a single SMILES string.

Methods: In this study, we improve on previous work by proposing a novel representation, named SMILES#, that re-encodes the SMILES string. This approach takes the properties of compounds into account and achieves superior performance. We then propose a deep learning model that combines recurrent neural networks with a convolutional neural network with an attention mechanism, using both unlabeled and labeled data to jointly encode molecules and predict binding affinity.

Results: Experimental results show that SMILES# with compound properties effectively improves the accuracy of the model and reduces the root-mean-square (RMS) error on most data sets.
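The abstract does not define the RMS error; it is presumably the standard root-mean-square error between predicted and measured affinity values. The following is a minimal sketch under that assumption. The function name `rmse` and the sample values are hypothetical illustrations and are not taken from the paper.

```python
import math


def rmse(predicted, measured):
    """Root-mean-square error between two equal-length sequences of numbers."""
    if len(predicted) != len(measured):
        raise ValueError("predicted and measured must have the same length")
    if not predicted:
        raise ValueError("at least one value pair is required")
    # Average the squared differences, then take the square root.
    squared_errors = [(p - m) ** 2 for p, m in zip(predicted, measured)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical affinity values (for example, on a log IC50 scale);
# these numbers are made up for illustration only.
measured = [6.0, 7.0, 8.0]
predicted = [6.5, 7.0, 7.5]
error = rmse(predicted, measured)
print(f"RMS error: {error:.3f}")
```

A lower value means the predictions lie closer to the measurements on average, which is the sense in which a reduced RMS error indicates a better model.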

Conclusion: We used the method to verify related and unrelated compounds for the same target, and the experimental results demonstrate its effectiveness.

Keywords: Deep learning, drug repositioning, drug-target interactions, IC50 value, SMILES string, compound properties




DOI: https://dx.doi.org/10.2174/1386207324666210219102728
Print ISSN: 1386-2073
Online ISSN: 1875-5402
Publisher Name: Bentham Science Publisher

Combinatorial Chemistry & High Throughput Screening
