Current Proteomics

ISSN (Print): 1570-1646
ISSN (Online): 1875-6247

Research Article

Predicting the Secondary Structure of Proteins: A Deep Learning Approach

Author(s): Charu Kathuria, Deepti Mehrotra* and Navnit Kumar Misra

Volume 19, Issue 5, 2022

Published on: 01 November, 2022

Page: [400 - 411] Pages: 12

DOI: 10.2174/1570164619666221010100406

Abstract

Background: The development of deep learning architectures has extended the reach of the machine learning paradigm. Deep learning is widely applied to complex problems and has achieved significant results in applications such as protein structure prediction, speech recognition, traffic management, and health diagnostic systems. In particular, convolutional neural networks (CNNs) have revolutionized visual data processing tasks.

Objective: Protein structure is an important research topic across domains ranging from medical science and health care to drug design. Fourier transform infrared (FTIR) spectroscopy is a leading tool for protein structure determination. This review surveys the deep learning approaches proposed in the literature for predicting the secondary structure of proteins, and it develops a conceptual relation between FTIR spectra images and deep learning models for that prediction.

Methods: Various pre-trained CNN models are identified and interpreted in order to correlate FTIR images of proteins, which contain Amide-I and Amide-II absorbance values, with the secondary structure of those proteins.
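The abstract does not say how the FTIR images are produced, so the following is only a minimal sketch of one plausible route: crop a spectrum to the Amide-II and Amide-I region and draw the curve into a small pixel grid that an image model could consume. The band limits (roughly 1500 to 1700 cm^-1), the synthetic two-peak spectrum, the image size, and all function names are assumptions made for illustration, not details taken from the article.

```python
"""Sketch: turn an FTIR spectrum into a small binary image (assumed pipeline)."""
import math

# Assumed, approximate band limits in cm^-1 (not values from the article).
AMIDE_II = (1500.0, 1600.0)
AMIDE_I = (1600.0, 1700.0)


def synthetic_spectrum():
    """Hypothetical spectrum: two Gaussian peaks sampled every 2 cm^-1."""
    wavenumbers = [1400.0 + 2.0 * i for i in range(201)]  # 1400 .. 1800
    absorbances = [
        1.0 * math.exp(-(((w - 1655.0) / 25.0) ** 2))    # Amide-I-like peak
        + 0.6 * math.exp(-(((w - 1545.0) / 25.0) ** 2))  # Amide-II-like peak
        for w in wavenumbers
    ]
    return wavenumbers, absorbances


def crop_band(wavenumbers, absorbances, low, high):
    """Keep only the points whose wavenumber lies in [low, high]."""
    return [(w, a) for w, a in zip(wavenumbers, absorbances) if low <= w <= high]


def rasterize(points, width=32, height=16):
    """Draw the (wavenumber, absorbance) curve into a height x width 0/1 grid."""
    ws = [w for w, _ in points]
    absorb = [a for _, a in points]
    w_min, a_min = min(ws), min(absorb)
    w_span = (max(ws) - w_min) or 1.0
    a_span = (max(absorb) - a_min) or 1.0
    image = [[0] * width for _ in range(height)]
    for w, a in points:
        col = round((w - w_min) / w_span * (width - 1))
        # Row 0 is the top of the image, so high absorbance maps to low row index.
        row = (height - 1) - round((a - a_min) / a_span * (height - 1))
        image[row][col] = 1
    return image


if __name__ == "__main__":
    wn, ab = synthetic_spectrum()
    band = crop_band(wn, ab, AMIDE_II[0], AMIDE_I[1])
    for row in rasterize(band):
        print("".join("#" if pixel else "." for pixel in row))
```

A real pipeline would more likely render plots at the input resolution the chosen CNN expects; the grid here is kept tiny so that the mapping from absorbance values to pixels is easy to inspect.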

Results: Transfer learning is incorporated using models such as Visual Geometry Group (VGG), Inception, ResNet, and EfficientNet. A dataset of protein spectra images is supplied as input, and these models predict the secondary structure of proteins effectively.
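The core mechanic of transfer learning mentioned above is that the reused layers stay frozen while only a new output layer is fitted to the target data. The sketch below illustrates just that mechanic in plain Python. It is not the authors' pipeline: the "backbone" is a hypothetical stand-in with fixed random weights rather than a real pre-trained VGG or ResNet base, the data are synthetic, and the target is an invented "helix fraction".

```python
"""Sketch: freeze a feature extractor, train only a new linear head."""
import random


def make_frozen_backbone(n_inputs=4, n_features=3, seed=0):
    """Stand-in for a pre-trained convolutional base: weights are never updated."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]
               for _ in range(n_features)]

    def extract(x):
        # ReLU over fixed linear projections of the input.
        return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in weights]

    return extract, weights


def make_toy_data(n_samples=40, n_inputs=4, seed=1):
    """Synthetic inputs in [0, 1) with a hypothetical 'helix fraction' target."""
    rng = random.Random(seed)
    inputs = [[rng.random() for _ in range(n_inputs)] for _ in range(n_samples)]
    targets = [0.2 + 0.6 * x[0] for x in inputs]
    return inputs, targets


def predict(head, bias, feature_row):
    """Output of the new linear head for one feature vector."""
    return bias + sum(h * f for h, f in zip(head, feature_row))


def half_mse(head, bias, features, targets):
    """Half of the mean squared error of the head on the given data."""
    errors = [predict(head, bias, fs) - t for fs, t in zip(features, targets)]
    return 0.5 * sum(e * e for e in errors) / len(errors)


def train_head(features, targets, epochs=300, lr=0.01):
    """Full-batch gradient descent on the head only; the backbone is untouched."""
    head = [0.0] * len(features[0])
    bias = 0.0
    n = len(features)
    for _ in range(epochs):
        errors = [predict(head, bias, fs) - t for fs, t in zip(features, targets)]
        grad_head = [sum(e * fs[j] for e, fs in zip(errors, features)) / n
                     for j in range(len(head))]
        grad_bias = sum(errors) / n
        head = [h - lr * g for h, g in zip(head, grad_head)]
        bias -= lr * grad_bias
    return head, bias


if __name__ == "__main__":
    extract, _ = make_frozen_backbone()
    inputs, targets = make_toy_data()
    features = [extract(x) for x in inputs]  # computed once, backbone frozen
    head, bias = train_head(features, targets)
    print("loss before:", half_mse([0.0] * len(head), 0.0, features, targets))
    print("loss after: ", half_mse(head, bias, features, targets))
```

With a real pre-trained network, the same pattern applies: features come from the frozen convolutional base, and only the replacement output layers are trained on the spectra images.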

Conclusion: Deep learning has only recently been explored in this field of research. It performed remarkably well in this application, and it will need continued improvement as new models are developed.

Keywords: Deep learning, transfer learning, pre-trained models, pre-processing, Fourier transform infrared spectroscopy, secondary structure.


© 2024 Bentham Science Publishers