Title: Recent Advances in Predicting ncRNA-Protein Interactions Based on Machine Learning
Volume: 1
Issue: 5
Author(s): Jingjing Wang, Yanpeng Zhao, Xiaoqian Huang, Yi Shi and Jianjun Tan*
Affiliation:
- Department of Biomedical Engineering, Faculty of Environment and Life, Beijing University of Technology, Beijing International Science and Technology Cooperation Base for Intelligent Physiological Measurement and Clinical Transformation, Beijing 100124, China
Keywords:
ncRNA-protein interactions, ncRNA-protein interaction databases, classical datasets, sequence encoding methods, conventional machine learning and deep learning, genome sequencing.
Abstract: Non-coding RNAs (ncRNAs) play significant roles in various physiological and pathological
processes by interacting with proteins. The existing experimental methods for identifying
ncRNA-protein interactions are costly and time-consuming. Therefore, an increasing number
of machine learning models have been developed to efficiently predict ncRNA-protein interactions
(ncRPIs), including shallow machine learning and deep learning models, which have markedly
advanced the identification of ncRPIs. In this review, we provided an overview of the
recent advances in various machine learning methods for predicting ncRPIs, mainly focusing on
ncRNA-protein interaction databases, classical datasets, ncRNA/protein sequence encoding methods,
conventional machine learning-based models, deep learning-based models, and the two integration-based
models. Furthermore, we compared the reported accuracy of these approaches and discussed
the potential and limitations of deep learning applications in ncRPI prediction. We found that
integrated deep learning models gave the best predictive performance, and that deep learning-based
methods do not always perform better than shallow machine learning-based methods. We discussed the potential
of using deep learning and proposed a research approach based on existing studies. We
believe that models based on integrated deep learning could achieve higher prediction
accuracy if substantial experimental data become available in the near future.