Title: Identify Diabetes-related Targets based on ForgeNet_GPC
Volume: 20
Issue: 7
Author(s): Bin Yang, Linlin Wang*, Wenzheng Bao*
Affiliation:
- School of Information Science and Engineering, Zaozhuang University, Zaozhuang, 277160, China
- School of Information and Electrical Engineering, Xuzhou University of Technology, Xuzhou, 221018, China
Keywords:
Protein, target, feature extraction, classification, diabetes, polygenic genetic diseases.
Abstract:
Background: Research on potential therapeutic targets and new mechanisms of action
can greatly improve the efficiency of new drug development.
Aims: Polygenic genetic diseases, such as diabetes, are caused by the interaction of multiple
gene loci and environmental factors.
Objectives: In this study, a disease target identification algorithm based on protein recognition is
proposed.
Materials and Methods: In this method, targets related and unrelated to the treatment of diabetes are
collected from literature databases. The transcribed proteins corresponding to each target are
queried in order to construct a protein dataset. Six protein feature extraction algorithms (AAC,
CKSAAGP, DDE, DPC, GAAP, and TPC) are utilized to obtain the feature vectors of each protein,
which are merged into the full feature vectors.
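
As a minimal sketch of the feature-merging step, the following Python code computes two of the six descriptors named above, AAC (amino acid composition) and DPC (dipeptide composition), and concatenates them into one vector. It assumes the standard definitions of these descriptors over the 20 canonical amino acids; the function names and the example sequence are hypothetical, and the remaining four descriptors (CKSAAGP, DDE, GAAP, TPC) would be appended in the same way.

```python
from itertools import product

# The 20 canonical amino acids, in a fixed order so that feature
# positions are comparable across proteins.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = [a + b for a, b in product(AMINO_ACIDS, repeat=2)]


def aac(seq):
    """Amino acid composition: fraction of each residue (20 values)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return [seq.count(a) / len(seq) for a in AMINO_ACIDS]


def dpc(seq):
    """Dipeptide composition: fraction of each adjacent pair (400 values)."""
    seq = seq.upper()
    if len(seq) < 2:
        raise ValueError("sequence must have at least two residues")
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = len(seq) - 1  # number of overlapping adjacent pairs
    for i in range(total):
        pair = seq[i:i + 2]
        if pair in counts:  # pairs with non-canonical residues are skipped
            counts[pair] += 1
    return [counts[d] / total for d in DIPEPTIDES]


def merged_features(seq):
    """Concatenate the per-descriptor vectors into one feature vector."""
    return aac(seq) + dpc(seq)


if __name__ == "__main__":
    vector = merged_features("MKTAYIAKQR")  # hypothetical sequence
    print(len(vector))  # 20 AAC values followed by 400 DPC values
```

Keeping a fixed residue order means every protein maps to a vector of the same length, which is what allows the per-descriptor vectors to be merged and passed to a single classifier.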
Results: A novel classifier (forgeNet_GPC) based on forgeNet and Gaussian process classifier
(GPC) is proposed to classify the proteins.
Conclusion: In forgeNet_GPC, forgeNet is utilized to select the important features, and GPC is
utilized to solve the classification problem. The experimental results reveal that forgeNet_GPC
outperforms 22 other classifiers in terms of ROC-AUC, PR-AUC, MCC, Youden Index, and
Kappa.
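
For reference, three of the reported metrics (MCC, Youden Index, and Kappa) can be computed directly from a binary confusion matrix. The sketch below uses their standard textbook definitions and is not taken from the authors' code; the function names and the example labels are hypothetical. ROC-AUC and PR-AUC are omitted because they require ranked prediction scores rather than hard labels.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels coded as 0 and 1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; 0.0 when a margin is empty."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def youden_index(tp, fp, tn, fn):
    """Youden's J = sensitivity + specificity - 1."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1


def cohen_kappa(tp, fp, tn, fn):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = tp + fp + tn + fn
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    return (observed - expected) / (1 - expected) if expected != 1 else 0.0


if __name__ == "__main__":
    y_true = [1, 1, 1, 1, 0, 0, 0, 0]  # hypothetical target labels
    y_pred = [1, 1, 1, 0, 0, 0, 0, 1]  # hypothetical predictions
    counts = confusion_counts(y_true, y_pred)
    print(mcc(*counts), youden_index(*counts), cohen_kappa(*counts))
```

All three metrics account for both classes, which makes them more informative than plain accuracy when related and unrelated targets are imbalanced.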