Title: Identification of Potential Genes and Critical Pathways in Postoperative
Recurrence of Crohn’s Disease by Machine Learning And WGCNA
Network Analysis
Volume: 24
Issue: 2
Author(s): Aruna Rajalingam, Kanagaraj Sekar and Anjali Ganjiwale*
Affiliation:
- Department of Life Sciences, Bangalore University, Bangalore, Karnataka, 560056, India
Keywords:
Crohn's disease, postoperative recurrence, protein-protein interaction (PPI) network, diagnostic biomarker, remission, diagnosis.
Abstract:
Background: Crohn's disease (CD) is a chronic idiopathic inflammatory bowel disease that can affect
any part of the gastrointestinal tract, from the mouth to the anus. Patients often experience
alternating periods of symptomatic relapse and remission. A symptomatic recurrence rate of 20-30% is
reported in the first year after surgery, with a 10% increase in each subsequent year. Surgery is
therefore performed only to relieve symptoms and does not cure the disease. The determinants and
genetic factors of disease recurrence are also not well defined. Therefore, improved diagnostic
efficiency and prognostic outcomes are critical for confronting CD recurrence.
Methods: We analysed ileal mucosa samples collected from the neo-terminal ileum six months after surgery
(M6 = 121 samples) from a Crohn's disease dataset (GSE186582). The primary aim of this study was to identify
potential genes and critical pathways in the postoperative recurrence of Crohn’s disease. We combined
differential gene expression analysis with recursive feature elimination (RFE), a machine learning
approach, to obtain five critical genes for the postoperative recurrence of Crohn's disease. The features
(genes) selected by the different methods were validated using five binary classifiers for recurrence and
remission samples: logistic regression (LR), decision tree classifier (DT), support vector machine
(SVM), random forest classifier (RF), and K-nearest neighbor (KNN), with 10-fold cross-validation. We
also performed weighted gene co-expression network analysis (WGCNA) to select specific modules and
feature genes associated with Crohn's disease postoperative recurrence, smoking, and biological sex.
Combined with other biological interpretations, including Gene Ontology (GO) analysis, pathway enrichment,
and protein-protein interaction (PPI) network analysis, our study sheds light on in-depth
research into CD diagnosis and prognosis in postoperative recurrence.
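The feature selection and validation steps described in the Methods can be illustrated with a minimal Python sketch using scikit-learn. It is not the authors' pipeline. The synthetic data (standing in for the GSE186582 expression matrix), the placeholder gene names, the use of logistic regression as the estimator inside RFE, and all hyperparameters are assumptions made for illustration only.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Assumption: synthetic stand-in for 121 samples x 50 candidate genes
# (e.g. differentially expressed genes); labels are recurrence vs remission.
X, y = make_classification(n_samples=121, n_features=50, n_informative=5,
                           n_redundant=0, random_state=0)
gene_names = np.array([f"gene_{i}" for i in range(X.shape[1])])  # hypothetical names

# Recursive feature elimination: repeatedly fit the estimator and drop the
# weakest feature until five remain. The inner estimator is an assumption.
selector = RFE(LogisticRegression(max_iter=1000), n_features_to_select=5, step=1)
selector.fit(StandardScaler().fit_transform(X), y)
selected_genes = gene_names[selector.support_]
X_sel = X[:, selector.support_]

# The five binary classifiers named in the abstract, with default-like settings.
classifiers = {
    "LR": LogisticRegression(max_iter=1000),
    "DT": DecisionTreeClassifier(random_state=0),
    "SVM": SVC(kernel="rbf"),
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
    "KNN": KNeighborsClassifier(n_neighbors=5),
}

# 10-fold stratified cross-validation on the selected features.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scores = {
    name: cross_val_score(make_pipeline(StandardScaler(), clf),
                          X_sel, y, cv=cv, scoring="accuracy").mean()
    for name, clf in classifiers.items()
}

print("Selected features:", list(selected_genes))
for name, acc in scores.items():
    print(f"{name}: mean 10-fold accuracy = {acc:.3f}")
```

For brevity, this sketch selects features on the full dataset before cross-validation, which can inflate accuracy estimates. A stricter variant would place the RFE step inside the pipeline so that selection is repeated within each training fold.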
Results: PLOD2, ZNF165, BOK, CX3CR1, and ARMCX4 are the important genes identified by
the machine learning approach. These genes are reported to be involved in viral protein interaction
with cytokine and cytokine receptors, lysine degradation, and apoptosis. They are also linked with various
cellular and molecular functions, such as peptidyl-lysine hydroxylation, central nervous system
maturation, G protein-coupled chemoattractant receptor activity, BCL-2 homology (BH) domain binding,
gliogenesis, and negative regulation of mitochondrial depolarization. WGCNA identified a gene
co-expression module that was primarily involved in mitochondrial translational elongation, mitochondrial
translational termination, mitochondrial translation, the mitochondrial respiratory chain complex,
and mRNA splicing via the spliceosome, among other pathways. Both analyses indicate that the mitochondrial
depolarization pathway is linked with CD recurrence, leading to oxidative stress that promotes
inflammation in CD patients.
Conclusion: These key genes serve as novel diagnostic biomarkers for the postoperative recurrence
of Crohn’s disease. Thus, alongside the treatment options currently available, these biomarkers would
support both diagnosis and prognosis, with the aim of achieving long-lasting remission and preventing
further complications in CD.