Title: Predicting COVID-19 Severity Integrating RNA-Seq Data Using Machine
Learning Techniques
Volume: 18
Issue: 3
Author(s): Javier Bajo-Morales, Daniel Castillo-Secilla, Luis Javier Herrera, Octavio Caba, Jose Carlos Prados and Ignacio Rojas*
Affiliation:
- Department of Computer Architecture and Technology, University of Granada, C.I.T.I.C., Periodista Rafael Gómez
Montero, 2, 18014, Granada, Spain
Keywords:
COVID-19, CDSS, severity, gene expression, machine learning, feature selection.
Abstract:
Background: A fundamental challenge in the fight against COVID-19 is the development of reliable and
accurate tools to predict disease progression in a patient. Such predictions can be extremely useful in
distinguishing hospitalized patients at higher risk of requiring ICU admission from patients with low severity.
How SARS-CoV-2 infection will evolve is still unclear.
Methods: A novel pipeline was developed that integrates RNA-Seq data from different databases to
obtain a gene-biomarker-based COVID-19 severity index using an artificial intelligence algorithm. Our
pipeline ensures robustness by applying cross-validation at several of its steps.
Results: CD93, RPS24, PSCA, and CD300E were identified as a COVID-19 severity gene signature.
Furthermore, using the obtained gene signature, an effective multi-class classifier capable of discriminating
between control, outpatient, inpatient, and ICU COVID-19 patients was optimized, achieving an
accuracy of 97.5%.
Conclusion: In this research, a new intelligent pipeline was implemented to develop a
specific gene signature that can detect disease severity in patients suffering from COVID-19. Our approach to
clinical decision support systems achieved excellent results, even when processing unseen samples. Our
system can be of great clinical utility for planning, organizing and managing human and
material resources, as well as for automatically classifying the severity of patients affected by COVID-19.
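
Illustrative sketch: The cross-validated, four-class evaluation described in the Methods and Results can be outlined roughly as follows. This is only a minimal sketch under stated assumptions, not the authors' pipeline: the abstract does not name the classifier, so a simple nearest-centroid rule is swapped in; the expression values are synthetic and generated purely for illustration; and the helper names (make_synthetic, stratified_folds, fit_centroids, predict, cross_validate) are hypothetical. Only the four gene names and the four class labels come from the abstract.

```python
import random
from collections import defaultdict

CLASSES = ["control", "outpatient", "inpatient", "ICU"]   # labels from the abstract
GENES = ["CD93", "RPS24", "PSCA", "CD300E"]               # signature from the abstract


def make_synthetic(n_per_class=30, seed=0):
    """Generate toy expression vectors. These are NOT real RNA-Seq values."""
    rng = random.Random(seed)
    samples = []
    for k, label in enumerate(CLASSES):
        for _ in range(n_per_class):
            # Assumption: each class has a shifted mean, so the toy task is separable.
            x = [rng.gauss(3.0 * k, 0.5) for _ in GENES]
            samples.append((x, label))
    return samples


def stratified_folds(samples, n_folds=5, seed=0):
    """Split sample indices into folds that preserve the class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, (_, label) in enumerate(samples):
        by_class[label].append(i)
    folds = [[] for _ in range(n_folds)]
    for idxs in by_class.values():
        rng.shuffle(idxs)
        for j, i in enumerate(idxs):
            folds[j % n_folds].append(i)
    return folds


def fit_centroids(train):
    """Compute the mean expression vector of each class (nearest-centroid model)."""
    sums, counts = {}, defaultdict(int)
    for x, label in train:
        prev = sums.get(label, [0.0] * len(x))
        sums[label] = [s + v for s, v in zip(prev, x)]
        counts[label] += 1
    return {label: [s / counts[label] for s in vec] for label, vec in sums.items()}


def predict(centroids, x):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    return min(centroids,
               key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])))


def cross_validate(samples, n_folds=5, seed=0):
    """Return the mean held-out accuracy over stratified folds."""
    accuracies = []
    for test_idx in stratified_folds(samples, n_folds, seed):
        held_out = set(test_idx)
        train = [s for i, s in enumerate(samples) if i not in held_out]
        centroids = fit_centroids(train)
        correct = sum(predict(centroids, samples[i][0]) == samples[i][1]
                      for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / len(accuracies)


mean_accuracy = cross_validate(make_synthetic())
print(f"Mean held-out accuracy on synthetic data: {mean_accuracy:.3f}")
```

The accuracy printed here reflects only the artificial separation built into the synthetic data and says nothing about the 97.5% reported above. Stratifying the folds keeps all four severity classes represented in every held-out set, which matters when some classes, such as ICU patients, are likely to be scarce.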