
Combinatorial Chemistry & High Throughput Screening


ISSN (Print): 1386-2073
ISSN (Online): 1875-5402

Research Article

Diagnosis and Prognosis of Non-small Cell Lung Cancer based on Machine Learning Algorithms

Author(s): Yiyi Zhou, Yuchao Dong*, Qinying Sun and Chen Fang

Volume 26, Issue 12, 2023

Published on: 15 February, 2023

Page: [2170 - 2183] Pages: 14

DOI: 10.2174/1386207326666230110115804



Background: Non-small cell lung cancer (NSCLC) has been the subject of intense scholarly debate. We aimed to identify potential biomarkers through bioinformatics analysis.

Methods: Three datasets were downloaded from the Gene Expression Omnibus (GEO) database. R software was used to screen differentially expressed genes (DEGs) and to analyze immune cell infiltration. Gene set enrichment analysis (GSEA) was used to identify functions and pathways that differed significantly between the two groups. Diagnostic markers were then investigated with multiple machine learning algorithms: least absolute shrinkage and selection operator (LASSO) and support vector machine-recursive feature elimination (SVM-RFE). Several online analysis platforms were used to explore the expression and prognostic value of the differential genes. Furthermore, western blotting was performed to test the effects of genes on cell proliferation in vitro.
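As an illustration of the LASSO step named in the Methods, the following is a minimal sketch of L1-penalized feature selection by coordinate descent. It is not the authors' R workflow: the gene names, expression values, penalty `lam`, and the use of a squared-error loss on 0/1 labels are all assumptions chosen for a toy example, and the SVM-RFE step is omitted.

```python
# Toy LASSO feature selection by coordinate descent (plain Python).
# All gene names and expression values below are hypothetical.

def standardize(col):
    """Scale a column to mean 0 and (population) standard deviation 1."""
    n = len(col)
    mean = sum(col) / n
    sd = (sum((v - mean) ** 2 for v in col) / n) ** 0.5
    return [(v - mean) / sd for v in col]

def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso_coordinate_descent(columns, y, lam, n_iter=100):
    """Minimize (1/2n)*||y - X b||^2 + lam*||b||_1.

    `columns` must be standardized and `y` centered, so each
    coordinate update reduces to a single soft-thresholding step.
    """
    n = len(y)
    beta = [0.0] * len(columns)
    resid = list(y)  # residual y - X b, with b = 0 initially
    for _ in range(n_iter):
        for j, col in enumerate(columns):
            # Correlation of feature j with the partial residual.
            rho = sum(c * r for c, r in zip(col, resid)) / n + beta[j]
            new = soft_threshold(rho, lam)
            delta = new - beta[j]
            if delta != 0.0:
                resid = [r - c * delta for r, c in zip(resid, col)]
                beta[j] = new
    return beta

# Hypothetical expression matrix: 4 tumor samples, then 4 normal samples.
expression = {
    "GENE_A": [6, 4, 6, 4, 4, 2, 4, 2],  # higher in tumor
    "GENE_B": [6, 6, 4, 4, 6, 6, 4, 4],  # unrelated to group
    "GENE_C": [6, 4, 4, 6, 6, 4, 4, 6],  # unrelated to group
    "GENE_D": [4, 6, 4, 6, 2, 4, 2, 4],  # higher in tumor
}
labels = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = tumor, 0 = normal

genes = list(expression)
columns = [standardize(expression[g]) for g in genes]
label_mean = sum(labels) / len(labels)
y = [v - label_mean for v in labels]

beta = lasso_coordinate_descent(columns, y, lam=0.1)
coef = dict(zip(genes, beta))
# Genes with a non-zero coefficient are kept as candidate markers.
selected = [g for g in genes if coef[g] != 0.0]
print(selected)
```

In a workflow like the one described, the genes retained here would then be intersected with those retained by SVM-RFE. A larger `lam` shrinks more coefficients to exactly zero and so keeps fewer genes.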

Results: We identified 181 DEGs shared by two datasets and selected nine diagnostic markers. These genes were also significantly overexpressed in the third dataset. Topoisomerase II alpha (TOP2A) was overexpressed in lung cancer and associated with a poor prognosis, which was confirmed using immunohistochemistry (IHC) and western blotting. Additionally, TOP2A showed a negative correlation with immune cells, such as CD8+ T cells, eosinophils and natural killer (NK) cells.

Conclusion: Collectively, for the first time, we applied multiple machine learning algorithms, online databases and in vitro experiments to show that TOP2A is a potential biomarker for lung adenocarcinoma and could facilitate the development of new treatment strategies.

Keywords: Bioinformatics, machine learning algorithms, TOP2A, immune infiltration, immune cells, lung cancer.


© 2024 Bentham Science Publishers