Background: This study aimed to enable early and accurate diagnosis of lung cancer and long-term monitoring of the therapeutic response.
Methods: We downloaded the GSE20189 dataset from the GEO database as the analysis data and human lung adenocarcinoma RNA-seq transcriptome expression data from the TCGA database as the validation data. Expression values for all genes were z-score normalized. We used ANOVA to identify differentially expressed genes specific to each stage, as well as the intersection of these stage-specific gene sets. Two methods, correlation analysis and co-expression network analysis, were used to compare the expression patterns and topological properties of each stage. Using a functional quantification algorithm, we evaluated the functional level of each significantly enriched biological function at each stage. A machine-learning algorithm was used to select significant functions as features and to build an early diagnosis model. Finally, survival analysis was used to verify the correlation between patient outcome and the biomarkers that we identified.
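As an illustration of the first two analysis steps (normalization, then ANOVA across stages), the following is a minimal sketch and not the authors' implementation. It assumes the normalization is a per-gene z-score standardization and that stage-specific genes are ranked by a one-way ANOVA F statistic. The function names `zscore` and `anova_f`, the stage labels, and the toy expression values are all hypothetical.

```python
from statistics import mean, pstdev


def zscore(values):
    """Standardize one gene's expression values to mean 0 and unit variance.

    Assumption: population standard deviation is used; a constant gene
    is mapped to all zeros to avoid division by zero.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def anova_f(groups):
    """One-way ANOVA F statistic; `groups` holds one list of values per stage."""
    k = len(groups)                      # number of stages
    n = sum(len(g) for g in groups)      # total number of samples
    grand = mean([v for g in groups for v in g])
    # Variation of the stage means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    # Variation of the samples around their own stage mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Toy data for a single hypothetical gene: three samples per stage.
expression = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
stages = ["I"] * 3 + ["II"] * 3 + ["III"] * 3

z = zscore(expression)
groups = [[v for v, s in zip(z, stages) if s == label] for label in ("I", "II", "III")]
f_stat = anova_f(groups)
```

A large F statistic indicates that the gene's mean expression differs between stages more than it varies within them. Turning the statistic into a p-value requires the F distribution, for example via `scipy.stats.f_oneway`, and a real analysis would repeat this for every gene and correct for multiple testing.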
Results: We identified 12 significant biomarkers that could distinguish lung cancer patients at different risk levels. Patients carrying variations in these 12 genes also had poorer survival than patients without such variations.
Conclusions: We propose a new molecular-based noninvasive detection method. The method uses the expression of stage-specific gene sets in the peripheral blood of patients with lung cancer to quantify differences in functional level, enabling the early diagnosis and prediction of lung cancer.
Keywords: Diagnostic model; Early diagnosis; Lung adenocarcinoma.