Family history of cancer and lung cancer: Utility of big data and artificial intelligence for exploring the role of genetic risk

Lung Cancer. 2024 Sep:195:107920. doi: 10.1016/j.lungcan.2024.107920. Epub 2024 Aug 9.

Abstract

Objectives: Lung cancer (LC) is a multifactorial disease in which the role of genetic susceptibility has become increasingly relevant. Our aim was to use artificial intelligence (AI) to analyze differences between patients with LC according to their family history of cancer (FHC).

Materials and methods: From August 2016 to June 2020, clinical information was obtained from the Thoracic Tumors Registry (TTR), a nationwide database sponsored by the Spanish Lung Cancer Group. In addition to descriptive statistical analysis, an AI-assisted analysis was performed. The German Technical Information Library supported the merging of data from the electronic medical records and the TTR database. The results of the AI-assisted analysis were reported using a Knowledge Graph, a Unified Schema, and descriptive and predictive analyses.

Results: Analyses were performed in two phases. The first was a conventional statistical analysis of 11,684 patients, of whom 5,806 had FHC. In this population, median overall survival (OS) was 23 months (95 % CI: 21.39-24.61) in patients with FHC versus 21 months (95 % CI: 19.53-22.48) in patients without FHC (NFHC), p < 0.001. The second, AI-assisted analysis included 5,788 patients, of whom 939 had FHC. Among women with FHC, 58.48 % had LC. In addition, 9.53 % of patients had an EGFR or HER2 mutation or ALK translocation and at least one relative with cancer. A family history of LC was associated with an increased risk of smoking-related LC. Non-smokers with a family history of LC and non-small cell lung cancer (NSCLC) were more likely to have an EGFR mutation. In Bayesian network analysis, 55 % of never-smokers with a family history of LC had an EGFR mutation.

Conclusion: In our population, the incidence of LC among patients with FHC is higher in women and in younger patients. FHC is a risk factor for and predictor of LC development, especially in people aged ≤ 50 years. These results were confirmed by both conventional statistics and AI-assisted analysis.

Keywords: Cancer registry; Family history; Lung cancer; Spain.

MeSH terms

  • Adult
  • Aged
  • Artificial Intelligence*
  • Big Data*
  • Female
  • Genetic Predisposition to Disease*
  • Humans
  • Lung Neoplasms* / epidemiology
  • Lung Neoplasms* / genetics
  • Lung Neoplasms* / mortality
  • Male
  • Middle Aged
  • Registries
  • Risk Factors