Nonalcoholic fatty liver disease (NAFLD) detection and deep learning in a Chinese community-based population

Eur Radiol. 2023 Aug;33(8):5894-5906. doi: 10.1007/s00330-023-09515-1. Epub 2023 Mar 9.

Abstract

Objectives: We aimed to develop and validate a deep learning system (DLS) that uses an auxiliary section to extract and output specific ultrasound diagnostic features, thereby improving the explainability and clinical relevance of DLS-based NAFLD detection.

Methods: In a community-based study of 4144 participants who underwent abdominal ultrasound scans in Hangzhou, China, we sampled 928 participants (617 [66.5%] females, mean age: 56 years ± 13 [standard deviation]; 2 images per participant) to develop and validate the DLS, a two-section neural network (2S-NNet). Radiologists' consensus diagnosis classified hepatic steatosis as none, mild, moderate, or severe. We also explored the NAFLD detection performance of six one-section neural network models and five fatty liver indices on our data set. We further used logistic regression to evaluate the influence of participants' characteristics on the correctness of 2S-NNet.

Results: The area under the receiver operating characteristic curve (AUROC) of 2S-NNet for hepatic steatosis was 0.90 for ≥ mild, 0.85 for ≥ moderate, and 0.93 for severe steatosis; for NAFLD, it was 0.90 for NAFLD presence, 0.84 for moderate to severe NAFLD, and 0.93 for severe NAFLD. The AUROC for NAFLD severity was 0.88 for 2S-NNet and 0.79-0.86 for the one-section models. The AUROC for NAFLD presence was 0.90 for 2S-NNet and 0.54-0.82 for the fatty liver indices. Age, sex, body mass index, diabetes, fibrosis-4 index, android fat ratio, and skeletal muscle measured via dual-energy X-ray absorptiometry had no significant impact on the correctness of 2S-NNet (p > 0.05).
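The threshold-wise AUROCs reported above (≥ mild, ≥ moderate, severe) can be illustrated with a minimal sketch. It assumes a single scalar score per participant and uses toy grades and scores; the function names, the `GRADES` mapping, and the data are hypothetical and are not taken from the study, whose model outputs and evaluation code are not described here. The AUROC is computed in its rank-based form, as the probability that a randomly chosen positive case scores higher than a randomly chosen negative case.

```python
from typing import Sequence

# Ordinal coding of the radiologists' consensus grades (hypothetical mapping).
GRADES = {"none": 0, "mild": 1, "moderate": 2, "severe": 3}


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Rank-based AUROC: P(score of positive > score of negative), ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def auroc_at_threshold(
    grades: Sequence[str], scores: Sequence[float], min_grade: str
) -> float:
    """Binarize ordinal grades at `min_grade` (e.g. >= mild) and compute the AUROC."""
    cutoff = GRADES[min_grade]
    labels = [1 if GRADES[g] >= cutoff else 0 for g in grades]
    return auroc(labels, scores)


# Toy data for illustration only; not study data.
grades = ["none", "none", "mild", "moderate", "severe", "mild"]
scores = [0.10, 0.40, 0.35, 0.80, 0.95, 0.60]

for cut in ("mild", "moderate", "severe"):
    print(f">= {cut}: AUROC = {auroc_at_threshold(grades, scores, cut):.3f}")
```

The pairwise loop is quadratic in sample size, which is acceptable for a sketch; a sort-based implementation would be preferable for larger cohorts.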

Conclusions: With its two-section design, 2S-NNet improved performance for detecting NAFLD and offered more explainable, clinically relevant utility than one-section designs.

Key points:
  • Based on radiologists' consensus review, our DLS (2S-NNet) achieved an AUROC of 0.88 with its two-section design, outperforming one-section designs in detecting NAFLD while offering more explainable, clinically relevant utility.
  • The 2S-NNet outperformed five fatty liver indices, with higher AUROCs (0.84-0.93 vs. 0.54-0.82) for screening different NAFLD severities, suggesting that deep learning-based radiology may perform better than blood biomarker panels as a screening tool in epidemiology.
  • The correctness of 2S-NNet was not significantly influenced by participants' characteristics, including age, sex, body mass index, diabetes, fibrosis-4 index, android fat ratio, and skeletal muscle measured via dual-energy X-ray absorptiometry.

Keywords: Convolutional neural networks; Deep learning; Fatty liver indices; Nonalcoholic fatty liver disease; Ultrasound imaging.

Publication types

  • Validation Study

MeSH terms

  • Adult
  • Aged
  • Deep Learning*
  • East Asian People
  • Female
  • Fibrosis
  • Humans
  • Liver / diagnostic imaging
  • Male
  • Middle Aged
  • Non-alcoholic Fatty Liver Disease* / diagnostic imaging
  • Non-alcoholic Fatty Liver Disease* / epidemiology
  • Ultrasonography