Objective: To establish an artificial intelligence (AI)-assisted diagnostic system for lung cancer based on deep transfer learning and to evaluate its value in practice. Methods: A total of 519 lung pathology slides archived from 2016 to 2019 at Beijing Chest Hospital, Capital Medical University, were collected, covering normal lung tissue, adenocarcinoma, squamous cell carcinoma and small cell carcinoma. The slides were digitized with a scanner; 316 were used as the training set and 203 as the internal test set. Pathologists annotated all training slides, and a semantic segmentation model based on DeepLab v3 with ResNet-50 was built to detect lung cancer at the pixel level. For transfer learning, the parameters of a gastric cancer detection model were used to initialize the deep neural network, which was then fine-tuned on the annotated lung slides. The model was evaluated on the 203 slides of the internal test set and on 1,081 slides obtained from The Cancer Imaging Archive (TCIA) database, which served as the external test set. Results: With a relatively small training set, the model trained with transfer learning was more accurate on the internal test set than the model trained from scratch [area under the curve (AUC) 0.988 vs. 0.971; Kappa 0.852 vs. 0.832]. On the external test set, the transfer learning model achieved an AUC of 0.968 and a Kappa of 0.828, indicating good generalization. Examining the predictions made by the model also gave the researchers a better understanding of its behavior. Conclusions: The AI-assisted histopathological diagnostic system for lung cancer shows good accuracy and good generalization to external data. As research on AI in pathology advances, transfer learning can shorten the training period of diagnostic models and improve their accuracy.
Keywords: Artificial intelligence; Diagnosis, differential; Lung neoplasms; Transfer learning.
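The two metrics reported in the Results, AUC and Kappa, can be illustrated with a minimal sketch. The abstract does not say how the slide-level scores or labels were derived, so the function names, the toy data and the 0.5 decision threshold below are assumptions made purely for illustration, not the study's evaluation code. AUC is computed here in its pairwise (Mann-Whitney) form, and Kappa is taken to mean Cohen's kappa.

```python
from collections import Counter


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def cohen_kappa(truth, pred):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e), valid for two or more classes."""
    n = len(truth)
    # Observed agreement between reference labels and predictions.
    p_o = sum(t == p for t, p in zip(truth, pred)) / n
    # Agreement expected by chance from the two marginal distributions.
    truth_counts, pred_counts = Counter(truth), Counter(pred)
    p_e = sum(truth_counts[c] * pred_counts[c] for c in truth_counts) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical slide-level data: 1 = cancer, 0 = normal.
toy_labels = [0, 0, 1, 1, 1, 0]
toy_scores = [0.10, 0.40, 0.35, 0.80, 0.90, 0.20]
# Assumed decision threshold of 0.5 for turning scores into predictions.
toy_preds = [1 if s >= 0.5 else 0 for s in toy_scores]

print(f"AUC   = {roc_auc(toy_labels, toy_scores):.3f}")
print(f"Kappa = {cohen_kappa(toy_labels, toy_preds):.3f}")
```

AUC depends only on how the scores rank the cases, whereas Kappa depends on the final predicted classes and corrects the raw agreement for chance, which is why the two values can differ noticeably for the same model.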