Purpose: To investigate the repeatability and reproducibility of lung segmentation and their impact on the quantitative outcomes from functional pulmonary MRI. Additionally, to validate an artificial neural network (ANN) to accelerate whole-lung quantification.
Method: Ten healthy children and 25 children with cystic fibrosis underwent matrix pencil decomposition MRI (MP-MRI). Impaired relative fractional ventilation (RFV) and relative perfusion (RQ) from MP-MRI were compared using whole-lung segmentation performed by a physician at two time-points (At1 and At2), by an MRI technician (B), and by an ANN (C). Repeatability and reproducibility were assessed with the Dice similarity coefficient (DSC), paired t-test, and intraclass correlation coefficient (ICC).
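For readers unfamiliar with the DSC, the following is a minimal sketch of how the overlap between two binary segmentation masks can be computed, DSC = 2|A ∩ B| / (|A| + |B|). It is not the authors' implementation: the function name `dice_similarity`, the flat-list mask representation, and the toy mask values are assumptions made purely for illustration.

```python
def dice_similarity(mask_a, mask_b):
    """Dice similarity coefficient between two binary masks.

    Masks are flat sequences of 0/1 (or False/True) voxel labels of equal
    length; this representation is an assumption made for illustration.
    """
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same number of voxels")
    size_a = sum(1 for a in mask_a if a)
    size_b = sum(1 for b in mask_b if b)
    overlap = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    if size_a + size_b == 0:
        # Convention assumed here: two empty masks agree perfectly.
        return 1.0
    return 2.0 * overlap / (size_a + size_b)


# Toy 1D "masks" standing in for flattened lung segmentations (made-up values).
observer_a = [0, 1, 1, 1, 1, 0, 0, 0]
observer_b = [0, 0, 1, 1, 1, 1, 0, 0]
print(dice_similarity(observer_a, observer_b))  # 2*3 / (4+4) = 0.75
```

A DSC of 1 indicates identical masks and 0 indicates no overlap.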
Results: The repeatability within an observer (At1 vs At2) resulted in a DSC of 0.94 ± 0.01 (mean ± SD) and an unsystematic difference of -0.01% for RFV (P = .92) and +0.1% for RQ (P = .21). The reproducibility between human observers (At1 vs B) resulted in a DSC of 0.88 ± 0.02 and a systematic absolute difference of -0.81% for RFV (P < .001) and -0.38% for RQ (P = .037). The reproducibility between a human observer and the ANN (At1 vs C) resulted in a DSC of 0.89 ± 0.03 and a systematic absolute difference of -0.36% for RFV (P = .017) and -0.35% for RQ (P = .002). The ICC was >0.98 for all variables and comparisons.
Conclusions: Despite high overall agreement, there were systematic differences in lung segmentation between observers. These differences need to be considered in longitudinal studies and could be overcome by using an ANN, which performs as well as human observers and fully automates MP-MRI post-processing.
Keywords: automated segmentation; functional lung MRI; inter-reader reproducibility; neural networks; pediatrics.
© 2020 International Society for Magnetic Resonance in Medicine.