Consensus about a standard segmentation method to derive metabolic tumor volume (MTV) in classical Hodgkin lymphoma (cHL) is lacking, and it is unknown how different segmentation methods influence quantitative PET features. Therefore, we aimed to evaluate the delineation of lesion borders, the completeness of lesion selection, and the need for manual adaptation with different segmentation methods, and to assess the influence of segmentation method on the prognostic value of MTV, intensity, and dissemination radiomics features in cHL patients.
Methods: We analyzed a total of 105 18F-FDG PET/CT scans from patients with newly diagnosed (n = 35) and relapsed/refractory (n = 70) cHL using 6 segmentation methods: 2 fixed thresholds (SUV4.0 and SUV2.5), 2 relative methods (41% of SUVmax [41max] and a contrast-corrected 50% of SUVpeak [A50P]), and 2 combination majority vote (MV) methods (MV2 and MV3). Segmentation quality was assessed by 2 reviewers on the basis of predefined quality criteria: completeness of selection, the need for manual adaptation, and delineation of lesion borders. Correlations and prognostic performance of the resulting radiomics features were compared among the methods.
Results: SUV4.0 required the least manual adaptation but tended to underestimate MTV and often missed small lesions with low 18F-FDG uptake. SUV2.5 most frequently included all lesions but required minor manual adaptations and generally overestimated MTV. In contrast, few lesions were missed when using 41max, A50P, MV2, and MV3, but these segmentation methods required extensive manual adaptation and overestimated MTV in most cases. MTV and dissemination features differed significantly among the methods. However, correlations among methods were high for MTV and for most intensity and dissemination features. Prognostic performance did not differ significantly among the methods for any feature.
Conclusion: Correlations among the segmentation methods were high for MTV, intensity, and most dissemination features, and prognostic performance was similar. Despite frequently missing small lesions with low 18F-FDG avidity, segmentation with a fixed threshold of SUV4.0 required the least manual adaptation, which is critical for future research and implementation in clinical practice. However, the importance of small lesions with low 18F-FDG avidity should be addressed in a larger cohort of cHL patients.
Keywords: 18F-FDG PET/CT; Hodgkin lymphoma; outcome prediction; radiomics; segmentation methods.
© 2022 by the Society of Nuclear Medicine and Molecular Imaging.
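Illustrative sketch (not part of the published abstract): the threshold-based and majority vote segmentations named above can be approximated in a few lines of NumPy. This is a minimal sketch under stated assumptions, not the authors' implementation. The function names, the toy SUV array, the use of the whole array as the candidate region, and the 0.064 mL voxel volume are all hypothetical. A50P is omitted because its contrast correction is not specified here, so the vote below runs over only the three masks built in the example rather than over the full set of methods used in the study.

```python
import numpy as np


def fixed_threshold_mask(suv, threshold):
    """Boolean mask of voxels with SUV at or above a fixed threshold (e.g. 4.0 or 2.5)."""
    return suv >= threshold


def relative_max_mask(suv, region, fraction=0.41):
    """Voxels inside `region` at or above `fraction` of that region's SUVmax.

    `region` is a hypothetical boolean mask marking a candidate lesion.
    """
    suv_max = suv[region].max()
    return region & (suv >= fraction * suv_max)


def majority_vote_mask(masks, min_votes):
    """Voxels selected by at least `min_votes` of the supplied masks (assumed voting rule)."""
    votes = np.sum(np.stack(masks), axis=0)
    return votes >= min_votes


def metabolic_tumor_volume_ml(mask, voxel_volume_ml):
    """MTV as the number of selected voxels times the volume of one voxel."""
    return float(mask.sum()) * voxel_volume_ml


# Toy SUV image (1 x 2 x 4 voxels); the values are invented for illustration.
suv = np.array([[[0.5, 1.0, 2.6, 3.0],
                 [4.2, 6.0, 10.0, 1.5]]])
region = np.ones_like(suv, dtype=bool)  # assumption: whole array is one candidate region

mask_suv40 = fixed_threshold_mask(suv, 4.0)
mask_suv25 = fixed_threshold_mask(suv, 2.5)
mask_41max = relative_max_mask(suv, region, fraction=0.41)

# Vote over the three masks above only (A50P is not implemented in this sketch).
mask_vote2 = majority_vote_mask([mask_suv40, mask_suv25, mask_41max], min_votes=2)

voxel_volume_ml = 0.064  # assumption: 4 mm isotropic voxels
for name, mask in [("SUV4.0", mask_suv40), ("SUV2.5", mask_suv25),
                   ("41max", mask_41max), ("vote>=2", mask_vote2)]:
    print(name, int(mask.sum()), "voxels,",
          round(metabolic_tumor_volume_ml(mask, voxel_volume_ml), 3), "mL")
```

On this toy image the lower fixed threshold selects more voxels than the higher one, which mirrors the direction of the MTV differences reported above; the numbers themselves carry no clinical meaning.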