Temporomandibular joint CBCT image segmentation via multi-view ensemble learning network

Piaolin Hu; Jupeng Li; Ruohan Ma; Kai Zhang; Yong Guo; Gang Li

doi:10.1007/s11517-024-03225-6

Med Biol Eng Comput. 2024 Oct 28. doi: 10.1007/s11517-024-03225-6. Online ahead of print.

Authors

Piaolin Hu¹, Jupeng Li², Ruohan Ma³, Kai Zhang¹, Yong Guo¹, Gang Li³

Affiliations

¹ School of Electronics and Information Engineering, Beijing Jiaotong University, Beijing, 100044, China.
² School of Electronics and Information Engineering, Beijing Jiaotong University, Beijing, 100044, China.
³ Peking University School and Hospital of Stomatology, Peking University, Beijing, 100195, China.

PMID: 39465436
DOI: 10.1007/s11517-024-03225-6

Abstract

Accurate segmentation of the temporomandibular joint (TMJ) from cone beam CT (CBCT) images holds significant clinical value for diagnosing temporomandibular joint osteoarthrosis (TMJOA) and related conditions. Convolutional neural network-based medical image segmentation methods have achieved state-of-the-art performance in various segmentation tasks. However, 3D medical image segmentation requires substantial global context and rich spatial semantic information, demanding much more GPU memory and computational resources. To address these challenges, we propose a novel network, MVEL-Net (Multi-view Ensemble Learning Network), for TMJ CBCT image segmentation. By resampling images along three dimensions, we generate multiple weak learners with different spatial semantic information. A subsequent strong learning network integrates the outputs from these weak learners to achieve more accurate segmentation results. We evaluated our network model using a clinical dataset comprising 88 subjects with TMJ CBCT images. The average Dice similarity coefficient (DSC) was 0.9817 ± 0.0049, the average surface distance was 0.0540 ± 0.0179 mm, and the 95% Hausdorff distance was 0.1743 ± 0.0550 mm. Our proposed MVEL-Net demonstrates excellent segmentation performance on TMJ from CBCT images while using less GPU memory than other 3D networks. The effectiveness of this method in capturing spatial context could be leveraged for tasks such as organ segmentation from volumetric scans. This may facilitate wider adoption of AI-based solutions for automated analysis of 3D medical images.
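
The multi-view idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not say how the resampling is done or how the weak and strong learners are built, so the sketch assumes strided subsampling along one axis per view, uses a hypothetical `toy_weak_learner` in place of a trained network, and replaces the strong learning network with plain averaging. All function names and the `factor` parameter are illustrative. The Dice similarity coefficient is computed with its standard definition, 2|A ∩ B| / (|A| + |B|).

```python
import numpy as np


def downsample_along_axis(volume, axis, factor):
    """Keep every `factor`-th slice along one axis (assumed resampling scheme)."""
    index = [slice(None)] * volume.ndim
    index[axis] = slice(None, None, factor)
    return volume[tuple(index)]


def upsample_along_axis(volume, axis, factor, size):
    """Nearest-neighbour upsampling along one axis, cropped to `size` slices."""
    repeated = np.repeat(volume, factor, axis=axis)
    index = [slice(None)] * volume.ndim
    index[axis] = slice(0, size)
    return repeated[tuple(index)]


def toy_weak_learner(volume, threshold=0.5):
    """Hypothetical stand-in for a trained weak learner.

    Maps intensities to a foreground probability with a steep sigmoid.
    """
    return 1.0 / (1.0 + np.exp(-10.0 * (volume - threshold)))


def multi_view_probabilities(volume, factor=2):
    """Run one weak learner per axis on a volume resampled along that axis.

    Returns an array of shape (3, D, H, W) in the original orientation.
    """
    views = []
    for axis in range(3):
        small = downsample_along_axis(volume, axis, factor)  # smaller input
        prob = toy_weak_learner(small)
        views.append(upsample_along_axis(prob, axis, factor, volume.shape[axis]))
    return np.stack(views, axis=0)


def fuse_by_averaging(view_probs, cutoff=0.5):
    """Placeholder for the strong learner: average the views, then threshold."""
    return view_probs.mean(axis=0) > cutoff


def dice_coefficient(pred, target):
    """Dice similarity coefficient between two binary masks."""
    pred = np.asarray(pred).astype(bool)
    target = np.asarray(target).astype(bool)
    denom = pred.sum() + target.sum()
    if denom == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * np.logical_and(pred, target).sum() / denom
```

Each view processes a volume reduced by `factor` along one axis, which is one way the memory footprint per learner could shrink relative to a full-resolution 3D network. In the method described in the abstract, the fusion step is a learned network rather than a fixed average.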

Keywords: Cone beam CT images; Convolutional neural networks; Medical image segmentation; Multi-view ensemble learning; Temporomandibular joint.

Grants and funding

81671034/Joint Fund of Coal