Abstract
Introduction
Three-dimensional (3D) morphological understanding of the hip joint, specifically the joint space and the surrounding anatomy, including the proximal femur and the pelvis, is crucial for a range of orthopedic diagnoses and for surgical planning. While deep learning algorithms can segment bony structures with high accuracy, delineation of the hip joint space formed by the cartilage layers is often left to subjective manual evaluation. This study compared the performance of two state-of-the-art 3D deep learning architectures (3D UNET and 3D UNETR) for automated segmentation of the proximal femur, the pelvis, and the hip joint space using single-class and multi-class label segmentation strategies.
Method
A dataset of 56 3D CT images covering the hip joint was used for the study. The two bones (proximal femur and pelvis) and the hip joint space were manually segmented to provide labels for training and evaluation. Each model was trained and evaluated with a single-class approach, in which each label (proximal femur, pelvis, and joint space) was segmented separately, and with a multi-class approach, in which all three labels were segmented simultaneously. The same hyperparameter configuration was used across all models, with the AdamW optimizer and Dice loss as the primary loss function. The Dice score, root mean squared error (RMSE), and mean absolute error (MAE) were used as evaluation metrics.
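For illustration, the three evaluation metrics can be sketched for a single label as below. This is a minimal sketch, not the study's implementation: it assumes binary 3D masks and assumes that RMSE and MAE are computed voxel-wise on those masks, which the abstract does not specify. The function names and the toy volumes are hypothetical.

```python
import numpy as np

def dice_score(pred, target):
    """Dice overlap between two binary masks: 2|A ∩ B| / (|A| + |B|)."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    denom = pred.sum() + target.sum()
    if denom == 0:
        # Both masks are empty; treated here as perfect agreement (a convention).
        return 1.0
    return 2.0 * np.logical_and(pred, target).sum() / denom

def rmse(pred, target):
    """Voxel-wise root mean squared error (assumed definition)."""
    diff = np.asarray(pred, dtype=float) - np.asarray(target, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

def mae(pred, target):
    """Voxel-wise mean absolute error (assumed definition)."""
    diff = np.asarray(pred, dtype=float) - np.asarray(target, dtype=float)
    return float(np.mean(np.abs(diff)))

# Toy 4x4x4 volumes standing in for a predicted and a manual segmentation.
pred = np.zeros((4, 4, 4), dtype=np.uint8)
target = np.zeros((4, 4, 4), dtype=np.uint8)
pred[1:3, 1:3, 1:3] = 1     # 8 foreground voxels
target[1:3, 1:3, 1:4] = 1   # 12 foreground voxels, 8 shared with pred

print(dice_score(pred, target), rmse(pred, target), mae(pred, target))
```

On binary masks the voxel-wise MAE reduces to the fraction of mismatched voxels, so it is dominated by the background when the label is small; the Dice score ignores true-negative background voxels and is therefore more sensitive to errors on thin structures.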
Results
Both models performed excellently on single-label bone segmentation (Dice > 0.95), but single-label joint space performance was considerably lower (Dice < 0.87). Multi-class segmentation performance was also lower (Dice < 0.88) for both models. Combining the comparatively large bone labels with the thin joint space label may have introduced a class imbalance problem in the multi-class models, leading to lower performance.
Conclusion
It remains unclear whether 3D UNETR provides better performance than 3D UNET, because the same hyperparameters were used for all models and were not optimized for each architecture. Further evaluation against baseline UNET and nnUNET architectures is needed.