Detection, classification, and characterization of proximal humerus fractures on plain radiographs: do convolutional neural networks still outperform humans when the task becomes increasingly complex?

Reinier W. A. Spek; William J. Smith; Marat Sverdlov; Sebastiaan Broos; Yang Zhao; Zhibin Liao; Johan W. Verjans; Jasper Prijs; Minh-Son To; Henrik Åberg; Wael Chiri; Frank F. A. IJpma; Bhavin Jadav; John White; Gregory I. Bain; Paul C. Jutte; Michel P. J. van den Bekerom; Ruurd L. Jaarsma; Job N. Doornberg

doi:10.1302/0301-620X.106B11.BJJ-2024-0264.R1

the Machine Learning Consortium

Spek RWA, Smith WJ, Sverdlov M, et al. Detection, classification, and characterization of proximal humerus fractures on plain radiographs. Bone Joint J. 2024;106-B(11):1348-1360. doi:10.1302/0301-620X.106B11.BJJ-2024-0264.R1

Abstract

Aims

The purpose of this study was to develop a convolutional neural network (CNN) for fracture detection, classification, and identification of greater tuberosity displacement ≥ 1 cm, neck-shaft angle (NSA) ≤ 100°, shaft translation, and articular fracture involvement, on plain radiographs.

Methods

The CNN was trained and tested on radiographs sourced from 11 hospitals in Australia and externally validated on radiographs from the Netherlands. Each radiograph was paired with corresponding CT scans, which served as the reference standard based on dual independent evaluation by trained researchers and attending orthopaedic surgeons. Presence of a fracture, classification (non- to minimally displaced, two-part, multipart, and glenohumeral dislocation), and four characteristics were determined on 2D and 3D CT scans and subsequently allocated to each series of radiographs. Fracture characteristics included greater tuberosity displacement ≥ 1 cm, NSA ≤ 100°, shaft translation (0% to < 75%, 75% to 95%, > 95%), and the extent of articular involvement (0% to < 15%, 15% to 35%, or > 35%).
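The category bands for shaft translation and articular involvement listed above can be written down as a small lookup. The following is a minimal sketch only: the function names and the 0/1/2 band encoding are hypothetical and chosen for illustration, and the abstract does not describe how the authors encoded these labels.

```python
def band(value, lower_cut, upper_cut):
    """Map a percentage to band 0 (below lower_cut), 1 (lower_cut to
    upper_cut, inclusive), or 2 (above upper_cut).

    The 0/1/2 encoding is an assumption made for this sketch.
    """
    if value < 0:
        raise ValueError("percentage must not be negative")
    if value < lower_cut:
        return 0
    if value <= upper_cut:
        return 1
    return 2


def shaft_translation_band(pct):
    # Bands from the abstract: 0% to < 75%, 75% to 95%, > 95%
    return band(pct, 75, 95)


def articular_involvement_band(pct):
    # Bands from the abstract: 0% to < 15%, 15% to 35%, > 35%
    return band(pct, 15, 35)
```

Under this encoding, a shaft translation of exactly 75% falls in the middle band, while anything above 95% falls in the highest band.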

Results

For detection and classification, the algorithm was trained on 1,709 radiographs (n = 803), tested on 567 radiographs (n = 244), and subsequently externally validated on 535 radiographs (n = 227). For characterization, healthy shoulders and glenohumeral dislocation were excluded. The overall accuracy for fracture detection was 94% (area under the receiver operating characteristic curve (AUC) = 0.98) and for classification 78% (AUC 0.68 to 0.93). Accuracy to detect greater tuberosity fracture displacement ≥ 1 cm was 35.0% (AUC 0.57). The CNN did not recognize NSAs ≤ 100° (AUC 0.42), nor fractures with ≥ 75% shaft translation (AUC 0.51 to 0.53), or with ≥ 15% articular involvement (AUC 0.48 to 0.49). For all objectives, the model’s performance on the external dataset showed similar accuracy levels.
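To make the two reported metrics concrete, the sketch below computes accuracy and AUC from their standard definitions in plain Python. It is illustrative only: the function names are hypothetical, the inputs in the usage lines are toy values rather than study data, and this is not the authors' evaluation code.

```python
def accuracy(y_true, y_pred):
    """Fraction of cases where the predicted label equals the reference label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case receives a
    higher score than a randomly chosen negative case (ties count as half).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy values, not study data
labels = [1, 1, 0, 0]
perfect = auc(labels, [0.9, 0.8, 0.3, 0.1])        # every positive outranks every negative
uninformative = auc(labels, [0.5, 0.5, 0.5, 0.5])  # scores carry no ranking information
```

An AUC of 1.0 means every positive case is ranked above every negative case, and 0.5 means the ranking is no better than chance. The values of 0.42 to 0.53 reported for NSA, shaft translation, and articular involvement sit at that chance level.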

Conclusion

CNNs proficiently rule out proximal humerus fractures on plain radiographs. Despite rigorous training methodology based on CT imaging with multi-rater consensus to serve as the reference standard, artificial intelligence-driven classification is insufficient for clinical implementation. The CNN exhibited poor diagnostic ability to detect greater tuberosity displacement ≥ 1 cm and failed to identify NSAs ≤ 100°, shaft translations, or articular fractures.

Cite this article: Bone Joint J 2024;106-B(11):1348–1360.

Correspondence should be sent to Reinier Willem Alfred Spek. E-mail: reinierspek@gmail.com

W. J. Smith and M. Sverdlov are joint senior authors.

Information

Journal: The Bone & Joint Journal

Volume: 106-B, No. 11, pages 1348-1360

Section: Trauma

Published: 01 November 2024

DOI: 10.1302/0301-620X.106B11.BJJ-2024-0264.R1

Authors

Reinier W. A. Spek

PhD Candidate, Orthopaedic Resident

Department of Orthopaedic Surgery, Flinders Medical Centre, and Flinders University, Adelaide, Australia

Department of Orthopaedic Surgery, University Medical Center Groningen, and University of Groningen, Groningen, Netherlands

Department of Orthopaedic Surgery, OLVG, Amsterdam, Netherlands
