Advertisement for orthosearch.org.uk
Results 1 - 4 of 4
Results per page:
Bone & Joint Open
Vol. 2, Issue 10 | Pages 879 - 885
20 Oct 2021
Oliveira e Carmo L van den Merkhof A Olczak J Gordon M Jutte PC Jaarsma RL IJpma FFA Doornberg JN Prijs J

Aims. The number of convolutional neural networks (CNN) available for fracture detection and classification is rapidly increasing. External validation of a CNN on a temporally separate (separated by time) or geographically separate (separated by location) dataset is crucial to assess generalizability of the CNN before application to clinical practice in other institutions. We aimed to answer the following questions: are current CNNs for fracture recognition externally valid?; which methods are applied for external validation (EV)?; and, what are reported performances of the EV sets compared to the internal validation (IV) sets of these CNNs?. Methods. The PubMed and Embase databases were systematically searched from January 2010 to October 2020 according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement. The type of EV, characteristics of the external dataset, and diagnostic performance characteristics on the IV and EV datasets were collected and compared. Quality assessment was conducted using a seven-item checklist based on a modified Methodologic Index for NOn-Randomized Studies instrument (MINORS). Results. Out of 1,349 studies, 36 reported development of a CNN for fracture detection and/or classification. Of these, only four (11%) reported a form of EV. One study used temporal EV, one conducted both temporal and geographical EV, and two used geographical EV. When comparing the CNN’s performance on the IV set versus the EV set, the following were found: AUCs of 0.967 (IV) versus 0.975 (EV), 0.976 (IV) versus 0.985 to 0.992 (EV), 0.93 to 0.96 (IV) versus 0.80 to 0.89 (EV), and F1-scores of 0.856 to 0.863 (IV) versus 0.757 to 0.840 (EV). Conclusion. The number of externally validated CNNs in orthopaedic trauma for fracture recognition is still scarce. This greatly limits the potential for transfer of these CNNs from the developing institute to another hospital to achieve similar diagnostic performance. We recommend the use of geographical EV and statements such as the Consolidated Standards of Reporting Trials–Artificial Intelligence (CONSORT-AI), the Standard Protocol Items: Recommendations for Interventional Trials–Artificial Intelligence (SPIRIT-AI) and the Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis–Machine Learning (TRIPOD-ML) to critically appraise performance of CNNs and improve methodological rigor, quality of future models, and facilitate eventual implementation in clinical practice. Cite this article: Bone Jt Open 2021;2(10):879–885


Bone & Joint Research
Vol. 13, Issue 10 | Pages 588 - 595
17 Oct 2024
Breu R Avelar C Bertalan Z Grillari J Redl H Ljuhar R Quadlbauer S Hausner T

Aims. The aim of this study was to create artificial intelligence (AI) software with the purpose of providing a second opinion to physicians to support distal radius fracture (DRF) detection, and to compare the accuracy of fracture detection of physicians with and without software support. Methods. The dataset consisted of 26,121 anonymized anterior-posterior (AP) and lateral standard view radiographs of the wrist, with and without DRF. The convolutional neural network (CNN) model was trained to detect the presence of a DRF by comparing the radiographs containing a fracture to the inconspicuous ones. A total of 11 physicians (six surgeons in training and five hand surgeons) assessed 200 pairs of randomly selected digital radiographs of the wrist (AP and lateral) for the presence of a DRF. The same images were first evaluated without, and then with, the support of the CNN model, and the diagnostic accuracy of the two methods was compared. Results. At the time of the study, the CNN model showed an area under the receiver operating curve of 0.97. AI assistance improved the physician’s sensitivity (correct fracture detection) from 80% to 87%, and the specificity (correct fracture exclusion) from 91% to 95%. The overall error rate (combined false positive and false negative) was reduced from 14% without AI to 9% with AI. Conclusion. The use of a CNN model as a second opinion can improve the diagnostic accuracy of DRF detection in the study setting. Cite this article: Bone Joint Res 2024;13(10):588–595


Bone & Joint Open
Vol. 5, Issue 6 | Pages 524 - 531
24 Jun 2024
Woldeyesus TA Gjertsen J Dalen I Meling T Behzadi M Harboe K Djuv A

Aims. To investigate if preoperative CT improves detection of unstable trochanteric hip fractures. Methods. A single-centre prospective study was conducted. Patients aged 65 years or older with trochanteric hip fractures admitted to Stavanger University Hospital (Stavanger, Norway) were consecutively included from September 2020 to January 2022. Radiographs and CT images of the fractures were obtained, and surgeons made individual assessments of the fractures based on these. The assessment was conducted according to a systematic protocol including three classification systems (AO/Orthopaedic Trauma Association (OTA), Evans Jensen (EVJ), and Nakano) and questions addressing specific fracture patterns. An expert group provided a gold-standard assessment based on the CT images. Sensitivities and specificities of surgeons’ assessments were estimated and compared in regression models with correlations for the same patients. Intra- and inter-rater reliability were presented as Cohen’s kappa and Gwet’s agreement coefficient (AC1). Results. We included 120 fractures in 119 patients. Compared to radiographs, CT increased the sensitivity of detecting unstable trochanteric fractures from 63% to 70% (p = 0.028) and from 70% to 76% (p = 0.004) using AO/OTA and EVJ, respectively. Compared to radiographs alone, CT increased the sensitivity of detecting a large posterolateral trochanter major fragment or a comminuted trochanter major fragment from 63% to 76% (p = 0.002) and from 38% to 55% (p < 0.001), respectively. CT improved intra-rater reliability for stability assessment using EVJ (AC1 0.68 to 0.78; p = 0.049) and for detecting a large posterolateral trochanter major fragment (AC1 0.42 to 0.57; p = 0.031). Conclusion. A preoperative CT of trochanteric fractures increased detection of unstable fractures using the AO/OTA and EVJ classification systems. Compared to radiographs, CT improved intra-rater reliability when assessing fracture stability and detecting large posterolateral trochanter major fragments. Cite this article: Bone Jt Open 2024;5(6):524–531


Bone & Joint Research
Vol. 12, Issue 7 | Pages 447 - 454
10 Jul 2023
Lisacek-Kiosoglous AB Powling AS Fontalis A Gabr A Mazomenos E Haddad FS

The use of artificial intelligence (AI) is rapidly growing across many domains, of which the medical field is no exception. AI is an umbrella term defining the practical application of algorithms to generate useful output, without the need of human cognition. Owing to the expanding volume of patient information collected, known as ‘big data’, AI is showing promise as a useful tool in healthcare research and across all aspects of patient care pathways. Practical applications in orthopaedic surgery include: diagnostics, such as fracture recognition and tumour detection; predictive models of clinical and patient-reported outcome measures, such as calculating mortality rates and length of hospital stay; and real-time rehabilitation monitoring and surgical training. However, clinicians should remain cognizant of AI’s limitations, as the development of robust reporting and validation frameworks is of paramount importance to prevent avoidable errors and biases. The aim of this review article is to provide a comprehensive understanding of AI and its subfields, as well as to delineate its existing clinical applications in trauma and orthopaedic surgery. Furthermore, this narrative review expands upon the limitations of AI and future direction.

Cite this article: Bone Joint Res 2023;12(7):447–454.