Aim. This study aimed to externally validate promising preoperative PJI prediction models in a recent, multinational European cohort. Method. Three preoperative PJI prediction models (by Tan et al., Del Toro et al., and Bülow et al.) which previously demonstrated high levels of accuracy were selected for validation. A multicenter retrospective observational analysis was performed of patients undergoing total hip arthroplasty (THA) and total knee arthroplasty (TKA) between January 2020 and December 2021 and treated at centers in the Netherlands, Portugal, and Spain. Patient characteristics were compared between our cohort and those used to develop the prediction
Introduction. With advances in artificial intelligence, the use of computer-aided detection and diagnosis in clinical imaging is gaining traction. Typically, very large datasets are required to train machine-learning models, potentially limiting use of this technology when only small datasets are available. This study investigated whether pretraining of fracture detection models on large, existing datasets could improve the performance of the model when locating and classifying wrist fractures in a small X-ray image dataset. This concept is termed “transfer learning”. Method. Firstly, three detection models, namely, the faster region-based convolutional neural network (faster R-CNN), you only look once version eight (YOLOv8), and RetinaNet, were pretrained using the large, freely available dataset, common objects in context (COCO) (330000 images). Secondly, these models were pretrained using an open-source wrist X-ray dataset called “Graz Paediatric Wrist Digital X-rays” (GRAZPEDWRI-DX) on a (1) fracture detection dataset (20327 images) and (2) fracture location and classification dataset (14390 images). An orthopaedic surgeon classified the small available dataset of 776 distal radius X-rays (Arbeitsgemeinschaft für Osteosynthesefragen Foundation / Orthopaedic Trauma Association; AO/OTA), on which the models were tested. Result. Detection models without pretraining on the large datasets were the least precise when tested on the small distal radius dataset. The model with the best accuracy to detect and classify wrist fractures was the YOLOv8 model pretrained on the GRAZPEDWRI-DX fracture detection dataset (mean average precision at an intersection-over-union threshold of 50% = 59.7%). This model showed up to 33.6% improved detection precision compared to the same models with no pretraining. Conclusion. Optimisation of machine-learning models can be challenging when only relatively small datasets are available. The findings of this study support the potential of transfer learning from large datasets to improve
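The detection metric quoted in this abstract counts a predicted fracture box as correct only when it overlaps the reference box with an intersection over union (IoU) of at least 0.5. The following is a minimal sketch of that overlap test, not the study's evaluation code; the function names and the (x_min, y_min, x_max, y_max) box layout are assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    # Corners of the overlapping rectangle (if any).
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    # Width and height are clamped at zero so disjoint boxes give no overlap.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, truth_box, threshold=0.5):
    """A prediction counts as a hit when its IoU with the reference box meets the threshold."""
    return iou(pred_box, truth_box) >= threshold
```

Mean average precision then averages, over classes, the area under the precision-recall curve obtained by ranking predictions by confidence and labelling each one with this test.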
Introduction. The objective assessment of shoulder function is important for personalized diagnosis, therapies and evidence-based practice but has been limited by the need for specialized equipment and dedicated movement laboratories. Advances in AI-driven computer vision (CV) using consumer RGB (red-green-blue) cameras and open-source CV models offer the potential for routine clinical use. However, key concepts, evidence, and research gaps have not yet been synthesized to drive clinical translation. This scoping review aims to map related literature. Method. Following the JBI Manual for Evidence Synthesis, a scoping review was conducted on PubMed and Scholar using search terms including “shoulder,” “pose estimation,” “camera”, and others. From 146 initial results, 27 papers focusing on clinical applicability and using consumer cameras were included. Analysis employed a Grounded Theory approach with iterative refinement. Result. Studies primarily used Microsoft Kinect (infrared-based depth sensing, RGB camera; discontinued) or monocular consumer cameras with open-source CV models, sometimes supplemented by LiDAR (laser-based depth sensing), wearables or markers. Technical validation studies against gold standards were scarce and too inconsistent for comparison. Movements with a larger range of motion (RoM) were accurately recorded, but smaller movements, rotations and scapula tracking remained challenging. For instance, one larger validation study comparing shoulder angles during arm raises to a marker-based gold standard reported Pearson's r = 0.98 and a standard error of 2.4°. OpenPose and MediaPipe were the most used CV models. Recent efforts try to improve
Introduction. The ability to walk over various surfaces such as cobblestones, slopes or stairs is a patient-centric and clinically meaningful mobility outcome. Current wearable sensors only measure step counts or walking speed, without capturing this context, which is relevant for assessing gait function. This study aims to improve deep learning (DL) models to classify walking surfaces by altering and comparing model features and sensor configurations. Method. Using a public dataset, signals from 6 IMUs (Movella DOT) worn on various body locations (trunk, wrist, right/left thigh, right/left shank) of 30 subjects walking on 9 surfaces were analyzed (flat ground, ramps (up/down), stairs (up/down), cobblestones (irregular), grass (soft), banked (left/right)). Two variations of a CNN Bi-directional LSTM model, with different Batch Normalization layer placement (beginning vs end) as well as data reduction to individual sensors (versus combined) were explored and
Aims. The purpose of this study was to develop a convolutional neural network (CNN) for fracture detection, classification, and identification of greater tuberosity displacement ≥ 1 cm, neck-shaft angle (NSA) ≤ 100°, shaft translation, and articular fracture involvement, on plain radiographs. Methods. The CNN was trained and tested on radiographs sourced from 11 hospitals in Australia and externally validated on radiographs from the Netherlands. Each radiograph was paired with corresponding CT scans to serve as the reference standard based on dual independent evaluation by trained researchers and attending orthopaedic surgeons. Presence of a fracture, classification (non- to minimally displaced; two-part, multipart, and glenohumeral dislocation), and four characteristics were determined on 2D and 3D CT scans and subsequently allocated to each series of radiographs. Fracture characteristics included greater tuberosity displacement ≥ 1 cm, NSA ≤ 100°, shaft translation (0% to < 75%, 75% to 95%, > 95%), and the extent of articular involvement (0% to < 15%, 15% to 35%, or > 35%). Results. For detection and classification, the algorithm was trained on 1,709 radiographs (n = 803), tested on 567 radiographs (n = 244), and subsequently externally validated on 535 radiographs (n = 227). For characterization, healthy shoulders and glenohumeral dislocation were excluded. The overall accuracy for fracture detection was 94% (area under the receiver operating characteristic curve (AUC) = 0.98) and for classification 78% (AUC 0.68 to 0.93). Accuracy to detect greater tuberosity fracture displacement ≥ 1 cm was 35.0% (AUC 0.57). The CNN did not recognize NSAs ≤ 100° (AUC 0.42), nor fractures with ≥ 75% shaft translation (AUC 0.51 to 0.53), or with ≥ 15% articular involvement (AUC 0.48 to 0.49). For all objectives, the
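The AUC values reported here can be read as the probability that the model scores a randomly chosen positive case higher than a randomly chosen negative one, which is why values near 0.5 (as for shaft translation and articular involvement) indicate performance close to chance. A minimal sketch of that pairwise definition follows; the function name and the toy inputs in the usage note are illustrative assumptions, not data from the study.

```python
def roc_auc(labels, scores):
    """AUC as the share of positive/negative pairs ranked correctly (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie: half credit
    return wins / (len(pos) * len(neg))
```

For example, a model that assigns the same score to every radiograph gives roc_auc([0, 1, 0, 1], [0.5, 0.5, 0.5, 0.5]) == 0.5, however many cases it happens to label correctly at a given cut-off.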
Aims. Machine learning (ML), a branch of artificial intelligence that uses algorithms to learn from data and make predictions, offers a pathway towards more personalized and tailored surgical treatments. This approach is particularly relevant to prevalent joint diseases such as osteoarthritis (OA). In contrast to end-stage disease, where joint arthroplasty provides excellent results, early stages of OA currently lack effective therapies to halt or reverse progression. Accurate prediction of OA progression is crucial if timely interventions are to be developed, to enhance patient care and optimize the design of clinical trials. Methods. A systematic review was conducted in accordance with PRISMA guidelines. We searched MEDLINE and Embase on 5 May 2024 for studies utilizing ML to predict OA progression. Titles and abstracts were independently screened, followed by full-text reviews for studies that met the eligibility criteria. Key information was extracted and synthesized for analysis, including types of data (such as clinical, radiological, or biochemical), definitions of OA progression, ML algorithms, validation methods, and outcome measures.
Aims. The aim of this study was to create artificial intelligence (AI) software with the purpose of providing a second opinion to physicians to support distal radius fracture (DRF) detection, and to compare the accuracy of fracture detection of physicians with and without software support. Methods. The dataset consisted of 26,121 anonymized anterior-posterior (AP) and lateral standard view radiographs of the wrist, with and without DRF. The convolutional neural network (CNN) model was trained to detect the presence of a DRF by comparing the radiographs containing a fracture to the inconspicuous ones. A total of 11 physicians (six surgeons in training and five hand surgeons) assessed 200 pairs of randomly selected digital radiographs of the wrist (AP and lateral) for the presence of a DRF. The same images were first evaluated without, and then with, the support of the CNN model, and the diagnostic accuracy of the two methods was compared.
Aims. The risk factors for recurrent instability (RI) following a primary traumatic anterior shoulder dislocation (PTASD) remain unclear. In this study, we aimed to determine the rate of RI in a large cohort of patients managed nonoperatively after PTASD and to develop a clinical prediction model. Methods. A total of 1,293 patients with PTASD managed nonoperatively were identified from a trauma database (mean age 23.3 years (15 to 35); 14.3% female). We assessed the prevalence of RI, and used multivariate regression modelling to evaluate which demographic- and injury-related factors were independently predictive for its occurrence.
Despite the vast quantities of published artificial intelligence (AI) algorithms that target trauma and orthopaedic applications, very few progress to inform clinical practice. One key reason for this is the lack of a clear pathway from development to deployment. In order to assist with this process, we have developed the Clinical Practice Integration of Artificial Intelligence (CPI-AI) framework – a five-stage approach to the clinical practice adoption of AI in the setting of trauma and orthopaedics, based on the IDEAL principles.
Aims. Precise implant positioning, tailored to individual spinopelvic biomechanics and phenotype, is paramount for stability in total hip arthroplasty (THA). Despite a few studies on instability prediction, there is a notable gap in research utilizing artificial intelligence (AI). The objective of our pilot study was to evaluate the feasibility of developing an AI algorithm tailored to individual spinopelvic mechanics and patient phenotype for predicting impingement. Methods. This international, multicentre prospective cohort study across two centres encompassed 157 adults undergoing primary robotic arm-assisted THA. Impingement during specific flexion and extension stances was identified using the virtual range of motion (ROM) tool of the robotic software. The primary AI model, the Light Gradient-Boosting Machine (LGBM), used tabular data to predict impingement presence, direction (flexion or extension), and type. A secondary model integrating tabular data with plain anteroposterior pelvis radiographs was evaluated to assess for any potential enhancement in prediction accuracy.
Aims. To examine whether natural language processing (NLP) using a clinically based large language model (LLM) could be used to predict patient selection for total hip or total knee arthroplasty (THA/TKA) from routinely available free-text radiology reports. Methods. Data pre-processing and analyses were conducted according to the Artificial intelligence to Revolutionize the patient Care pathway in Hip and knEe aRthroplastY (ARCHERY) project protocol. This included use of de-identified Scottish regional clinical data of patients referred for consideration of THA/TKA, held in a secure data environment designed for artificial intelligence (AI) inference. Only preoperative radiology reports were included. NLP algorithms were based on the freely available GatorTron model, an LLM trained on over 82 billion words of de-identified clinical text. Two inference tasks were performed: assessment after model fine-tuning (50 epochs and three cycles of k-fold cross-validation), and external validation. Results. For THA, there were 5,558 patient radiology reports included, of which 4,137 were used for model training and testing, and 1,421 for external validation. Following training,
To examine whether Natural Language Processing (NLP) using a state-of-the-art clinically based Large Language Model (LLM) could predict patient selection for Total Hip Arthroplasty (THA), across a range of routinely available clinical text sources. Data pre-processing and analyses were conducted according to the Ai to Revolutionise the patient Care pathway in Hip and Knee arthroplasty (ARCHERY) project protocol (https://www.researchprotocols.org/2022/5/e37092/). Three types of de-identified Scottish regional clinical free-text data were assessed: referral letters, radiology reports and clinic letters. NLP algorithms were based on the GatorTron model, a Bidirectional Encoder Representations from Transformers (BERT) based LLM trained on 82 billion words of de-identified clinical text. Three specific inference tasks were performed: assessment of the base GatorTron model, assessment after model fine-tuning, and external validation. There were 3911, 1621 and 1503 patient text documents included from the sources of referral letters, radiology reports and clinic letters respectively. All text sources displayed significant class imbalance, with only 15.8%, 24.9%, and 5.9% of patients linked to the respective text source documentation having undergone surgery. Untrained
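The class imbalance reported in this abstract matters because a model that never predicts surgery already looks accurate. The sketch below works through that arithmetic using only the surgery rates quoted above; the function name and the choice of balanced accuracy as the contrasting metric are assumptions made for illustration and are not taken from the study.

```python
def majority_class_baseline(positive_rate):
    """Metrics for a trivial classifier that always predicts the more common class."""
    predicts_positive = positive_rate > 0.5
    accuracy = positive_rate if predicts_positive else 1.0 - positive_rate
    sensitivity = 1.0 if predicts_positive else 0.0
    specificity = 0.0 if predicts_positive else 1.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }


# Proportion of patients who underwent surgery, as reported for each text source.
reported_rates = {
    "referral letters": 0.158,
    "radiology reports": 0.249,
    "clinic letters": 0.059,
}

for source, rate in reported_rates.items():
    metrics = majority_class_baseline(rate)
    print(
        f"{source}: accuracy {metrics['accuracy']:.1%}, "
        f"balanced accuracy {metrics['balanced_accuracy']:.1%}"
    )
```

Always predicting "no surgery" yields accuracies of 84.2%, 75.1% and 94.1% for the three sources while identifying no surgical patients, so raw accuracy alone is a weak yardstick for models trained on data this skewed.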
Femoroacetabular Impingement (FAI) syndrome is characterised by abnormal hip contact causing symptoms and osteoarthritis, and its outcomes are measured using the International Hip Outcome Tool (iHOT). This study uses machine learning to predict patient outcomes post-treatment for FAI, focusing on achieving a minimal clinically important difference (MCID) at 52 weeks. A retrospective analysis of 6133 patients from the NAHR who underwent hip arthroscopic treatment for FAI between November 2013 and March 2022 was conducted. MCID was defined as half a standard deviation (13.61) from the mean change in iHOT score at 12 months. The scikit-learn maximum absolute scaler and logistic regression were applied to predict achieving MCID, using baseline and 6-month follow-up data. The
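As a rough illustration of the two preprocessing ideas named in this abstract, the sketch below computes a distribution-based MCID threshold as half a standard deviation and applies maximum absolute scaling to a feature. It uses only the Python standard library rather than scikit-learn, the change scores in the usage note are invented, and reading the abstract's definition as "half the standard deviation of the change scores" is an assumption, since the wording leaves the exact reference distribution open.

```python
from statistics import stdev


def half_sd_threshold(change_scores):
    """Distribution-based MCID threshold: half the sample standard deviation."""
    return 0.5 * stdev(change_scores)


def achieved_mcid(change_scores, threshold):
    """Flag each patient whose improvement meets or exceeds the threshold."""
    return [change >= threshold for change in change_scores]


def max_abs_scale(values):
    """Scale a feature into [-1, 1] by dividing by its largest absolute value."""
    peak = max(abs(v) for v in values)
    return [v / peak if peak else 0.0 for v in values]
```

With hypothetical change scores [0, 10, 20, 30, 40], half_sd_threshold gives roughly 7.9, so every patient except the first would count as having achieved the MCID. The binary flags would then serve as the target for a classifier such as logistic regression, with max_abs_scale standing in for the scaling step.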
Aims. This study aimed to compare the performance of survival prediction models for bone metastases of the extremities (BM-E) with pathological fractures in an Asian cohort, and investigate patient characteristics associated with survival. Methods. This retrospective cohort study included 469 patients, who underwent surgery for BM-E between January 2009 and March 2022 at a tertiary hospital in South Korea. Postoperative survival was calculated using the PATHFx3.0, SPRING13, OPTIModel, SORG, and IOR
Aims. Machine-learning (ML) prediction models in orthopaedic trauma hold great promise in assisting clinicians in various tasks, such as personalized risk stratification. However, an overview of current applications and a critical appraisal against peer-reviewed guidelines are lacking. The objectives of this study are to 1) provide an overview of current ML prediction models in orthopaedic trauma; 2) evaluate the completeness of reporting following the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) statement; and 3) assess the risk of bias following the Prediction model Risk Of Bias Assessment Tool (PROBAST). Methods. A systematic search screening 3,252 studies identified 45 ML-based prediction models in orthopaedic trauma up to January 2023. Transparent reporting was assessed with the TRIPOD statement and risk of bias with the PROBAST tool.
Precision health aims to develop personalised and proactive strategies for predicting, preventing, and treating complex diseases such as osteoarthritis (OA). Due to OA heterogeneity, which makes developing effective treatments challenging, identifying patients at risk for accelerated disease progression is essential for efficient clinical trial design and new treatment target discovery and development. The aim was to create a reliable and interpretable precision health tool that predicts rapid knee OA progression over a 2-year period from baseline patient characteristics using an advanced automated machine learning (autoML) framework, “Autoprognosis 2.0”. All available 2-year follow-up periods of 600 patients from the FNIH OA Biomarker Consortium were analysed using “Autoprognosis 2.0” in two separate approaches, with distinct definitions of clinical outcomes: multi-class predictions (categorising disease progression into pain and/or radiographic progression) and binary predictions. Models were developed using a training set of 1352 instances and all available variables (including clinical, X-ray, MRI, and biochemical features), and validated through both stratified 10-fold cross-validation and hold-out validation on a testing set of 339 instances.
Abstract. Introduction. Precision health aims to develop personalised and proactive strategies for predicting, preventing, and treating complex diseases such as osteoarthritis (OA), a degenerative joint disease affecting over 300 million people worldwide. Due to OA heterogeneity, which makes developing effective treatments challenging, identifying patients at risk for accelerated disease progression is essential for efficient clinical trial design and new treatment target discovery and development. Objectives. This study aims to create a trustworthy and interpretable precision health tool that predicts rapid knee OA progression based on baseline patient characteristics using an advanced automated machine learning (autoML) framework, “Autoprognosis 2.0”. Methods. All available 2-year follow-up periods of 600 patients from the FNIH OA Biomarker Consortium were analysed using “Autoprognosis 2.0” in two separate approaches, with distinct definitions of clinical outcomes: multi-class predictions (categorising patients into non-progressors, pain-only progressors, radiographic-only progressors, and both pain and radiographic progressors) and binary predictions (categorising patients into non-progressors and progressors). Models were developed using a training set of 1352 instances and all available variables (including clinical, X-ray, MRI, and biochemical features), and validated through both stratified 10-fold cross-validation and hold-out validation on a testing set of 339 instances.
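Stratified cross-validation, as mentioned in the Methods above, keeps the proportion of each outcome class roughly equal across folds, which matters when some progression categories are rare. The following is a minimal standard-library sketch of one way to assign instances to stratified folds; it is not the procedure implemented by “Autoprognosis 2.0”, and the function name and seed handling are assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=10, seed=0):
    """Assign each instance a fold index so every fold has a similar class mix."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    fold_of = [None] * len(labels)
    offset = 0  # carries over between classes so leftovers do not pile up in fold 0
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal the shuffled members of this class round-robin across the folds.
        for position, idx in enumerate(indices):
            fold_of[idx] = (position + offset) % k
        offset += len(indices)
    return fold_of
```

Each fold in turn is held out for validation while the model is fitted on the other k - 1 folds; a separate hold-out test set, like the 339 instances described above, is left untouched until the end.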
The October 2023 Children’s orthopaedics Roundup360 looks at: Outcomes of open reduction in children with developmental hip dislocation: a multicentre experience over a decade; A torn discoid lateral meniscus impacts lower-limb alignment regardless of age; Who benefits from allowing the physis to grow in slipped capital femoral epiphysis?; Consensus guidelines on the management of musculoskeletal infection affecting children in the UK; Diagnosis of developmental dysplasia of the hip by ultrasound imaging using deep learning; Outcomes at a mean of 13 years after proximal humeral fracture during adolescence; Clubfeet treated according to Ponseti at four years; Controlled ankle movement boot provides improved outcomes with lower complications than short leg walking cast.
Aims. The aim of this study was to identify factors associated with five-year cancer-related mortality in patients with limb and trunk soft-tissue sarcoma (STS) and develop and validate machine learning algorithms in order to predict five-year cancer-related mortality in these patients. Methods. Demographic, clinicopathological, and treatment variables of limb and trunk STS patients in the Surveillance, Epidemiology, and End Results Program (SEER) database from 2004 to 2017 were analyzed. Multivariable logistic regression was used to determine factors significantly associated with five-year cancer-related mortality. Various machine learning models were developed and compared using area under the curve (AUC), calibration, and decision curve analysis. The model that performed best on the SEER testing data was further assessed to determine the variables most important in its predictive capacity. This model was externally validated using our institutional dataset.
Aims. To identify variables independently associated with same-day discharge (SDD) of patients following revision total knee arthroplasty (rTKA) and to develop machine learning algorithms to predict suitable candidates for outpatient rTKA. Methods. Data were obtained from the American College of Surgeons National Quality Improvement Programme (ACS-NSQIP) database from the years 2018 to 2020. Patients with elective, unilateral rTKA procedures and a total hospital length of stay between zero and four days were included. Demographic, preoperative, and intraoperative variables were analyzed. A multivariable logistic regression (MLR) model and various machine learning techniques were compared using area under the curve (AUC), calibration, and decision curve analysis. Important and significant variables were identified from the models.