Search | Bone & Joint

Results 1 - 20 of 66

The Bone & Joint Journal

Vol. 102-B, Issue 6 Supple A | Pages 101 - 106

1 Jun 2020

Incremental inputs improve the automated detection of implant loosening using machine-learning algorithms

Shah RF Bini SA Martinez AM Pedoia V Vail TP

Access Required

Aims. The aim of this study was to evaluate the ability of a machine-learning algorithm to diagnose prosthetic loosening from preoperative radiographs and to investigate the inputs that might improve its performance. Methods. A group of 697 patients underwent a first-time revision of a total hip (THA) or total knee arthroplasty (TKA) at our institution between 2012 and 2018. Preoperative anteroposterior (AP) and lateral radiographs, and historical and comorbidity information were collected from their electronic records. Each patient was defined as having loose or fixed components based on the operation notes. We trained a series of convolutional neural network (CNN) models to predict a diagnosis of loosening at the time of surgery from the preoperative radiographs. We then added historical data about the patients to the best performing model to create a final model and tested it on an independent dataset. Results. The convolutional neural network we built performed well when detecting loosening from radiographs alone. The first model built de novo with only the radiological image as input had an accuracy of 70%. The final model, which was built by fine-tuning a publicly available model named DenseNet, combining the AP and lateral radiographs, and incorporating information from the patient’s history, had an accuracy, sensitivity, and specificity of 88.3%, 70.2%, and 95.6% on the independent test dataset. It performed better for cases of revision THA with an accuracy of 90.1%, than for cases of revision TKA with an accuracy of 85.8%. Conclusion. This study showed that machine learning can detect prosthetic loosening from radiographs. Its accuracy is enhanced when using highly trained public algorithms, and when adding clinical data to the algorithm. While this algorithm may not be sufficient in its present state of development as a standalone metric of loosening, it is currently a useful augment for clinical decision making. 
Cite this article: Bone Joint J 2020;102-B(6 Supple A):101–106
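The accuracy, sensitivity, and specificity quoted in the abstract above all derive from a two-by-two confusion matrix. The following is a minimal sketch of that arithmetic; the function name and the counts are hypothetical illustrations and are not taken from the study.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total       # share of all cases classified correctly
    sensitivity = tp / (tp + fn)       # share of truly loose implants flagged as loose
    specificity = tn / (tn + fp)       # share of truly fixed implants flagged as fixed
    return accuracy, sensitivity, specificity


# Hypothetical counts for illustration only (NOT the study's data).
acc, sens, spec = diagnostic_metrics(tp=30, fp=5, tn=95, fn=10)
print(f"accuracy={acc:.3f} sensitivity={sens:.3f} specificity={spec:.3f}")
```

When fixed implants outnumber loose ones, accuracy can sit well above sensitivity, which is the pattern the abstract reports.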

The Bone & Joint Journal

Vol. 103-B, Issue 12 | Pages 1754 - 1758

1 Dec 2021

Interpretation and reporting of predictive or diagnostic machine-learning research in Trauma & Orthopaedics

Farrow L Zhong M Ashcroft GP Anderson L Meek RMD

Access Required

There is increasing popularity in the use of artificial intelligence and machine-learning techniques to provide diagnostic and prognostic models for various aspects of Trauma & Orthopaedic surgery. However, correct interpretation of these models is difficult for those without specific knowledge of computing or health data science methodology. Lack of current reporting standards leads to the potential for significant heterogeneity in the design and quality of published studies. We provide an overview of machine-learning techniques for the lay individual, including key terminology and best practice reporting guidelines. Cite this article: Bone Joint J 2021;103-B(12):1754–1758

Bone & Joint Research

Vol. 13, Issue 2 | Pages 66 - 82

5 Feb 2024

Transcriptomic analyses and machine-learning methods reveal dysregulated key genes and potential pathogenesis in human osteoarthritic cartilage

Zhao D Zeng L Liang G Luo M Pan J Dou Y Lin F Huang H Yang W Liu J

Full Access

Aims. This study aimed to explore the biological and clinical importance of dysregulated key genes in osteoarthritis (OA) patients at the cartilage level to find potential biomarkers and targets for diagnosing and treating OA. Methods. Six sets of gene expression profiles were obtained from the Gene Expression Omnibus database. Differential expression analysis, weighted gene coexpression network analysis (WGCNA), and multiple machine-learning algorithms were used to screen crucial genes in osteoarthritic cartilage, and genome enrichment and functional annotation analyses were used to decipher the related categories of gene function. Single-sample gene set enrichment analysis was performed to analyze immune cell infiltration. Correlation analysis was used to explore the relationship among the hub genes and immune cells, as well as markers related to articular cartilage degradation and bone mineralization. Results. A total of 46 genes were obtained from the intersection of significantly upregulated genes in osteoarthritic cartilage and the key module genes screened by WGCNA. Functional annotation analysis revealed that these genes were closely related to pathological responses associated with OA, such as inflammation and immunity. Four key dysregulated genes (cartilage acidic protein 1 (CRTAC1), iodothyronine deiodinase 2 (DIO2), angiopoietin-related protein 2 (ANGPTL2), and MAGE family member D1 (MAGED1)) were identified after using machine-learning algorithms. These genes had high diagnostic value in both the training cohort and external validation cohort (area under the receiver operating characteristic curve > 0.8). The upregulated expression of these hub genes in osteoarthritic cartilage signified higher levels of immune infiltration as well as the expression of metalloproteinases and mineralization markers, suggesting harmful biological alterations and indicating that these hub genes play an important role in the pathogenesis of OA. A competing endogenous RNA network was constructed to reveal the underlying post-transcriptional regulatory mechanisms. Conclusion. The current study explores and validates a dysregulated key gene set in osteoarthritic cartilage that is capable of accurately diagnosing OA and characterizing the biological alterations in osteoarthritic cartilage; this may become a promising indicator in clinical decision-making. This study indicates that dysregulated key genes play an important role in the development and progression of OA, and may be potential therapeutic targets. Cite this article: Bone Joint Res 2024;13(2):66–82

Bone & Joint Open

Vol. 5, Issue 1 | Pages 9 - 19

16 Jan 2024

Systematic review of machine-learning models in orthopaedic trauma

Dijkstra H van de Kuit A de Groot TM Canta O Groot OQ Oosterhoff JH Doornberg JN

Full Access

Aims. Machine-learning (ML) prediction models in orthopaedic trauma hold great promise in assisting clinicians in various tasks, such as personalized risk stratification. However, an overview of current applications and a critical appraisal against peer-reviewed guidelines are lacking. The objectives of this study are to 1) provide an overview of current ML prediction models in orthopaedic trauma; 2) evaluate the completeness of reporting following the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) statement; and 3) assess the risk of bias following the Prediction model Risk Of Bias Assessment Tool (PROBAST). Methods. A systematic search screening 3,252 studies identified 45 ML-based prediction models in orthopaedic trauma up to January 2023. The TRIPOD statement was used to assess transparent reporting, and the PROBAST tool to assess the risk of bias. Results. A total of 40 studies reported on training and internal validation; four studies performed both development and external validation, and one study performed only external validation. The most commonly reported outcomes were mortality (33%, 15/45) and length of hospital stay (9%, 4/45), and the majority of prediction models were developed in the hip fracture population (60%, 27/45). The overall median completeness for the TRIPOD statement was 62% (interquartile range 30 to 81%). The overall risk of bias in the PROBAST tool was low in 24% (11/45), high in 69% (31/45), and unclear in 7% (3/45) of the studies. High risk of bias was mainly due to analysis domain concerns including small datasets with low number of outcomes, complete-case analysis in case of missing data, and no reporting of performance measures. Conclusion. The results of this study showed that despite a myriad of potential clinically useful applications, a substantial part of ML studies in orthopaedic trauma lack transparent reporting, and are at high risk of bias. These problems must be resolved by following established guidelines to instil confidence in ML models among patients and clinicians. Otherwise, there will remain a sizeable gap between the development of ML prediction models and their clinical application in our day-to-day orthopaedic trauma practice. Cite this article: Bone Jt Open 2024;5(1):9–19

Bone & Joint Open

Vol. 4, Issue 3 | Pages 168 - 181

14 Mar 2023

Development of machine-learning algorithms for 90-day and one-year mortality prediction in the elderly with femoral neck fractures based on the HEALTH and FAITH trials

Dijkstra H Oosterhoff JHF van de Kuit A IJpma FFA Schwab JH Poolman RW Sprague S Bzovsky S Bhandari M Swiontkowski M Schemitsch EH Doornberg JN Hendrickx LAM

Full Access

Aims. To develop prediction models using machine-learning (ML) algorithms for 90-day and one-year mortality prediction in femoral neck fracture (FNF) patients aged 50 years or older based on the Hip fracture Evaluation with Alternatives of Total Hip arthroplasty versus Hemiarthroplasty (HEALTH) and Fixation using Alternative Implants for the Treatment of Hip fractures (FAITH) trials. Methods. This study included 2,388 patients from the HEALTH and FAITH trials, with 90-day and one-year mortality proportions of 3.0% (71/2,388) and 6.4% (153/2,388), respectively. The mean age was 75.9 years (SD 10.8) and 65.9% of patients (1,574/2,388) were female. The algorithms included patient and injury characteristics. Six algorithms were developed, internally validated and evaluated across discrimination (c-statistic; discriminative ability between those with risk of mortality and those without), calibration (observed outcome compared to the predicted probability), and the Brier score (composite of discrimination and calibration). Results. The developed algorithms distinguished between patients at high and low risk for 90-day and one-year mortality. The penalized logistic regression algorithm had the best performance metrics for both 90-day (c-statistic 0.80, calibration slope 0.95, calibration intercept -0.06, and Brier score 0.039) and one-year (c-statistic 0.76, calibration slope 0.86, calibration intercept -0.20, and Brier score 0.074) mortality prediction in the hold-out set. Conclusion. Using high-quality data, the ML-based prediction models accurately predicted 90-day and one-year mortality in patients aged 50 years or older with a FNF. The final models must be externally validated to assess generalizability to other populations, and prospectively evaluated in the process of shared decision-making. Cite this article: Bone Jt Open 2023;4(3):168–181
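The discrimination and Brier score metrics named in the abstract above can be illustrated with a short sketch. It assumes a binary outcome and one predicted probability per patient; the function names and the toy data are hypothetical and do not come from the HEALTH or FAITH trials. Calibration slope and intercept need a fitted regression and are left out here.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and observed outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def c_statistic(y_true, y_prob):
    """Share of (event, non-event) pairs in which the event has the higher
    predicted risk; ties count as half."""
    events = [p for y, p in zip(y_true, y_prob) if y == 1]
    non_events = [p for y, p in zip(y_true, y_prob) if y == 0]
    concordant = 0.0
    for pe in events:
        for pn in non_events:
            if pe > pn:
                concordant += 1.0
            elif pe == pn:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


# Toy outcomes (1 = died) and predicted risks, for illustration only.
y_true = [0, 0, 0, 1, 0, 1]
y_prob = [0.05, 0.10, 0.30, 0.40, 0.20, 0.25]
brier = brier_score(y_true, y_prob)
c_stat = c_statistic(y_true, y_prob)
print(round(brier, 4), c_stat)
```

A lower Brier score and a c-statistic closer to 1 both indicate better predictions; a c-statistic of 0.5 corresponds to chance-level discrimination.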

Bone & Joint Open

Vol. 3, Issue 1 | Pages 93 - 97

10 Jan 2022

Potential benefits, unintended consequences, and future roles of artificial intelligence in orthopaedic surgery research

Kunze KN Orr M Krebs V Bhandari M Piuzzi NS

Full Access

Artificial intelligence and machine-learning analytics have gained extensive popularity in recent years due to their clinically relevant applications. A wide range of proof-of-concept studies have demonstrated the ability of these analyses to personalize risk prediction, detect implant specifics from imaging, and monitor and assess patient movement and recovery. Though these applications are exciting and could potentially influence practice, it is imperative to understand when these analyses are indicated and where the data are derived from, prior to investing resources and confidence into the results and conclusions. In this article, we review the current benefits and potential limitations of machine-learning for the orthopaedic surgeon with a specific emphasis on data quality.

Bone & Joint Open

Vol. 3, Issue 10 | Pages 786 - 794

12 Oct 2022

Computerized adaptive testing for the Oxford Hip, Knee, Shoulder, and Elbow scores

Harrison CJ Plummer OR Dawson J Jenkinson C Hunt A Rodrigues JN

Full Access

Aims. The aim of this study was to develop and evaluate machine-learning-based computerized adaptive tests (CATs) for the Oxford Hip Score (OHS), Oxford Knee Score (OKS), Oxford Shoulder Score (OSS), and the Oxford Elbow Score (OES) and its subscales. Methods. We developed CAT algorithms for the OHS, OKS, OSS, overall OES, and each of the OES subscales, using responses to the full-length questionnaires and a machine-learning technique called regression tree learning. The algorithms were evaluated through a series of simulation studies, in which they aimed to predict respondents’ full-length questionnaire scores from only a selection of their item responses. In each case, the total number of items used by the CAT algorithm was recorded and CAT scores were compared to full-length questionnaire scores by mean, SD, score distribution plots, Pearson’s correlation coefficient, intraclass correlation (ICC), and the Bland-Altman method. Differences between CAT scores and full-length questionnaire scores were contextualized through comparison to the instruments’ minimal clinically important difference (MCID). Results. The CAT algorithms accurately estimated 12-item questionnaire scores from between four and nine items. Scores followed a very similar distribution between CAT and full-length assessments, with the mean score difference ranging from 0.03 to 0.26 out of 48 points. Pearson’s correlation coefficient and ICC were 0.98 for each 12-item scale and 0.95 or higher for the OES subscales. In over 95% of cases, a patient’s CAT score was within five points of the full-length questionnaire score for each 12-item questionnaire. Conclusion. Oxford Hip Score, Oxford Knee Score, Oxford Shoulder Score, and Oxford Elbow Score (including separate subscale scores) CATs all markedly reduce the burden of items to be completed without sacrificing score accuracy. Cite this article: Bone Jt Open 2022;3(10):786–794
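The Bland-Altman method mentioned in the abstract above summarizes agreement between two scores as a mean difference (bias) with 95% limits of agreement. A minimal sketch follows; the function name and the paired scores are hypothetical and are not data from the study.

```python
from statistics import mean, stdev


def bland_altman(scores_a, scores_b):
    """Return (bias, lower limit, upper limit) for paired measurements,
    using the conventional bias +/- 1.96 * SD of the differences."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical CAT and full-length scores on a 0 to 48 scale.
cat_scores = [40, 33, 25, 46, 18]
full_scores = [41, 32, 25, 44, 19]
bias, lower, upper = bland_altman(cat_scores, full_scores)
print(f"bias={bias:.2f}, limits of agreement=({lower:.2f}, {upper:.2f})")
```

Comparing the width of the limits with the instrument's MCID, as the authors describe, indicates whether disagreement between the short and full forms is clinically meaningful.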

Orthopaedic Proceedings

Vol. 104-B, Issue SUPP_13 | Pages 60 - 60

1 Dec 2022

MACHINE-LEARNING ALGORITHM TO PREDICT ANTERIOR CRUCIATE LIGAMENT REVISION DEMONSTRATES EXTERNAL VALIDITY

Martin RK Wastvedt S Pareek A Persson A Visnes H Fenstad AM Moatshe G Wolfson J Lind M Engebretsen L

Full Access

External validation of machine learning predictive models is achieved through evaluation of model performance on different groups of patients than were used for algorithm development. This important step is uncommonly performed, inhibiting clinical translation of newly developed models. Recently, machine learning was used to develop a tool that can quantify revision risk for a patient undergoing primary anterior cruciate ligament (ACL) reconstruction (https://swastvedt.shinyapps.io/calculator_rev/). The source of data included nearly 25,000 patients with primary ACL reconstruction recorded in the Norwegian Knee Ligament Register (NKLR). The result was a well-calibrated tool capable of predicting revision risk one, two, and five years after primary ACL reconstruction with moderate accuracy. The purpose of this study was to determine the external validity of the NKLR model by assessing algorithm performance when applied to patients from the Danish Knee Ligament Registry (DKLR).

The primary outcome measure of the NKLR model was probability of revision ACL reconstruction within 1, 2, and/or 5 years. For the index study, 24 total predictor variables in the NKLR were included and the models eliminated variables that did not significantly improve predictive ability, without sacrificing accuracy. The result was a well-calibrated algorithm developed using the Cox Lasso model that only required five variables (out of the original 24) for outcome prediction. For this external validation study, all DKLR patients with complete data for the five variables required for NKLR prediction were included. The five variables were: graft choice, femur fixation device, Knee Injury and Osteoarthritis Outcome Score (KOOS) Quality of Life subscale score at surgery, years from injury to surgery, and age at surgery. Predicted revision probabilities were calculated for all DKLR patients. The model performance was assessed using the same metrics as the NKLR study: concordance and calibration.

In total, 10,922 DKLR patients were included for analysis. Average follow-up time or time-to-revision was 8.4 (±4.3) years and overall revision rate was 6.9%. Surgical technique trends (i.e., graft choice and fixation devices) and injury characteristics (i.e., concomitant meniscus and cartilage pathology) were dissimilar between registries. The model produced similar concordance when applied to the DKLR population compared to the original NKLR test data (DKLR: 0.68; NKLR: 0.68-0.69). Calibration was poorer for the DKLR population at one and five years post primary surgery but similar to the NKLR at two years.

The NKLR machine learning algorithm demonstrated similar performance when applied to patients from the DKLR, suggesting that it is valid for application outside of the initial patient population. This represents the first machine learning model for predicting revision ACL reconstruction that has been externally validated. Clinicians can use this in-clinic calculator to estimate revision risk at a patient specific level when discussing outcome expectations pre-operatively. While encouraging, it should be noted that the performance of the model on patients undergoing ACL reconstruction outside of Scandinavia remains unknown.

Orthopaedic Proceedings

Vol. 106-B, Issue SUPP_1 | Pages 140 - 140

2 Jan 2024

REAL-WORLD VALIDATION OF A MACHINE-LEARNING ALGORITHM PREDICTING TREATMENT STRATEGY FOR HIP OSTEOARTHRITIS

van der Weegen W Warren T Agricola R Das D Siebelt M

Full Access

Artificial Intelligence (AI) is becoming more powerful but is barely used to counter the growth in health care burden. AI applications to increase efficiency in orthopedics are rare. We questioned if (1) we could train machine learning (ML) algorithms, based on answers from digitalized history taking questionnaires, to predict treatment of hip osteoarthritis (either conservative or surgical); (2) such an algorithm could streamline clinical consultation.

Multiple ML models were trained on 600 annotated (80% training, 20% test) digital history taking questionnaires, acquired before consultation. Best performing models, based on balanced accuracy and optimized automated hyperparameter tuning, were built into our daily clinical orthopedic practice. Fifty patients with hip complaints (>45 years) were prospectively predicted and planned (partly blinded, partly unblinded) for consultation with the physician assistant (conservative) or orthopedic surgeon (operative). Tailored patient information based on the prediction was automatically sent to a smartphone app. Level of evidence: IV.

Random Forest and BernoulliNB were the most accurate ML models (0.75 balanced accuracy). Treatment prediction was correct in 45 out of 50 consultations (90%), p<0.0001 (sign and binomial test). Specialized consultations where conservatively predicted patients were seen by the physician assistant and surgical patients by the orthopedic surgeon were highly appreciated and effective.

Treatment strategy of hip osteoarthritis based on answers from digital history taking questionnaires was accurately predicted before patients entered the hospital. This can make outpatient consultation scheduling more efficient and tailor pre-consultation patient education.
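The reported result of 45 correct predictions out of 50 can be checked against chance with an exact binomial calculation. The sketch below assumes a one-sided test against a null success probability of 0.5; the abstract does not state the null value or sidedness, so both are assumptions, and the function name is hypothetical.

```python
from math import comb


def binomial_upper_tail(successes, trials, p_null=0.5):
    """Exact probability of observing at least `successes` correct predictions
    in `trials` attempts if each were correct with probability `p_null`."""
    return sum(
        comb(trials, k) * p_null ** k * (1 - p_null) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# 45 of 50 correct, as reported; p_null = 0.5 is an assumed null hypothesis.
p_value = binomial_upper_tail(45, 50)
print(p_value < 0.0001)
```

Under these assumptions the tail probability falls far below the 0.0001 threshold quoted in the abstract.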

Orthopaedic Proceedings

Vol. 103-B, Issue SUPP_1 | Pages 11 - 11

1 Feb 2021

A MACHINE-LEARNING APPROACH FOR MEASURING ARTICULAR CARTILAGE DAMAGE IN THE KNEE

Bartolo M Accardi M Dini D Amis A

Full Access

Objectives

Articular cartilage damage is a primary outcome of pre-clinical and clinical studies evaluating meniscal and cartilage repair or replacement techniques. Recent studies have quantitatively characterized India Ink stained cartilage damage through light reflectance and the application of local or global thresholds. We develop a method for the quantitative characterisation of inked cartilage damage with improved generalisation capability, and compare its performance to the threshold-based baseline approach against gold standard labels.

Methods

The Trainable WEKA Segmentation (TWS) tool (Arganda-Carreras et al., 2017) available in Fiji (Rueden et al., 2017) was used to train two separate Random Forest classifiers to automatically segment cartilage damage on ink stained cadaveric ovine stifle joints. Gold standard labels were manually annotated for the training, validation and test datasets for each of the femoral and tibial classifiers. Each dataset included a sample of medial and lateral femoral condyles and tibial plateaus from various stifle joints, selected to ensure no overlap across datasets according to ovine identifier. Training was performed on the training data with the TWS tool using edge, texture and noise reduction filters selected for their suitability and performance. The two trained classifiers were then applied to the validation data to output damage probability maps, on which a threshold value was calibrated. Model predictions on the unseen test set were evaluated against the gold standard labels using the Dice Similarity Coefficient (DSC) – an overlap-based metric, and compared with results for the baseline global threshold approach applied in Fiji as shown in Figures 1 and 2.
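The final two steps described above, thresholding a damage probability map and scoring the result with the Dice Similarity Coefficient, can be sketched as follows. This is not the TWS or Fiji workflow itself: the function names, the 3x3 toy arrays, and the 0.5 threshold are hypothetical stand-ins for illustration.

```python
def threshold_map(prob_map, threshold):
    """Convert a per-pixel damage probability map into a binary mask."""
    return [[1 if p >= threshold else 0 for p in row] for row in prob_map]


def dice_coefficient(pred_mask, gold_mask):
    """DSC = 2 * |A intersect B| / (|A| + |B|) for two binary masks of equal shape."""
    pred = [v for row in pred_mask for v in row]
    gold = [v for row in gold_mask for v in row]
    intersection = sum(1 for p, g in zip(pred, gold) if p and g)
    total = sum(pred) + sum(gold)
    return 1.0 if total == 0 else 2 * intersection / total


# Toy probability map and gold standard labels (illustrative values only).
prob_map = [[0.1, 0.8, 0.9],
            [0.2, 0.6, 0.3],
            [0.0, 0.4, 0.1]]
gold_mask = [[0, 1, 1],
             [0, 0, 0],
             [0, 1, 0]]
pred_mask = threshold_map(prob_map, threshold=0.5)
dsc = dice_coefficient(pred_mask, gold_mask)
print(round(dsc, 3))
```

A DSC of 1 means the predicted and gold standard regions overlap exactly, and 0 means they do not overlap at all.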

Orthopaedic Proceedings

Vol. 102-B, Issue SUPP_10 | Pages 8 - 8

1 Oct 2020

TOTAL HIP ARTHROPLASTY DISLOCATION RISK USING A MACHINE-LEARNING ALGORITHM

Wyles CC Maradit-Kremers H Rouzrokh P Barman P Larson DR Polley EC Lewallen DG Berry DJ Pagnano MW Taunton MJ Trousdale RT Sierra RJ

Full Access

Introduction

Instability remains a common complication following total hip arthroplasty (THA) and continues to account for the highest percentage of revisions in numerous registries. Many risk factors have been described, yet a patient-specific risk assessment tool remains elusive. The purpose of this study was to apply a machine learning algorithm to develop a patient-specific risk score capable of dynamic adjustment based on operative decisions.

Methods

22,086 THA performed between 1998–2018 were evaluated. 632 THA sustained a postoperative dislocation (2.9%). Patients were robustly characterized based on non-modifiable factors: demographics, THA indication, spinal disease, spine surgery, neurologic disease, connective tissue disease; and modifiable operative decisions: surgical approach, femoral head size, acetabular liner (standard/elevated/constrained/dual-mobility). Models were built with a binary outcome (event/no event) at 1-year and 5-year postoperatively. Inverse Probability Censoring Weighting accounted for censoring bias. An ensemble algorithm was created that included Generalized Linear Model, Generalized Additive Model, Lasso Penalized Regression, Kernel-Based Support Vector Machines, Random Forest and Optimized Gradient Boosting Machine. Convex combination of weights minimized the negative binomial log-likelihood loss function. Ten-fold cross-validation accounted for the rarity of dislocation events.
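The idea of a convex combination of model outputs chosen to minimize a binomial log-likelihood loss can be illustrated in a reduced form. The sketch below blends only two models and searches the weight on a grid, which is a simplification and not the authors' six-learner ensemble or their optimizer; the function names and toy probabilities are hypothetical.

```python
from math import log


def log_loss(y_true, y_prob, eps=1e-12):
    """Mean negative binomial log-likelihood of the predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total -= y * log(p) + (1 - y) * log(1 - p)
    return total / len(y_true)


def best_convex_weight(y_true, prob_a, prob_b, steps=100):
    """Grid-search w in [0, 1] so that w * prob_a + (1 - w) * prob_b
    has the lowest log loss. Returns (weight, loss)."""
    best_w, best_loss = 0.0, float("inf")
    for i in range(steps + 1):
        w = i / steps
        blended = [w * a + (1 - w) * b for a, b in zip(prob_a, prob_b)]
        loss = log_loss(y_true, blended)
        if loss < best_loss:
            best_w, best_loss = w, loss
    return best_w, best_loss


# Toy outcomes (1 = dislocation) and two hypothetical models' predicted risks.
y_true = [0, 0, 1, 0, 1, 0]
prob_a = [0.10, 0.20, 0.60, 0.30, 0.40, 0.05]
prob_b = [0.05, 0.40, 0.30, 0.10, 0.70, 0.20]
weight, loss = best_convex_weight(y_true, prob_a, prob_b)
print(weight, round(loss, 4))
```

Because the weights are non-negative and sum to one, the blended output stays a valid probability. In practice such weights are usually fitted on cross-validated predictions rather than on the data the base models were trained on.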

Orthopaedic Proceedings

Vol. 104-B, Issue SUPP_12 | Pages 90 - 90

1 Dec 2022

MACHINE-LEARNING MODELS CAN ACCURATELY PREDICT ORTHOPAEDIC RESIDENT WORKLOAD AT A LEVEL I TRAUMA CENTRE

Abbas A Toor J Du JT Versteeg A Yee N Finkelstein J Abouali J Nousiainen M Kreder H Hall J Whyne C Larouche J

Full Access

Excessive resident duty hours (RDH) are a recognized issue with implications for physician well-being and patient safety. A major component of the RDH concern is on-call duty. While considerable work has been done to reduce resident call workload, there is a paucity of research in optimizing resident call scheduling. Call coverage is scheduled manually rather than demand-based, which generally leads to over-scheduling to prevent a service gap. Machine learning (ML) has been widely applied in other industries to prevent such issues of a supply-demand mismatch. However, the healthcare field has been slow to adopt these innovations. As such, the aim of this study was to use ML models to 1) predict demand on orthopaedic surgery residents at a level I trauma centre and 2) identify variables key to demand prediction.

Daily surgical handover emails over an eight-year (2012-2019) period at a level I trauma centre were collected. The following data were used to calculate demand: spine call coverage, date, and number of operating rooms (ORs), traumas, admissions and consults completed. Various ML models (linear, tree-based and neural networks) were trained to predict the workload, with their results compared to the current scheduling approach. Quality of models was determined by using the area under the receiver operating characteristic curve (AUC) and accuracy of the predictions. The top ten most important variables were extracted from the most successful model.

During training, the model with the highest AUC and accuracy was the multivariate adaptive regression splines (MARS) model, with an AUC of 0.78±0.03 and accuracy of 71.7%±3.1%. During testing, the model with the highest AUC and accuracy was the neural network model, with an AUC of 0.81 and accuracy of 73.7%. All models were better than the current approach, which had an AUC of 0.50 and accuracy of 50.1%. Key variables used by the neural network model were (descending order): spine call duty, year, weekday/weekend, month, and day of the week.

This was the first study attempting to use ML to predict the service demand on orthopaedic surgery residents at a major level I trauma centre. Multiple ML models were shown to be more appropriate and accurate at predicting the demand on surgical residents as compared to the current scheduling approach. Future work should look to incorporate predictive models with optimization strategies to match scheduling with demand in order to improve resident well-being and patient care.

Orthopaedic Proceedings

Vol. 104-B, Issue SUPP_12 | Pages 33 - 33

1 Dec 2022

MACHINE-LEARNING MODELS BUILT ON PREOPERATIVE PATIENT FACTORS ACCURATELY PREDICT DURATION OF SURGERY AND LENGTH OF STAY FOR TOTAL KNEE AND HIP ARTHROPLASTY

Abbas A Lex J Toor J Mosseri J Khalil E Ravi B Whyne C

Full Access

Total knee and hip arthroplasty (TKA and THA) are two of the highest volume and resource intensive surgical procedures. Key drivers of the cost of surgical care are duration of surgery (DOS) and postoperative inpatient length of stay (LOS). The ability to predict TKA and THA DOS and LOS has substantial implications for hospital finances, scheduling and resource allocation. The goal of this study was to predict DOS and LOS for elective unilateral TKAs and THAs using machine learning models (MLMs) constructed on preoperative patient factors using a large North American database.

The American College of Surgeons (ACS) National Surgical Quality Improvement Program (NSQIP) database was queried for elective unilateral TKA and THA procedures from 2014-2019. The dataset was split into training, validation and testing based on year. Multiple conventional and deep MLMs such as linear, tree-based and multilayer perceptrons (MLPs) were constructed. The models with best performance on the validation set were evaluated on the testing set. Models were evaluated according to 1) mean squared error (MSE), 2) buffer accuracy (the number of times the predicted target was within a predesignated buffer of the actual target), and 3) classification accuracy (the number of times the correct class was predicted by the models). To ensure useful predictions, the results of the models were compared to a mean regressor.
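The three evaluation measures defined above, and the comparison with a mean regressor, can be sketched in a few lines. The functions below express buffer and classification accuracy as proportions; all names and durations are hypothetical and are not drawn from the NSQIP data.

```python
from statistics import mean


def mse(actual, predicted):
    """Mean squared error between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def buffer_accuracy(actual, predicted, buffer):
    """Share of predictions within +/- buffer of the actual value."""
    return sum(abs(a - p) <= buffer for a, p in zip(actual, predicted)) / len(actual)


def classification_accuracy(actual, predicted, cutoff):
    """Share of predictions on the same side of the cutoff as the actual value."""
    return sum((a <= cutoff) == (p <= cutoff) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical durations of surgery in minutes (illustration only).
train_dos = [90, 120, 105, 135, 100]
actual_dos = [95, 110, 130, 150, 100]
model_dos = [100, 105, 140, 115, 98]

# Mean regressor baseline: always predict the training-set mean.
baseline_dos = [mean(train_dos)] * len(actual_dos)

for name, pred in (("model", model_dos), ("mean regressor", baseline_dos)):
    print(name,
          mse(actual_dos, pred),
          buffer_accuracy(actual_dos, pred, buffer=30),
          classification_accuracy(actual_dos, pred, cutoff=120))
```

A model earns its place only if it beats the mean regressor, since that baseline needs no patient information at all.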

A total of 499,432 patients (TKA 302,490; THA 196,942) were included. The MLP models had the best MSEs and accuracy across both TKA and THA patients. During testing, the TKA MSEs for DOS and LOS were 0.893 and 0.688 while the THA MSEs for DOS and LOS were 0.895 and 0.691. The TKA DOS 30-minute buffer accuracy and ≤120 min, >120 min classification accuracy were 78.8% and 88.3%, while the TKA LOS 1-day buffer accuracy and ≤2 days, >2 days classification accuracy were 75.2% and 76.1%. The THA DOS 30-minute buffer accuracy and ≤120 min, >120 min classification accuracy were 81.6% and 91.4%, while the THA LOS 1-day buffer accuracy and ≤2 days, >2 days classification accuracy were 78.3% and 80.4%. All models across both TKA and THA patients were more accurate than the mean regressors for both DOS and LOS predictions across both buffer and classification accuracies.

Conventional and deep MLMs have been effectively implemented to predict the DOS and LOS of elective unilateral TKA and THA patients based on preoperative patient factors using a large North American database with a high level of accuracy. Future work should include using operational factors to further refine these models and improve predictive accuracy. Results of this work will allow institutions to optimize their resource allocation, reduce costs and improve surgical scheduling.

Acknowledgements:

The American College of Surgeons National Surgical Quality Improvement Program and the hospitals participating in the ACS NSQIP are the source of the data used herein; they have not verified and are not responsible for the statistical validity of the data analysis or the conclusions derived by the authors.

Orthopaedic Proceedings

Vol. 104-B, Issue SUPP_4 | Pages 29 - 29

1 Apr 2022

CAN MACHINE-LEARNING ALGORITHMS PREDICT WHICH PATIENTS WILL ACHIEVE MCID AFTER ARTHROSCOPIC MANAGEMENT OF FEMOROACETABULAR IMPINGEMENT?

Pettit MH Hickman S Malviya A Khanduja V

Full Access

Identification of patients at risk of not achieving minimal clinically important differences (MCID) in patient reported outcome measures (PROMs) is important to ensure principled and informed pre-operative decision making. Machine learning techniques may enable the generation of a predictive model for attainment of MCID in hip arthroscopy.

Aims: 1) to determine whether machine learning techniques could predict which patients will achieve MCID in the iHOT-12 PROM 6 months after arthroscopic management of femoroacetabular impingement (FAI), 2) to determine which factors contribute to their predictive power.

Data from the UK Non-Arthroplasty Hip Registry database was utilised. We identified 1917 patients who had undergone hip arthroscopy for FAI with both baseline and 6 month follow up iHOT-12 and baseline EQ-5D scores. We trained three established machine learning algorithms on our dataset to predict an outcome of iHOT-12 MCID improvement at 6 months given baseline characteristics including demographic factors, disease characteristics and PROMs. Performance was assessed using area under the receiver operating characteristic (AUROC) statistics with 5-fold cross validation.

The three machine learning algorithms showed quite different performance. The linear logistic regression model achieved AUROC = 0.59, the deep neural network achieved AUROC = 0.82, while a random forest model had the best predictive performance with AUROC 0.87. Of demographic factors, we found that BMI and age were key predictors for this model. We also found that removing all features except baseline responses to the iHOT-12 questionnaire had little effect on performance for the random forest model (AUROC = 0.85). Disease characteristics had little effect on model performance.

Machine learning models are able to predict with good accuracy 6-month post-operative MCID attainment in patients undergoing arthroscopic management for FAI. Baseline scores from the iHOT-12 questionnaire are sufficient to predict with good accuracy whether a patient is likely to reach MCID in post-operative PROMs.

Orthopaedic Proceedings

Vol. 102-B, Issue SUPP_9 | Pages 29 - 29

1 Oct 2020

MACHINE-LEARNING ALGORITHMS IDENTIFY OPTIMAL CORONAL LIMB ALIGNMENT AND SAGITTAL COMPONENT POSITION IN TOTAL KNEE ARTHROPLASTY

Farooq H Deckard ER Carlson J Ghattas N Meneghini RM

Full Access

Background

Advanced technologies, like robotics, provide enhanced precision for implanting total knee arthroplasty (TKA) components; however, optimal component position and limb alignment remain unknown. The purpose of this study was to identify the ideal target sagittal component position and coronal limb alignment that produce optimal clinical outcomes.

Methods

A retrospective review of 1,091 consecutive TKAs was performed. All TKAs were PCL retaining or sacrificing with anterior lipped (49.4%) or conforming bearings (50.6%) performed with modern perioperative protocols. Posterior tibial slope, femoral flexion, and tibiofemoral limb alignment were measured with standardized protocols. Patients were grouped by the ‘how often does your knee feel normal?’ outcome score at latest follow-up. Machine learning algorithms were used to identify optimal alignment zones which predicted improved outcome scores.

Orthopaedic Proceedings

Vol. 101-B, Issue SUPP_11 | Pages 71 - 71

1 Oct 2019

INCREMENTAL INPUTS IMPROVE THE AUTOMATED DETECTION OF IMPLANT LOOSENING WITH MACHINE-LEARNING ALGORITHMS

Vail TP Shah RF Bini SA

Full Access

Background

Implant loosening is a common cause of a poor outcome and pain after total knee arthroplasty (TKA). Despite the increase in use of expensive techniques like arthrography, the detection of prosthetic loosening is often unclear pre-operatively, leading to diagnostic uncertainty and extensive workup. The objective of this study was to evaluate the ability of a machine learning (ML) algorithm to diagnose prosthetic loosening from pre-operative radiographs, and to observe what model inputs improve the performance of the model.

Methods

A total of 754 patients underwent a first-time revision of a total joint at our institution from 2012–2018. Pre-operative AP and lateral X-rays (XR), in addition to demographic and comorbidity information, were collected for each patient. Each patient was determined to have either loose or fixed prosthetics based on a manual abstraction of the written findings in their operative report, which is considered the gold standard for diagnosing prosthetic loosening. We trained a series of deep convolutional neural network (CNN) models to predict from the pre-operative XR whether a prosthesis was found to be loose in the operating room. Each XR was pre-processed to segment the bone, implant, and bone-implant interface. A series of CNN models were built using existing, proven CNN architectures and weights optimized to our dataset. We then integrated our best performing model with historical patient data to create a final model and determine the incremental accuracy provided by additional layers of clinical information fed into the model. The models were evaluated by their accuracy, sensitivity and specificity.
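The three evaluation metrics named above follow directly from a confusion matrix. The sketch below uses the standard definitions and assumes labels are coded 1 for "loose" and 0 for "fixed"; the function names and the example labels are illustrative and are not taken from the study.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = loose, 0 = fixed)."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

def diagnostic_metrics(y_true, y_pred):
    """Accuracy, sensitivity (recall on loose cases) and specificity (recall on fixed cases)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

# Made-up labels for illustration only: 3 loose and 5 fixed implants.
operative_findings = [1, 1, 1, 0, 0, 0, 0, 0]
model_predictions = [1, 1, 0, 0, 0, 0, 0, 1]
metrics = diagnostic_metrics(operative_findings, model_predictions)
```

Reporting sensitivity and specificity alongside accuracy matters when loose and fixed cases are unevenly represented, because accuracy alone can be dominated by the larger class.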

Orthopaedic Proceedings

Vol. 101-B, Issue SUPP_12 | Pages 68 - 68

1 Oct 2019

DEVELOPMENT OF MACHINE-LEARNING ALGORITHMS FOR PREDICTION OF SUSTAINED POSTOPERATIVE OPIOID PRESCRIPTIONS AFTER TOTAL HIP ARTHROPLASTY

Bedair HS

Full Access

Background

Postoperative recovery after routine total hip arthroplasty (THA) can lead to the development of prolonged opioid use but there are few tools for predicting this adverse outcome. The purpose of this study was to develop machine learning algorithms for preoperative prediction of prolonged post-operative opioid use after THA.

Methods

A retrospective review of electronic health records was conducted at two academic medical centers and three community hospitals to identify adult patients who underwent THA for osteoarthritis between January 1st, 2000 and August 1st, 2018. Prolonged postoperative opioid prescriptions were defined as continuous opioid prescriptions after surgery to at least 90 days after surgery. Five machine learning algorithms were developed to predict this outcome and were assessed by discrimination, calibration, and decision curve analysis.
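Decision curve analysis, mentioned above, compares models by their net benefit across threshold probabilities. The sketch below implements the standard net-benefit formula, NB = TP/n − (FP/n) · p_t / (1 − p_t), together with the "treat all" reference strategy. It illustrates the general technique, not the authors' code, and the probabilities shown are invented.

```python
def net_benefit(y_true, predicted_prob, threshold):
    """Net benefit of flagging patients whose predicted risk is >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, predicted_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, predicted_prob) if p >= threshold and y == 0)
    odds = threshold / (1 - threshold)  # weight of a false positive relative to a true positive
    return tp / n - (fp / n) * odds

def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: flag every patient regardless of predicted risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)

# Invented example: 1 = prolonged opioid prescriptions, 0 = not.
outcomes = [1, 0, 1, 0]
risks = [0.9, 0.2, 0.6, 0.7]
nb_model = net_benefit(outcomes, risks, threshold=0.5)
nb_all = net_benefit_treat_all(outcomes, threshold=0.5)
```

A decision curve plots these quantities over a range of thresholds; a model is useful at a given threshold when its net benefit exceeds both "treat all" and "treat none" (net benefit 0).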

Orthopaedic Proceedings

Vol. 102-B, Issue SUPP_1 | Pages 76 - 76

1 Feb 2020

COMPARISON OF THE ACCURACY ASSOCIATED WITH THREE DIFFERENT MACHINE-LEARNING MODELS TO PREDICT OUTCOMES AFTER ANATOMIC TOTAL SHOULDER ARTHROPLASTY AND REVERSE TOTAL SHOULDER ARTHROPLASTY

Roche C Simovitch R Flurin P Wright T Zuckerman J Routman H

Full Access

Introduction

Machine learning is a relatively novel method in orthopaedics which can be used to evaluate complex associations and patterns in outcomes and healthcare data. The purpose of this study is to use 3 different supervised machine learning algorithms on data from a multi-center international database of a single shoulder prosthesis and to evaluate the accuracy of each model in predicting post-operative outcomes of both aTSA and rTSA.

Methods

Data from a multi-center international database consisting of 6485 patients (19,796 patient visits) who received primary total shoulder arthroplasty using a single shoulder prosthesis (Equinoxe, Exactech, Inc) were analyzed in this study. Specifically, demographic, comorbidity, implant type and implant size, surgical technique, pre-operative PROMs and ROM measures, post-operative PROMs and ROM measures, pre-operative and post-operative radiographic data, and adverse event and complication data were obtained for 2367 primary aTSA patients from 8042 visits at an average follow-up of 22 months and 4118 primary rTSA patients from 11,754 visits at an average follow-up of 16 months. These data were analyzed to create a predictive model using 3 different supervised machine learning techniques: 1) linear regression, 2) random forest, and 3) XGBoost. Each of these 3 techniques evaluated the pre-operative parameters and created a predictive model which targeted the post-operative composite score, a 100-point score consisting of 50% post-operative composite outcome score (calculated from 33.3% ASES + 33.3% UCLA + 33.3% Constant) and 50% post-operative composite ROM score (calculated from S curves weighted by 70% active forward flexion + 15% internal rotation score + 15% active external rotation). Three additional predictive models were created to control for the time required for patient improvement after surgery. To do this, each primary aTSA and primary rTSA cohort was subdivided to include only data from follow-up visits >20 months after surgery; this yielded 1317 primary aTSA patients from 2962 visits at an average follow-up of 50 months and 1593 primary rTSA patients from 3144 visits at an average follow-up of 42 months.

Each of these 6 predictive models was trained using a random selection of 80% of each cohort; each model then predicted the outcomes of the remaining 20% of the data based upon the demographic, comorbidity, implant type and implant size, surgical technique, and pre-operative PROMs and ROM measure inputs of each 20% cohort. The error of all 6 predictive models was calculated from the root mean square error (RMSE) between the actual and predicted post-op composite score. The accuracy of each model was determined by subtracting the percent difference of each RMSE value from the average composite score associated with each cohort.
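The composite target and the error metric described above can be written out as a short sketch. It assumes that every component score has already been rescaled to a 0–100 range (the abstract does not state how the ASES, UCLA and Constant scores or the S-curve ROM scores were normalised), and all function names are hypothetical.

```python
import math

def composite_rom_score(forward_flexion, internal_rotation, external_rotation):
    """ROM half of the target: 70% active forward flexion, 15% internal rotation,
    15% active external rotation. Inputs assumed pre-scaled to 0-100."""
    return 0.70 * forward_flexion + 0.15 * internal_rotation + 0.15 * external_rotation

def composite_score(ases, ucla, constant, rom_score):
    """100-point target: 50% equally weighted outcome scores, 50% ROM score.
    Inputs assumed pre-scaled to 0-100."""
    outcome = (ases + ucla + constant) / 3
    return 0.5 * outcome + 0.5 * rom_score

def rmse(actual, predicted):
    """Root mean square error between actual and predicted composite scores."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

# Invented values for illustration only.
target = composite_score(ases=60, ucla=60, constant=60,
                         rom_score=composite_rom_score(80, 80, 80))
error = rmse(actual=[50, 60], predicted=[53, 56])
```

The RMSE here would be computed on the held-out 20% of each cohort, so that it reflects predictions for patients the model did not see during training.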

Orthopaedic Proceedings

Vol. 99-B, Issue SUPP_20 | Pages 49 - 49

1 Dec 2017

A MACHINE-LEARNING APPROACH TO DISCRIMINATE BETWEEN SOFT AND HARD BONE TISSUES USING DRILLING SOUNDS

Zakeri V Fabri F Karasawa M Hodgson AJ

Full Access

Bone drilling is conducted in many surgical disciplines such as orthopedics, maxillofacial, and spine surgery. Most of these procedures involve drilling of different bone materials including hard (cortical) and soft (cancellous) tissues. Identifying these tissues is essential for surgeons to minimise damage to underlying nerves and vessels.

The sound signal generated during drilling is a valuable source of information that could potentially be employed. Such sounds can be captured readily and easily through non-contact sensors. Therefore, our goal in this preliminary study is to investigate whether drilling sounds can enable us to distinguish between cortical and cancellous tissues.

A bovine tibial bone was drilled, and the cortical and cancellous drilling sounds were captured. Each sound record was divided into small windows with a length of 50 ms and a 50% overlap. A short window length was selected because our intended longer-term application is to provide the surgeon with near-real-time feedback. Short-time Fourier transform (STFT) coefficients were extracted from each window and were averaged accordingly to obtain p features. A support vector machine (SVM) algorithm was used for classification, and its accuracy was evaluated for different numbers of features (p). Two training/testing scenarios were considered: atlas (ATL) and leave-one-out (LOO).
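The windowing and feature-extraction steps described above might look roughly like the following sketch. Two assumptions are made that the abstract does not confirm: the spectrum of each window is averaged over p equal-width frequency bands, and a naive discrete Fourier transform stands in for an FFT library so that the example needs only the standard library. The sample rate and test tone are invented.

```python
import cmath
import math

def split_windows(signal, sample_rate_hz, window_ms=50, overlap=0.5):
    """Cut a signal into fixed-length windows (default 50 ms) with fractional overlap."""
    size = int(sample_rate_hz * window_ms / 1000)
    step = max(1, int(size * (1 - overlap)))
    return [signal[s:s + size] for s in range(0, len(signal) - size + 1, step)]

def magnitude_spectrum(window):
    """Magnitudes of the first n//2 DFT bins (naive O(n^2) transform, for illustration)."""
    n = len(window)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(window)))
            for k in range(n // 2)]

def band_features(window, p):
    """Average the magnitude spectrum over p equal-width bands to get p features.
    Assumes p does not exceed the number of spectrum bins."""
    spectrum = magnitude_spectrum(window)
    width = len(spectrum) // p
    return [sum(spectrum[i * width:(i + 1) * width]) / width for i in range(p)]

# Invented example: a 100 Hz tone sampled at 1 kHz for 0.2 s.
sample_rate = 1000
tone = [math.sin(2 * math.pi * 100 * t / sample_rate) for t in range(200)]
windows = split_windows(tone, sample_rate)
features = [band_features(w, p=5) for w in windows]
```

Each feature vector would then be passed to a classifier such as an SVM, with one label (cortical or cancellous) per window.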

The total accuracies for ATL and LOO were 100% and 93.8% respectively obtained for p=128. Our study on a single specimen demonstrated that it is possible to discriminate between cortical and cancellous bones based on relatively short 50 ms windows of drilling sounds.

Orthopaedic Proceedings

Vol. 106-B, Issue SUPP_18 | Pages 57 - 57

14 Nov 2024

FRACTURE DETECTION IN WRIST TRAUMA RADIOGRAPH: OPTIMIZING ALGORITHM PERFORMANCE USING TRANSFER LEARNING

Birkholtz F Eken M Boyes A Engelbrecht A

Full Access

Introduction. With advances in artificial intelligence, the use of computer-aided detection and diagnosis in clinical imaging is gaining traction. Typically, very large datasets are required to train machine-learning models, potentially limiting use of this technology when only small datasets are available. This study investigated whether pretraining of fracture detection models on large, existing datasets could improve the performance of the model when locating and classifying wrist fractures in a small X-ray image dataset. This concept is termed “transfer learning”. Method. Firstly, three detection models, namely the faster region-based convolutional neural network (Faster R-CNN), You Only Look Once version eight (YOLOv8), and RetinaNet, were pretrained using the large, freely available dataset Common Objects in Context (COCO) (330000 images). Secondly, these models were pretrained using an open-source wrist X-ray dataset called “Graz Paediatric Wrist Digital X-rays” (GRAZPEDWRI-DX) on a (1) fracture detection dataset (20327 images) and (2) fracture location and classification dataset (14390 images). An orthopaedic surgeon classified the small available dataset of 776 distal radius X-rays (Arbeitsgemeinschaft für Osteosynthesefragen Foundation / Orthopaedic Trauma Association; AO/OTA), on which the models were tested. Result. Detection models without pretraining on the large datasets were the least precise when tested on the small distal radius dataset. The model with the best accuracy to detect and classify wrist fractures was the YOLOv8 model pretrained on the GRAZPEDWRI-DX fracture detection dataset (mean average precision at an intersection over union threshold of 50% = 59.7%). This model showed up to 33.6% improved detection precision compared to the same models with no pretraining. Conclusion. Optimisation of machine-learning models can be challenging when only relatively small datasets are available. The findings of this study support the potential of transfer learning from large datasets to improve model performance in smaller datasets. This is encouraging for wider application of machine-learning technology in medical imaging evaluation, including less common orthopaedic pathologies.
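The reported metric rests on intersection over union (IoU): a predicted bounding box counts as a correct detection only if it carries the right class and overlaps the annotated box by at least 50%. The sketch below shows that matching rule under the assumption that boxes are given as (x_min, y_min, x_max, y_max). The function names and coordinates are illustrative, and the full mean-average-precision computation (ranking detections by confidence and averaging over classes) is not shown.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x_min, y_min, x_max, y_max)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - intersection
    return intersection / union if union > 0 else 0.0

def is_true_positive(pred_box, pred_label, truth_box, truth_label, threshold=0.5):
    """A detection is correct if the class matches and IoU reaches the threshold."""
    return pred_label == truth_label and iou(pred_box, truth_box) >= threshold

# Invented boxes for illustration only.
overlap = iou((0, 0, 10, 10), (5, 0, 15, 10))
hit = is_true_positive((0, 0, 10, 10), "fracture", (0, 0, 10, 8), "fracture")
```

Raising the threshold above 0.5 makes the metric stricter about localisation, which is why the IoU level is always quoted alongside a mean average precision figure.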
