Bone & Joint Open
Vol. 4, Issue 6 | Pages 399 - 407
1 Jun 2023
Yeramosu T Ahmad W Satpathy J Farrar JM Golladay GJ Patel NK

Aims. To identify variables independently associated with same-day discharge (SDD) of patients following revision total knee arthroplasty (rTKA) and to develop machine learning algorithms to predict suitable candidates for outpatient rTKA. Methods. Data were obtained from the American College of Surgeons National Quality Improvement Programme (ACS-NSQIP) database from the years 2018 to 2020. Patients with elective, unilateral rTKA procedures and a total hospital length of stay between zero and four days were included. Demographic, preoperative, and intraoperative variables were analyzed. A multivariable logistic regression (MLR) model and various machine learning techniques were compared using area under the curve (AUC), calibration, and decision curve analysis. Important and significant variables were identified from the models. Results. Of the 5,600 patients included in this study, 342 (6.1%) underwent SDD. The random forest (RF) model performed the best overall, with an internally validated AUC of 0.810. The ten crucial factors favoring SDD in the RF model include operating time, anaesthesia type, age, BMI, American Society of Anesthesiologists grade, race, history of diabetes, rTKA type, sex, and smoking status. Eight of these variables were also found to be significant in the MLR model. Conclusion. The RF model displayed excellent accuracy and identified clinically important variables for determining candidates for SDD following rTKA. Machine learning techniques such as RF will allow clinicians to accurately risk-stratify their patients preoperatively, in order to optimize resources and improve patient outcomes. Cite this article: Bone Jt Open 2023;4(6):399–407
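The AUC used above to compare the MLR and RF models can be read as the probability that a randomly chosen SDD patient receives a higher predicted score than a randomly chosen non-SDD patient. A minimal sketch of that calculation follows; the function name `auc_score` and the toy data are illustrative assumptions, not the study's code or values.

```python
def auc_score(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for the positive class (e.g. same-day discharge), 0 otherwise.
    scores: model-predicted probabilities or risk scores.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data, not values from the study.
y_true = [1, 0, 1, 0, 0, 1]
model_a = [0.9, 0.2, 0.7, 0.4, 0.1, 0.8]  # every positive outranks every negative
model_b = [0.6, 0.5, 0.3, 0.4, 0.7, 0.8]  # some positives rank below negatives
print(auc_score(y_true, model_a))
print(auc_score(y_true, model_b))
```

The pairwise form is quadratic in the number of cases, which is fine for illustration; a rank-based implementation would be preferable for a registry-sized dataset.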


The Bone & Joint Journal
Vol. 101-B, Issue 12 | Pages 1476 - 1478
1 Dec 2019
Bayliss L Jones LD

This annotation briefly reviews the history of artificial intelligence and machine learning in health care and orthopaedics, and considers the role it will have in the future, particularly with reference to statistical analyses involving large datasets. Cite this article: Bone Joint J 2019;101-B:1476–1478


Bone & Joint Open
Vol. 1, Issue 6 | Pages 236 - 244
11 Jun 2020
Verstraete MA Moore RE Roche M Conditt MA

Aims. The use of technology to assess balance and alignment during total knee surgery can provide an overload of numerical data to the surgeon. Meanwhile, this quantification holds the potential to clarify and guide the surgeon through the surgical decision process when selecting the appropriate bone recut or soft tissue adjustment when balancing a total knee. Therefore, this paper evaluates the potential of deploying supervised machine learning (ML) models to select a surgical correction based on patient-specific intra-operative assessments. Methods. Based on a clinical series of 479 primary total knees and 1,305 associated surgical decisions, various ML models were developed. These models identified the indicated surgical decision based on available, intra-operative alignment, and tibiofemoral load data. Results. With an associated area under the receiver-operator curve ranging between 0.75 and 0.98, the optimized ML models resulted in good to excellent predictions. The best performing model used a random forest approach while considering both alignment and intra-articular load readings. Conclusion. The presented model has the potential to make experience available to surgeons adopting new technology, bringing expert opinion in their operating theatre, but also provides insight in the surgical decision process. More specifically, these promising outcomes indicated the relevance of considering the overall limb alignment in the coronal and sagittal plane to identify the appropriate surgical decision


Orthopaedic Proceedings
Vol. 102-B, Issue SUPP_8 | Pages 20 - 20
1 Aug 2020
Maher A Phan P Hoda M
Full Access

Degenerative lumbar spondylolisthesis (DLS) is a common condition with many available treatment options. The Degenerative Spondylolisthesis Instability Classification (DSIC) scheme, based on a systematic review of best available evidence, was proposed by Simmonds et al. in 2015. This classification scheme proposes that the stability of the patient's pathology be determined by a surgeon based on quantitative and qualitative clinical and radiographic parameters. The purpose of the study is to utilise machine learning to classify DLS patients according to the DSIC scheme, offering a novel approach in which an objectively consistent system is employed. The patient data were collected by CSORN between 2015 and 2018 and included 224 DLS surgery cases. The data were cleaned by two methods: first, by deleting all patient entries with missing data, and second, by imputing the missing data using a maximum likelihood function. Five machine learning algorithms were used: logistic regression, boosted trees, random forests, support vector machines, and decision trees. The models were built, trained, and tested using Python-based libraries (sklearn and pandas). The algorithms were trained and tested using the two data sets (deletion and imputation cleaning methods). The matplotlib library was used to graph the ROC curves, including the area under the curve. The machine learning models were all able to predict the DSIC grade. Of all the models, the support vector machine model performed best, achieving an area under the curve score of 0.82. This model achieved an accuracy of 63% and an F1 score of 0.58. Between the two data cleaning methods, the imputation method was better, achieving higher areas under the curve than the deletion method. The accuracy, recall, precision, and F1 scores were similar for both data cleaning methods. The machine learning models were able to effectively predict physician decision making and score patients based on the DSIC scheme. 
The support vector machine model was able to achieve an area under the curve of 0.82 in comparison to physician classification. Since the data set was relatively small, the results could be improved with training on a larger data set. The use of machine learning models in DLS classification could prove to be an efficient approach to reduce human bias and error. Further efforts are necessary to test the inter- and intra-observer reliability of the DSIC scheme, as well as to determine if the surgeons using the scheme are following DLS treatment recommendations
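The two data-cleaning routes described in this abstract (dropping incomplete entries versus filling in missing values) can be sketched as below. This is an illustration only: the abstract reports imputation with a maximum likelihood function, whereas the sketch swaps in simple column-mean imputation as a stand-in, and the record layout and values are hypothetical.

```python
from statistics import mean


def listwise_deletion(rows):
    """Keep only rows with no missing (None) values."""
    return [row for row in rows if all(v is not None for v in row)]


def mean_imputation(rows):
    """Replace each missing value with the mean of its column's observed values.

    Stand-in for the maximum likelihood imputation named in the abstract.
    """
    n_cols = len(rows[0])
    col_means = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if row[j] is not None]
        col_means.append(mean(observed))
    return [
        [col_means[j] if v is None else v for j, v in enumerate(row)]
        for row in rows
    ]


# Hypothetical records: (age, slip percentage, disc height in mm).
records = [
    [62, 18.0, 9.5],
    [71, None, 7.0],
    [58, 22.0, None],
    [66, 20.0, 8.0],
]
deleted = listwise_deletion(records)   # smaller data set, complete cases only
imputed = mean_imputation(records)     # same size as the input, gaps filled
print(len(deleted), len(imputed))
```

Deletion shrinks an already small data set, which is one plausible reason the imputed data set gave higher areas under the curve in the abstract.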


Orthopaedic Proceedings
Vol. 106-B, Issue SUPP_6 | Pages 49 - 49
2 May 2024
Green J Khanduja V Malviya A
Full Access

Femoroacetabular Impingement (FAI) syndrome, characterised by abnormal hip contact causing symptoms and osteoarthritis, is measured using the International Hip Outcome Tool (iHOT). This study uses machine learning to predict patient outcomes post-treatment for FAI, focusing on achieving a minimal clinically important difference (MCID) at 52 weeks. A retrospective analysis of 6133 patients from the NAHR who underwent hip arthroscopic treatment for FAI between November 2013 and March 2022 was conducted. MCID was defined as half a standard deviation (13.61) from the mean change in iHOT score at 12 months. SKLearn Maximum Absolute Scaler and Logistic Regression were applied to predict achieving MCID, using baseline and 6-month follow-up data. The model's performance was evaluated by accuracy, area under the curve, and recall, using pre-operative and up to 6-month postoperative variables. A total of 23.1% (1422) of patients completed both baseline and 1-year follow-up iHOT surveys. The best results were obtained using both pre- and postoperative variables. The machine learning model achieved 88.1% balanced accuracy, 89.6% recall, and 92.3% AUC. Sensitivity was 83.7% and specificity 93.5%. Key variables determining outcomes included MCID achievement at 6 months, baseline iHOT score, 6-month iHOT scores for pain, and difficulty in walking or using stairs. The study confirmed the utility of machine learning in predicting long-term outcomes following arthroscopic treatment for FAI. MCID, based on the iHOT-12 tool, indicates meaningful clinical changes. Machine learning demonstrated high accuracy and recall in distinguishing between patients achieving MCID and those who did not. This approach could help early identification of patients at risk of not meeting the MCID threshold one year after treatment
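A distribution-based MCID of the kind described above can be sketched as follows, assuming the threshold is half the sample standard deviation of the baseline-to-follow-up change scores and that a patient "achieves MCID" when their improvement meets or exceeds it. The function names and scores are hypothetical and do not reproduce the registry data.

```python
from statistics import stdev


def mcid_threshold(baseline, follow_up):
    """Half the sample standard deviation of the change scores."""
    changes = [f - b for b, f in zip(baseline, follow_up)]
    return 0.5 * stdev(changes)


def achieved_mcid(baseline, follow_up, threshold):
    """True where the improvement meets or exceeds the threshold."""
    return [(f - b) >= threshold for b, f in zip(baseline, follow_up)]


# Hypothetical outcome scores on a 0-100 scale.
baseline_scores = [30, 45, 50, 25, 60]
one_year_scores = [60, 50, 80, 30, 58]

threshold = mcid_threshold(baseline_scores, one_year_scores)
flags = achieved_mcid(baseline_scores, one_year_scores, threshold)
print(round(threshold, 2), flags)
```

The resulting boolean flags are the binary target that a classifier such as logistic regression would then be trained to predict.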


Orthopaedic Proceedings
Vol. 101-B, Issue SUPP_9 | Pages 2 - 2
1 Sep 2019
Nijeweme-d'Hollosy WO Poel M van Velsen L Groothuis-Oudshoorn C Hermens H Stegeman P Wolff A Reneman M Soer R
Full Access

Aims. Clinical decision support systems (CDSS) can support clinicians in selecting appropriate treatments for patients. The objective of this study was to examine whether triaging patients with low back pain (LBP) to the most suitable treatment can be improved by using a data-driven approach, with machine learning as the basis of such a CDSS. Methods. A clinical database of the Groningen Spine Center containing patient-reported data from 1546 patients with LBP was used. From this dataset, a training dataset with 354 features was labeled on eight different treatments actually received by these patients. With this dataset, models were trained. A test dataset with 50 cases judged on treatments by 4 experts in LBP triage was used to test these models with data not used to train the models. Prediction accuracy and average area under curve (AUC) were used as performance measures for the models. Results. The AUC values indicated small to medium learning effects, showing that machine learning on patient-reported data, to model decision-making processes on treatments for LBP, may be possible. One of the best performing models was the Bayesian Network (BN) model, which, for example, predicted surgery with accuracy 0.78 (95% CI 0.68–0.87) and AUC 0.70. Conclusion. A benefit of using BNs compared to other supervised machine learning techniques is that it is easy to exploit expert knowledge in BN models, meaning that advice generated by the model can be explained. The next step is to improve the BN accuracy so that it can actually be used in a CDSS. No conflicts of interest. Sources of funding: This work is partly funded by a grant from the Netherlands Organization for Health Research and Development (ZonMw), grant 10-10400-98-009


Orthopaedic Proceedings
Vol. 101-B, Issue SUPP_4 | Pages 110 - 110
1 Apr 2019
Verstraete M Conditt M Goodchild G
Full Access

Introduction & Aims. Patient recovery after total knee arthroplasty remains highly variable. Despite the growing interest in and implementation of patient reported outcome measures (e.g. Knee Society Score, Oxford Knee Score), the recovery process of the individual patient is poorly monitored. Unfortunately, patient reported outcomes represent a complex interaction of multiple physiological and psychological aspects; they are also limited by the discrete time intervals at which they are administered. The use of wearable sensors presents a potential alternative by continuously monitoring a patient's physical activity. These sensors, however, present their own challenges. This paper deals with the interpretation of the high frequency time signals acquired when using accelerometer-based wearable sensors. Method. During a preliminary validation, five healthy subjects were equipped with two wireless inertial measurement units (IMUs). Using adhesive tape, these IMU sensors were attached to the thigh and shank respectively. All subjects performed a series of supervised activities of daily living (ADL) in their everyday environment (1: walking, 2: stair ascent, 3: stair descent, 4: sitting, 5: laying, 6: standing). The supervisor timestamped the performed activities, such that the raw IMU signals could be uniquely linked to the performed activities. Subsequently, the acquired signals were reduced in Python. Each five second time window was characterized by the minimum, maximum and mean acceleration per sensor node. In addition, the frequency response was analyzed per sensor node as well as the correlation between both sensor nodes. Various machine learning approaches were subsequently implemented to predict the performed activities. Thereby, 60% of the acquired signals were used to train the mathematical models. These models were then used to predict the activity associated with the remaining 40% of the experimentally obtained data. Results. 
An overview of the obtained prediction accuracy per model stratified by ADL is provided in Table 1. The Nearest Neighbor and Random Forest algorithms performed worse compared to the Support Vector Machine and Decision Tree approaches. Even for the latter, differentiating between walking and stair ascent/descent remains challenging as well as differentiating between sitting, standing and laying. The prediction accuracies are however exceeding 90% for all activities when using the Support Vector Machine approach. This is further illustrated in Figure 1, indicating the actual versus predicted activity for the validation set. Conclusions. In conclusion, this paper presents an evaluation of different machine learning algorithms for the classification of activities of daily living from accelerometer-based wearable sensors. This facilitates evaluating a patient's ability to walk, climb or descend stairs, stand, lay or sit on a daily basis, understanding how active the patient is overall and which activities are routinely performed following arthroplasty surgery. Currently, effort is undertaken to understand how participation in these activities progresses with recovery following total knee arthroplasty
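The signal-reduction step in the Method (each five second window summarised by minimum, maximum and mean acceleration) might look like the sketch below. The sampling rate, the single-axis signal and the choice to discard a trailing partial window are assumptions made for illustration; the abstract does not state them, and the frequency-response and inter-sensor correlation features are omitted.

```python
def window_features(samples, sample_rate_hz, window_seconds=5):
    """Summarise a 1-D acceleration signal as (min, max, mean) per full window.

    A trailing partial window is discarded.
    """
    size = int(sample_rate_hz * window_seconds)
    if size <= 0:
        raise ValueError("window must contain at least one sample")
    features = []
    for start in range(0, len(samples) - size + 1, size):
        window = samples[start:start + size]
        features.append((min(window), max(window), sum(window) / size))
    return features


# Hypothetical signal sampled at 2 Hz: 23 samples = two full 5 s windows
# plus 3 leftover samples that are dropped.
signal = [float(i % 4) for i in range(23)]
feats = window_features(signal, sample_rate_hz=2)
print(feats)
```

Each tuple would become one row of the feature table, labelled with the activity the supervisor timestamped for that window.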


Orthopaedic Proceedings
Vol. 104-B, Issue SUPP_13 | Pages 42 - 42
1 Dec 2022
Abbas A Toor J Lex J Finkelstein J Larouche J Whyne C Lewis S
Full Access

Single level discectomy (SLD) is one of the most commonly performed spinal surgery procedures. Two key drivers of their cost-of-care are duration of surgery (DOS) and postoperative length of stay (LOS). Therefore, the ability to preoperatively predict SLD DOS and LOS has substantial implications for both hospital and healthcare system finances, scheduling and resource allocation. As such, the goal of this study was to predict DOS and LOS for SLD using machine learning models (MLMs) constructed on preoperative factors using a large North American database. The American College of Surgeons (ACS) National Surgical and Quality Improvement (NSQIP) database was queried for SLD procedures from 2014-2019. The dataset was split in a 60/20/20 ratio of training/validation/testing based on year. Various MLMs (traditional regression models, tree-based models, and multilayer perceptron neural networks) were used and evaluated according to 1) mean squared error (MSE), 2) buffer accuracy (the number of times the predicted target was within a predesignated buffer), and 3) classification accuracy (the number of times the correct class was predicted by the models). To ensure real world applicability, the results of the models were compared to a mean regressor model. A total of 11,525 patients were included in this study. During validation, the neural network model (NNM) had the best MSEs for DOS (0.99) and LOS (0.67). During testing, the NNM had the best MSEs for DOS (0.89) and LOS (0.65). The NNM yielded the best 30-minute buffer accuracy for DOS (70.9%) and ≤120 min, >120 min classification accuracy (86.8%). The NNM had the best 1-day buffer accuracy for LOS (84.5%) and ≤2 days, >2 days classification accuracy (94.6%). All models were more accurate than the mean regressors for both DOS and LOS predictions. We successfully demonstrated that MLMs can be used to accurately predict the DOS and LOS of SLD based on preoperative factors. 
This big-data application has significant practical implications with respect to surgical scheduling and inpatient bedflow, as well as major implications for both private and publicly funded healthcare systems. Incorporating this artificial intelligence technique in real-time hospital operations would be enhanced by including institution-specific operational factors such as surgical team and operating room workflow
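The three evaluation measures named in this abstract, together with the mean regressor used as the comparison baseline, can be sketched as follows. The durations are hypothetical and the helper names are illustrative; only the definitions (MSE, share of predictions within a buffer, share of cases on the correct side of a cutoff) follow the abstract.

```python
def mean_squared_error(actual, predicted):
    """Average squared difference between true and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def buffer_accuracy(actual, predicted, buffer):
    """Share of predictions within +/- buffer of the true value."""
    hits = sum(1 for a, p in zip(actual, predicted) if abs(a - p) <= buffer)
    return hits / len(actual)


def classification_accuracy(actual, predicted, cutoff):
    """Share of cases where truth and prediction fall on the same side of cutoff."""
    hits = sum(
        1 for a, p in zip(actual, predicted) if (a <= cutoff) == (p <= cutoff)
    )
    return hits / len(actual)


# Hypothetical durations of surgery in minutes.
train_dos = [90, 110, 100, 100]
test_dos = [80, 100, 130, 150]
model_pred = [85, 95, 140, 115]

# Mean regressor: always predicts the training-set mean (100 here).
baseline_pred = [sum(train_dos) / len(train_dos)] * len(test_dos)

print(mean_squared_error(test_dos, model_pred),
      mean_squared_error(test_dos, baseline_pred))
print(buffer_accuracy(test_dos, model_pred, buffer=30))
print(classification_accuracy(test_dos, model_pred, cutoff=120),
      classification_accuracy(test_dos, baseline_pred, cutoff=120))
```

A model only adds value for scheduling if it beats the mean regressor on these measures, which is the comparison the abstract reports.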


Orthopaedic Proceedings
Vol. 106-B, Issue SUPP_2 | Pages 19 - 19
2 Jan 2024
Castagno S Birch M van der Schaar M McCaskie A
Full Access

Precision health aims to develop personalised and proactive strategies for predicting, preventing, and treating complex diseases such as osteoarthritis (OA). Due to OA heterogeneity, which makes developing effective treatments challenging, identifying patients at risk for accelerated disease progression is essential for efficient clinical trial design and new treatment target discovery and development. This study aimed to create a reliable and interpretable precision health tool that predicts rapid knee OA progression over a 2-year period from baseline patient characteristics using an advanced automated machine learning (autoML) framework, “Autoprognosis 2.0”. All available 2-year follow-up periods of 600 patients from the FNIH OA Biomarker Consortium were analysed using “Autoprognosis 2.0” in two separate approaches, with distinct definitions of clinical outcomes: multi-class predictions (categorising disease progression into pain and/or radiographic progression) and binary predictions. Models were developed using a training set of 1352 instances and all available variables (including clinical, X-ray, MRI, and biochemical features), and validated through both stratified 10-fold cross-validation and hold-out validation on a testing set of 339 instances. Model performance was assessed using multiple evaluation metrics. Interpretability analyses were carried out to identify important predictors of progression. Our final models yielded higher accuracy scores for multi-class predictions (AUC-ROC: 0.858, 95% CI: 0.856-0.860) compared to binary predictions (AUC-ROC: 0.717, 95% CI: 0.712-0.722). Important predictors of rapid disease progression included WOMAC scores and MRI features. Additionally, accurate ML models were developed for predicting OA progression in a subgroup of patients aged 65 or younger. This study presents a reliable and interpretable precision health tool for predicting rapid knee OA progression. 
Our models provide accurate predictions and, importantly, allow specific predictors of rapid disease progression to be identified. Furthermore, the transparency and explainability of our methods may facilitate their acceptance by clinicians and patients, enabling effective translation to clinical practice


Orthopaedic Proceedings
Vol. 106-B, Issue SUPP_1 | Pages 78 - 78
2 Jan 2024
Ponniah H Edwards T Lex J Davidson R Al-Zubaidy M Afzal I Field R Liddle A Cobb J Logishetty K
Full Access

Anterior approach total hip arthroplasty (AA-THA) has a steep learning curve, with higher complication rates in initial cases. Proper surgical case selection during the learning curve can reduce early risk. This study aims to identify patient and radiographic factors associated with AA-THA difficulty using Machine Learning (ML). Consecutive primary AA-THA patients from two centres, operated by two expert surgeons, were enrolled (excluding patients with prior hip surgery and first 100 cases per surgeon). K-means prototype clustering, an unsupervised ML algorithm, was used with two variables (operative duration and surgical complications within 6 weeks) to cluster operations into difficult or standard groups. Radiographic measurements (neck shaft angle, offset, LCEA, inter-teardrop distance, Tonnis grade) were measured by two independent observers. These factors, alongside patient factors (BMI, age, sex, laterality) were employed in a multivariate logistic regression analysis and used for k-means clustering. Significant continuous variables were investigated for predictive accuracy using Receiver Operator Characteristics (ROC). Out of 328 THAs analyzed, 130 (40%) were classified as difficult and 198 (60%) as standard. The difficult group had a mean operative time of 106 minutes (range 99–116) with 2 complications, while the standard group had a mean operative time of 77 minutes (range 69–86) with 0 complications. Decreasing inter-teardrop distance (odds ratio [OR] 0.97, 95% confidence interval [CI] 0.95–0.99, p = 0.03) and right-sided operations (OR 1.73, 95% CI 1.10–2.72, p = 0.02) were associated with operative difficulty. However, ROC analysis showed poor predictive accuracy for these factors alone, with area under the curve of 0.56. Inter-observer reliability was reported as excellent (ICC >0.7). Right-sided hips (for right-hand dominant surgeons) and decreasing inter-teardrop distance were associated with case difficulty in AA-THA. 
These data could guide case selection during the learning phase. A larger dataset with more complications may reveal further factors


Orthopaedic Proceedings
Vol. 105-B, Issue SUPP_16 | Pages 23 - 23
17 Nov 2023
Castagno S Birch M van der Schaar M McCaskie A
Full Access

Abstract. Introduction. Precision health aims to develop personalised and proactive strategies for predicting, preventing, and treating complex diseases such as osteoarthritis (OA), a degenerative joint disease affecting over 300 million people worldwide. Due to OA heterogeneity, which makes developing effective treatments challenging, identifying patients at risk for accelerated disease progression is essential for efficient clinical trial design and new treatment target discovery and development. Objectives. This study aims to create a trustworthy and interpretable precision health tool that predicts rapid knee OA progression based on baseline patient characteristics using an advanced automated machine learning (autoML) framework, “Autoprognosis 2.0”. Methods. All available 2-year follow-up periods of 600 patients from the FNIH OA Biomarker Consortium were analysed using “Autoprognosis 2.0” in two separate approaches, with distinct definitions of clinical outcomes: multi-class predictions (categorising patients into non-progressors, pain-only progressors, radiographic-only progressors, and both pain and radiographic progressors) and binary predictions (categorising patients into non-progressors and progressors). Models were developed using a training set of 1352 instances and all available variables (including clinical, X-ray, MRI, and biochemical features), and validated through both stratified 10-fold cross-validation and hold-out validation on a testing set of 339 instances. Model performance was assessed using multiple evaluation metrics, such as AUC-ROC, AUC-PRC, F1-score, precision, and recall. Additionally, interpretability analyses were carried out to identify important predictors of rapid disease progression. Results. 
Our final models yielded high accuracy scores for both multi-class predictions (AUC-ROC: 0.858, 95% CI: 0.856–0.860; AUC-PRC: 0.675, 95% CI: 0.671–0.679; F1-score: 0.560, 95% CI: 0.554–0.566) and binary predictions (AUC-ROC: 0.717, 95% CI: 0.712–0.722; AUC-PRC: 0.620, 95% CI: 0.616–0.624; F1-score: 0.676, 95% CI: 0.673–0.679). Important predictors of rapid disease progression included the Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) scores and MRI features. Our models were further successfully validated using a hold-out dataset, which was previously omitted from model development and training (AUC-ROC: 0.877 for multi-class predictions; AUC-ROC: 0.746 for binary predictions). Additionally, accurate ML models were developed for predicting OA progression in a subgroup of patients aged 65 or younger (AUC-ROC: 0.862, 95% CI: 0.861–0.863 for multi-class predictions; AUC-ROC: 0.736, 95% CI: 0.734–0.738 for binary predictions). Conclusions. This study presents a reliable and interpretable precision health tool for predicting rapid knee OA progression using “Autoprognosis 2.0”. Our models provide accurate predictions and offer insights into important predictors of rapid disease progression. Furthermore, the transparency and interpretability of our methods may facilitate their acceptance by clinicians and patients, enabling effective utilisation in clinical practice. Future work should focus on refining these models by increasing the sample size, integrating additional features, and using independent datasets for external validation. Declaration of Interest. I declare that there is no conflict of interest that could be perceived as prejudicing the impartiality of the research project


Orthopaedic Proceedings
Vol. 102-B, Issue SUPP_1 | Pages 122 - 122
1 Feb 2020
Flood P Jensen A Banks S
Full Access

Disorders of human joints manifest during dynamic movement, yet no objective tools are widely available for clinicians to assess or diagnose abnormal joint motion during functional activity. Machine learning tools have supported advances in many applications for image interpretation and understanding and have the potential to enable clinically and economically practical methods for objective assessment of human joint mechanics. We performed a study using convolutional neural networks to autonomously segment radiographic images of knee replacements and to determine the potential for autonomous measurement of knee kinematics. The autonomously segmented images provided superior kinematic measurements for both femur and tibia implant components. We believe this is an encouraging first step towards realization of a completely autonomous capability to accurately quantify dynamic joint motion using a clinically and economically practical methodology


Orthopaedic Proceedings
Vol. 105-B, Issue SUPP_7 | Pages 71 - 71
4 Apr 2023
Arrowsmith C Burns D Mak T Hardisty M Whyne C
Full Access

Access to health care, including physiotherapy, is increasingly occurring through virtual formats. At-home adherence to physical therapy programs is often poor and few tools exist to objectively measure low back physiotherapy exercise participation without the direct supervision of a medical professional. The aim of this study was to develop and evaluate the potential for performing automatic, unsupervised video-based monitoring of at-home low back physiotherapy exercises using a single mobile phone camera. 24 healthy adult subjects performed seven exercises based on the McKenzie low back physiotherapy program while being filmed with two smartphone cameras. Joint locations were automatically extracted using an open-source pose estimation framework. Engineered features were extracted from the joint location time series and used to train a support vector machine classifier (SVC). A convolutional neural network (CNN) was trained directly on the joint location time series data to classify exercises based on a recording from a single camera. The models were evaluated using a 5-fold cross validation approach, stratified by subject, with the class-balanced accuracy used as the performance metric. Optimal performance was achieved when using a total of 12 pose estimation landmarks from the upper and lower body, with the SVC model achieving a classification accuracy of 96±4% and the CNN model an accuracy of 97±2%. This study demonstrates the feasibility of using a smartphone camera and a supervised machine learning model to effectively assess at-home low back physiotherapy adherence. This approach could provide a low-cost, scalable method for tracking adherence to physical therapy exercise programs in a variety of settings
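The evaluation scheme in this abstract can be sketched under one reading of "stratified by subject": all recordings from a given subject are kept in the same fold, so a model is never tested on a person it was trained on. That interpretation, the round-robin fold assignment and the helper names are assumptions. Class-balanced accuracy is taken here as the mean of per-class recall.

```python
def subject_folds(subject_ids, n_folds=5):
    """Assign each subject to one fold so no subject is split across train and test."""
    unique = sorted(set(subject_ids))
    fold_of = {s: i % n_folds for i, s in enumerate(unique)}
    folds = []
    for k in range(n_folds):
        test_idx = [i for i, s in enumerate(subject_ids) if fold_of[s] == k]
        train_idx = [i for i, s in enumerate(subject_ids) if fold_of[s] != k]
        folds.append((train_idx, test_idx))
    return folds


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so rare classes weigh as much as common ones."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, y in enumerate(y_true) if y == c]
        recalls.append(sum(1 for i in idx if y_pred[i] == c) / len(idx))
    return sum(recalls) / len(recalls)


# Hypothetical: 5 subjects with 2 recordings each.
subjects = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
folds = subject_folds(subjects, n_folds=5)
for train_idx, test_idx in folds:
    print(sorted({subjects[i] for i in test_idx}), len(train_idx))

# A classifier that always predicts the majority class scores only 0.5.
print(balanced_accuracy(["a", "a", "a", "b"], ["a", "a", "a", "a"]))
```

Grouping by subject guards against optimistic accuracy estimates that arise when a classifier memorises how one individual moves.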


Full Access

Background. The advent of value-based conscientiousness and rapid-recovery discharge pathways presents surgeons, hospitals, and payers with the challenge of providing the same total hip arthroplasty episode of care in the safest and most economic fashion for the same fee, despite patient differences. Various predictive analytic techniques have been applied to medical risk models, such as sepsis risk scores, but none have been applied or validated to the elective primary total hip arthroplasty (THA) setting for key payment-based metrics. The objective of this study was to develop and validate a predictive machine learning model using preoperative patient demographics for length of stay (LOS) after primary THA as the first step in identifying a patient-specific payment model (PSPM). Methods. Using 229,945 patients undergoing primary THA for osteoarthritis from an administrative database between 2009–16, we created a naïve Bayesian model to forecast LOS after primary THA using a 3:2 split in which 60% of the available patient data “built” the algorithm and the remaining 40% of patients were used for “testing.” This process was iterated five times for algorithm refinement, and model performance was determined using the area under the receiver operating characteristic curve (AUC), percent accuracy, and positive predictive value. LOS was either grouped as 1–5 days or greater than 5 days. Results. The machine learning model algorithm required age, race, gender, and two comorbidity scores (“risk of illness” and “risk of morbidity”) to demonstrate excellent validity, reliability, and responsiveness with an AUC of 0.87 after five iterations. Hospital stays of greater than 5 days for THA were most associated with increased risk of illness and risk of comorbidity scores during admission compared to 1–5 days of stay. Conclusions. 
Our machine learning model derived from administrative big data demonstrated excellent validity, reliability, and responsiveness after primary THA while accurately predicting LOS and identifying two comorbidity scores as key value-based metrics. Predictive data has the potential to engender a risk-based PSPM prior to primary THA and other elective orthopaedic procedures
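The 3:2 build/test split repeated five times, as described in the Methods, could be organised as in the sketch below. The abstract does not say how the five iterations differed, so treating them as five independent random splits with different seeds is an assumption, as are the function name and the toy cohort size.

```python
import random


def split_60_40(n_items, seed):
    """Return (build_idx, test_idx) for one random 3:2 split of n_items records."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    cut = (n_items * 3) // 5              # 60% of the records build the model
    return indices[:cut], indices[cut:]


# Hypothetical cohort of 10 records, split five times with different seeds.
splits = [split_60_40(10, seed) for seed in range(5)]
for build_idx, test_idx in splits:
    print(len(build_idx), len(test_idx))
```

A model would be fitted on each build set and scored on the matching test set, with AUC, accuracy and positive predictive value summarised across the five runs.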


Orthopaedic Proceedings
Vol. 106-B, Issue SUPP_6 | Pages 59 - 59
2 May 2024
Adla SR Ameer A Silva MD Unnithan A
Full Access

Arthroplasties are widely performed to improve mobility and quality of life for symptomatic knee/hip osteoarthritis patients. With increasing rates of total joint replacements in the United Kingdom, predicting length of stay (LOS) is vital for hospitals to control costs, manage resources, and prevent postoperative complications. A longer LOS has been shown to negatively affect the quality of care, outcomes and patient satisfaction. Thus, predicting LOS enables us to make full use of medical resources. Clinical characteristics were retrospectively collected from 1,303 patients who received TKA and THR. A total of 21 variables were included to develop predictive models for LOS using multiple machine learning (ML) algorithms, including Random Forest Classifier (RFC), K-Nearest Neighbour (KNN), Extreme Gradient Boost (XgBoost), and Naïve Bayes (NB). These models were evaluated by the receiver operating characteristic (ROC) curve for predictive performance. A feature selection approach was used to identify optimal predictive factors. Based on the ROC of the training result, the XgBoost algorithm was selected to be applied to the test set. The areas under the ROC curve (AUCs) of the 4 models ranged from 0.730 to 0.966, where higher AUC values generally indicate better predictive performance. All the ML-based models performed better than conventional statistical methods in ROC curves. The XgBoost algorithm with 21 variables was identified as the best predictive model. The feature selection indicated the top six predictors: Age, Operation Duration, Primary Procedure, BMI, creatinine and Month of Surgery. By analysing clinical characteristics, it is feasible to develop ML-based models for the preoperative prediction of LOS for patients who received TKA and THR, and the XgBoost algorithm performed the best in terms of predictive accuracy. As this model was originally crafted at Ashford and St. Peters Hospital, we have naturally named it THE ASHFORD OUTCOME


Orthopaedic Proceedings
Vol. 102-B, Issue SUPP_1 | Pages 27 - 27
1 Feb 2020
Bloomfield R Williams H Broberg J Lanting B Teeter M

Objective. Wearable sensors have enabled objective functional data collection from patients before total knee replacement (TKR) and at clinical follow-ups post-surgery, whereas traditional evaluation has relied solely on self-reported subjective measures. The timed-up-and-go (TUG) test has been used to evaluate function but is commonly measured using only total completion time, which does not assess joint function or test completion strategy. The current work employs machine learning techniques to distinguish patient groups based on derived functional metrics from the TUG test and expose clinically important functional parameters that are predictive of patient recovery. Methods. Patients scheduled for TKR (n=70) were recruited and instrumented with a wearable sensor system while performing three TUG test trials. Remaining study patients (n=68) also completed three TUG trials at their 2, 6, and 13-week follow-ups. Many patients (n=36) have also participated up to their 26-week appointment. Custom-developed software was used to segment recorded tests into sub-activities and extract 54 functional metrics to evaluate operative and non-operative knee function. All preoperative TUG samples and their standardized metrics were clustered into two unlabelled groups using the k-means algorithm. Both groups were tracked forward to see how their early functional parameters translated to functional improvement at their three-month assessment. Test total completion time was used to estimate overall functional improvement and to relate findings to existing literature. Patients that completed their 26-week tests were tracked further to their most recent timepoint. Results. Preoperative clustering separated two groups with different test completion times (n=46 vs. n=22 with mean times of 13s vs. 22s). Of the faster preoperative group, 63% of patients maintained their time, 26% improved, and 11% worsened, whereas of the slower preoperative group, 27% maintained, 64% improved, and 9% worsened. 
The high improvement group improved their times by 4.9s (p<0.01) between preoperative and 13-week visits whereas the other group had no significant change. Test times were different between both groups preoperatively (p<0.001) and at 6 (p=0.01) and 13 (p=0.03) weeks but not at 26 weeks (p=0.67). The high improvement group reached an overall improvement of 9s (p<0.01) at 26 weeks whereas the low improvement group still showed no improvement greater than the TUG minimal detectable change of 2.2s (1.8s, p<0.01)[1]. Test sub-activity times for both groups at each timepoint can be seen in Figure 1. Conclusions. This work has demonstrated that machine learning has the potential to find patterns in preoperative functional parameters that can predict functional improvement after surgery. While useful for assigning labels to the distinguished clusters, test completion time was not among the top distinguishable metrics between groups at three months which highlights the necessity for these more descriptive performance metrics when analyzing patient recovery. It is expected that these early predictions will be used to realistically adjust patient expectations or highlight opportunities for physiotherapeutic intervention to improve future outcomes. For any figures or tables, please contact the authors directly
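The clustering step described in the Methods (standardize the metrics, then split the samples into two unlabelled groups with k-means) can be sketched as follows. This is only an illustration under stated assumptions: the two toy metrics, the values, the helper names `standardize` and `kmeans`, and the fixed starting centroids are hypothetical and are not taken from the study, which used 54 metrics.

```python
import math


def standardize(rows):
    """Z-score each column so that every metric contributes on the same scale."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
           for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(r, means, sds)] for r in rows]


def kmeans(points, init_idx, iters=100):
    """Plain k-means; starting centroids are the points at init_idx
    (a deterministic choice made only to keep this sketch reproducible)."""
    centroids = [list(points[i]) for i in init_idx]
    k = len(centroids)
    assign = []
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        assign = [
            min(range(k), key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new = []
        for j in range(k):
            members = [p for p, a in zip(points, assign) if a == j]
            new.append([sum(c) / len(members) for c in zip(*members)] if members else centroids[j])
        if new == centroids:
            break
        centroids = new
    return assign


# Hypothetical samples: [TUG completion time in seconds, a second functional metric].
samples = [[12.0, 0.9], [13.5, 1.0], [14.0, 0.8], [21.0, 0.5], [23.0, 0.4], [22.5, 0.6]]
z = standardize(samples)
groups = kmeans(z, init_idx=(0, len(z) - 1))
```

In practice the starting centroids would usually be chosen at random over several restarts rather than fixed as here.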


Bone & Joint Research
Vol. 7, Issue 3 | Pages 223 - 225
1 Mar 2018
Jones LD Golan D Hanna SA Ramachandran M


Orthopaedic Proceedings
Vol. 105-B, Issue SUPP_8 | Pages 102 - 102
11 Apr 2023
Mosseri J Lex J Abbas A Toor J Ravi B Whyne C Khalil E

Total knee and hip arthroplasty (TKA and THA) are the most commonly performed surgical procedures, the costs of which constitute a significant healthcare burden. Improving access to care for THA/TKA requires better efficiency. It is hypothesized that this may be possible through a two-stage approach that utilizes prediction of surgical time to enable optimization of operating room (OR) schedules.

Data from 499,432 elective unilateral arthroplasty procedures, including 302,490 TKAs and 196,942 THAs, performed from 2014 to 2019 were extracted from the American College of Surgeons (ACS) National Surgical Quality Improvement Program (NSQIP) database. A deep multilayer perceptron model was trained to predict duration of surgery (DOS) based on preoperative clinical and biochemical patient factors. A two-stage approach, utilizing predicted DOS from a held-out “test” dataset, was used to inform the daily OR schedule. The objective function of the optimization was the total OR utilization, with a penalty for overtime. The scheduling problem and constraints were simulated based on a high-volume elective arthroplasty centre in Canada. This approach was compared to current patient scheduling based on mean procedure DOS. Approaches were compared by performing 1,000 simulated OR schedules.

The predict-then-optimize approach achieved an 18% increase in OR utilization over the mean regressor. The two-stage approach reduced overtime by 25 minutes per OR day; however, it created a 7-minute increase in underutilization. A better objective value was seen in 85.1% of the simulations.

With deep learning prediction and mathematical optimization of patient scheduling, it is possible to improve overall OR utilization compared to typical scheduling practices. Maximizing utilization of existing healthcare resources can, in limited resource environments, improve patients' access to arthritis care by increasing patient throughput and reducing surgical wait times and, in the immediate future, help clear the backlog associated with the COVID-19 pandemic.
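The two-stage idea in this abstract (first predict each case's duration, then build a schedule that trades utilization against overtime) can be illustrated with a small sketch. A simple greedy heuristic stands in here for the mathematical optimization, which the abstract does not specify; the function names, the penalty weight, the shift length, and the toy durations are all assumptions.

```python
def greedy_schedule(predicted, n_rooms):
    """Stage 2 stand-in: place cases, longest predicted duration first,
    into whichever room currently has the lightest predicted load."""
    rooms = [[] for _ in range(n_rooms)]
    load = [0.0] * n_rooms
    for case in sorted(range(len(predicted)), key=lambda i: -predicted[i]):
        r = min(range(n_rooms), key=lambda j: load[j])
        rooms[r].append(case)
        load[r] += predicted[case]
    return rooms


def objective(rooms, actual, shift_min, overtime_penalty=2.0):
    """Utilized in-shift minutes minus penalized overtime minutes, summed
    over rooms and evaluated on the durations that actually occurred."""
    total = 0.0
    for cases in rooms:
        used = sum(actual[c] for c in cases)
        total += min(used, shift_min) - overtime_penalty * max(0.0, used - shift_min)
    return total


# Hypothetical durations in minutes for four cases.
predicted = [120, 90, 100, 110]   # stage 1 output (e.g. from a regression model)
actual = [130, 85, 95, 115]       # what happened on the day
rooms = greedy_schedule(predicted, n_rooms=2)
value = objective(rooms, actual, shift_min=210)
```

Repeating this evaluation over many simulated days, once with model predictions and once with mean durations, mirrors the kind of comparison the abstract reports.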


Orthopaedic Proceedings
Vol. 98-B, Issue SUPP_8 | Pages 11 - 11
1 May 2016
Chanda S Gupta S Pratihar D

The success of a cementless Total Hip Arthroplasty (THA) depends not only on initial micromotion, but also on long-term failure mechanisms, e.g., implant-bone interface stresses and stress shielding. Any preclinical investigation aimed at designing a femoral implant needs to account for the temporal evolution of the interfacial condition while dealing with these failure mechanisms. The goal of the present multi-criteria optimization study was to search for optimum implant geometry by implementing a novel machine learning framework comprising a neural network (NN), genetic algorithm (GA) and finite element (FE) analysis. The optimum implant model was subsequently evaluated based on evolutionary interface conditions. The optimization scheme of our earlier study [1] has been used here with the additional inclusion of an NN to predict the initial fixation of an implant model. The entire CAD-based parameterization technique for the implant was described previously [1]. Three objective functions, the first two based on proximal resorbed Bone Mass Fraction (BMF) [1] and implant-bone interface failure index [1], respectively, and the other based on initial micromotion, were formulated to model the multi-criteria optimization problem. The first two objective functions, i.e., objectives f1 and f2, were calculated from the FE analysis (Ansys), whereas the third objective (f3) involved an NN developed for the purpose of predicting the post-operative micromotion based on the stem design parameters. A bonded interfacial condition was used to account for the effects of stress shielding and interface stresses, whereas a set of contact models was used to develop the NN for faster prediction of post-operative micromotion. A multi-criteria GA was executed up to a desired number of generations for optimization (Fig. 1). 
The final trade-off model was further evaluated using a combined remodelling and bone ingrowth simulation based on an evolutionary interface condition [2], and subsequently compared with a generic TriLock implant. The non-dominated solutions obtained from the GA execution were interpolated to determine the 3D nature of the Pareto-optimal surface (Fig. 2). The effects of all failure mechanisms were found to be minimized in these optimized solutions (Fig. 2). However, the best-compromise solution, i.e., the trade-off stem geometry (TSG), was chosen for further assessment based on the evolutionary interfacial condition. The simulation-based combined remodelling and bone ingrowth study predicted a faster ingrowth for the TSG as compared to the generic design. The surface area with post-operative (i.e., iteration 1) ingrowth was found to be ∼50% for the TSG, while that for the TriLock model was ∼38% (Fig. 3). However, both designs predicted similar long-term ingrowth (∼89% surface area). The long-term proximal bone resorption (up to the lesser trochanter) was found to be ∼30% for the TSG, as compared to ∼37% for the TriLock model. The TSG was found to be bone-preserving, with a prominent frontal wedge and rectangular proximal section for better rotational stability; features present in some recent designs. The optimization scheme, therefore, appears to be a quick and robust preclinical assessment tool for cementless femoral implant design. To view tables/figures, please contact authors directly
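The notion of non-dominated solutions used in this abstract can be shown in a few lines. The sketch below assumes that all three objectives (f1, f2, f3) are to be minimized and uses made-up objective values; it illustrates only the Pareto filter, not the genetic algorithm, the neural network, or the finite element analysis.

```python
def dominates(a, b):
    """a dominates b if it is no worse on every objective and strictly
    better on at least one (all objectives are minimized here)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(designs):
    """Keep only designs that no other design dominates."""
    return [
        d for i, d in enumerate(designs)
        if not any(dominates(o, d) for j, o in enumerate(designs) if j != i)
    ]


# Hypothetical (f1, f2, f3) values for four candidate stem designs.
designs = [
    (0.30, 0.5, 40.0),
    (0.35, 0.4, 35.0),
    (0.40, 0.6, 45.0),  # worse than the first design on every objective
    (0.30, 0.5, 50.0),  # ties the first design on f1, f2 but is worse on f3
]
front = pareto_front(designs)
```

A trade-off design is then picked from `front` by some additional preference, since no member of the front beats another on all objectives.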


Orthopaedic Proceedings
Vol. 105-B, Issue SUPP_7 | Pages 134 - 134
4 Apr 2023
Arrowsmith C Alfakir A Burns D Razmjou H Hardisty M Whyne C

Physiotherapy is a critical element in successful conservative management of low back pain (LBP). The aim of this study was to develop and evaluate a system with wearable inertial sensors to objectively detect sitting postures and performance of unsupervised exercises containing movement in multiple planes (flexion, extension, rotation).

A set of 8 inertial sensors were placed on 19 healthy adult subjects. Data was acquired as they performed 7 McKenzie low-back exercises and 3 sitting posture positions. This data was used to train two models (Random Forest (RF) and XGBoost (XGB)) using engineered time series features. In addition, a convolutional neural network (CNN) was trained directly on the time series data. A feature importance analysis was performed to identify sensor locations and channels that contributed most to the models. Finally, a subset of sensor locations and channels was included in a hyperparameter grid search to identify the optimal sensor configuration and the best performing algorithm(s) for exercise classification. Models were evaluated using F1-score in a 10-fold cross validation approach.

The optimal hardware configuration was identified as a 3-sensor setup using lower back, left thigh, and right ankle sensors with accelerometer, gyroscope, and magnetometer channels. The XGB model achieved the highest exercise (F1=0.94±0.03) and posture (F1=0.90±0.11) classification scores. The CNN achieved similar results with the same sensor locations, using only the accelerometer and gyroscope channels for exercise classification (F1=0.94±0.02) and the accelerometer channel alone for posture classification (F1=0.91±0.03).

This study demonstrates the potential of a 3-sensor lower body wearable solution (e.g. smart pants) that can identify proper sitting postures and exercises in multiple planes, suitable for low back pain. This technology has the potential to improve the effectiveness of LBP rehabilitation by facilitating quantitative feedback, early problem diagnosis, and possible remote monitoring.
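As an illustration of the evaluation described above, the sketch below computes a macro-averaged F1-score and builds index splits for k-fold cross-validation. The abstract does not state how the F1-score was averaged across classes or how folds were formed, so macro averaging, the interleaved fold assignment, and the toy labels are assumptions.

```python
def f1_for_class(y_true, y_pred, positive):
    """F1 = harmonic mean of precision and recall for one class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1-scores."""
    classes = sorted(set(y_true))
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)


def kfold_indices(n, k):
    """Return k (train_indices, test_indices) pairs; each sample is tested once."""
    folds = [list(range(i, n, k)) for i in range(k)]
    return [([j for j in range(n) if j not in fold], fold) for fold in folds]


# Hypothetical exercise labels and predictions for six recordings.
y_true = ["flexion", "flexion", "extension", "extension", "rotation", "rotation"]
y_pred = ["flexion", "extension", "extension", "extension", "rotation", "rotation"]
score = macro_f1(y_true, y_pred)
splits = kfold_indices(n=20, k=10)
```

With repeated recordings per subject, folds are often formed by subject rather than by sample so that one person's data never appears in both training and test sets.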


Orthopaedic Proceedings
Vol. 105-B, Issue SUPP_7 | Pages 9 - 9
4 Apr 2023
Fridberg M Annadatha S Hua Q Jensen T Liu J Kold S Rahbek O Shen M Ghaffari A

Infrared thermography has been suggested as a way to provide quantitative information for detecting early signs of infection. Our vision is to develop a thermographic pin site infection surveillance tool for patients at home. As a preliminary step towards this goal, the aim of this study was to automate the process of locating pins and detecting pin sites in thermal images efficiently, exactly, and reliably, in order to extract pin site temperatures.

A total of 1,708 pin sites were investigated with thermography, and the images were augmented by nine different methods to a total of 10,409 images. The dataset was divided into a training set (n=8,325), a validation set (n=1,040), and a test set (n=1,044) of images. The Pin Detection Model (PDM) was developed as follows: a You Only Look Once (YOLOv5)-based object detection model with a Complete Detection Intersection over Union (CDIoU) was pre-trained and fine-tuned through transfer learning. The basic performance of the YOLOv5 with CDIoU model was compared with other conventional models (FCOS and YOLOv4) for deep and transfer learning to improve performance and precision. Maximum Temperature Extraction (MTE) based on the Region of Interest (ROI) for all pin sites was generated by the model. Inference of MTE using PDM with infected and un-infected datasets was investigated.

An automatic tool that can identify and annotate pin sites on conventional images using bounding boxes was established. The bounding box was transferred to the infrared image. The PDM algorithm was built on YOLOv5 with CDIoU and has a precision of 0.976. The model performs pin site detection in 1.8 milliseconds. The thermal data from the ROI at the pin site were automatically extracted.

These results enable automatic pin site annotation on thermography. The model tracks the correlation between temperature and infection from the detected pin sites and demonstrates that it is a promising tool for automatic pin site detection and maximum temperature extraction for further infection studies. Our work on automatic pin site annotation on thermography paves the way for future research on infection assessment using thermography.
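Two of the building blocks mentioned in this abstract, the overlap between bounding boxes and the maximum temperature inside a region of interest, can be sketched as follows. The code shows the plain Intersection over Union, not the CDIoU variant the authors used, and the box coordinates and temperature grid are hypothetical.

```python
def iou(a, b):
    """Plain Intersection over Union for boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0


def max_temperature(thermal, box):
    """Highest value of the thermal image inside the box
    (columns x1..x2-1, rows y1..y2-1)."""
    x1, y1, x2, y2 = box
    return max(thermal[r][c] for r in range(y1, y2) for c in range(x1, x2))


# Hypothetical 4x4 thermal image in degrees Celsius.
thermal = [
    [30.0, 30.5, 30.2, 30.1],
    [30.3, 33.1, 34.6, 30.4],
    [30.2, 32.8, 33.9, 30.0],
    [30.1, 30.2, 30.3, 36.0],
]
detected_box = (1, 1, 3, 3)   # hypothetical detector output
annotated_box = (0, 0, 2, 2)  # hypothetical manual annotation
overlap = iou(detected_box, annotated_box)
hottest = max_temperature(thermal, detected_box)
```

Here the 36.0 reading in the corner is ignored because it lies outside the detected box, which is the point of restricting extraction to the ROI.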


Orthopaedic Proceedings
Vol. 99-B, Issue SUPP_20 | Pages 46 - 46
1 Dec 2017
Esfandiari H Anglin C Street J Guy P Hodgson A

Pedicle screw fixation is a technically demanding procedure with potential difficulties, and reoperation rates are currently on the order of 11%. The most common intraoperative practice for position assessment of pedicle screws is biplanar fluoroscopic imaging, which is limited to two dimensions and is associated with low accuracy. We have previously introduced a full-dimensional position assessment framework based on registering intraoperative X-rays to preoperative volumetric images with sufficient accuracy. However, the framework requires a semi-manual process of pedicle screw segmentation, and the intraoperative X-rays have to be taken from defined positions in space in order to avoid occlusion of the pedicle screw heads. This motivated us to develop advancements to the system to achieve higher levels of automation in the hope of higher clinical feasibility.

In this study, we developed an automatic segmentation and X-ray adequacy assessment protocol. An artificial neural network was trained on a dataset that included a number of digitally reconstructed radiographs representing pedicle screw projections from different points of view. This model was able to segment the projection of any pedicle screw given an X-ray as its input, with a pixel-wise accuracy of 93%. Once the pedicle screw was segmented, a number of descriptive geometric features were extracted from the isolated blob. These segmented images were manually labelled as ‘adequate’ or ‘not adequate’ depending on the visibility of the screw axis. The extracted features along with their corresponding labels were used to train a decision tree model that could classify each X-ray based on its adequacy with accuracies on the order of 95%.

In conclusion, we presented here a robust, fast and automated pedicle screw segmentation process, combined with an accurate and automatic algorithm for classifying views of pedicle screws as adequate or not. These tools represent a useful step towards full automation of our pedicle screw positioning assessment system.
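For readers unfamiliar with the two measures involved, the sketch below shows how a pixel-wise segmentation accuracy can be computed and what simple geometric features of a segmented blob might look like. The abstract does not list the features that were actually used, so the ones here (area, bounding-box height and width, aspect ratio) and the toy masks are purely illustrative assumptions.

```python
def pixel_accuracy(pred, truth):
    """Fraction of pixels whose predicted label matches the reference mask."""
    total = sum(len(row) for row in truth)
    correct = sum(p == t for pr, tr in zip(pred, truth) for p, t in zip(pr, tr))
    return correct / total


def blob_features(mask):
    """Hypothetical descriptive features of the foreground region of a binary mask."""
    pts = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    rows = [r for r, _ in pts]
    cols = [c for _, c in pts]
    height = max(rows) - min(rows) + 1
    width = max(cols) - min(cols) + 1
    return {
        "area": len(pts),
        "height": height,
        "width": width,
        "aspect": max(height, width) / min(height, width),
    }


# Toy 4x3 masks: 1 = screw pixel, 0 = background.
truth = [[0, 1, 0], [0, 1, 0], [0, 1, 0], [0, 1, 0]]
pred = [[0, 1, 0], [0, 1, 0], [0, 1, 1], [0, 0, 0]]
acc = pixel_accuracy(pred, truth)
features = blob_features(truth)
```

Feature dictionaries of this kind, paired with the manual adequacy labels, are the sort of input a decision tree classifier would be trained on.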


The Bone & Joint Journal
Vol. 104-B, Issue 8 | Pages 911 - 914
1 Aug 2022
Prijs J Liao Z Ashkani-Esfahani S Olczak J Gordon M Jayakumar P Jutte PC Jaarsma RL IJpma FFA Doornberg JN

Artificial intelligence (AI) is, in essence, the concept of ‘computer thinking’, encompassing methods that train computers to perform and learn from executing certain tasks, called machine learning, and methods to build intricate computer models that both learn and adapt, called complex neural networks. Computer vision is a function of AI by which machine learning and complex neural networks can be applied to enable computers to capture, analyze, and interpret information from clinical images and visual inputs. This annotation summarizes key considerations and future perspectives concerning computer vision, questioning the need for this technology (the ‘why’), the current applications (the ‘what’), and the approach to unlocking its full potential (the ‘how’). Cite this article: Bone Joint J 2022;104-B(8):911–914


Bone & Joint Research
Vol. 12, Issue 9 | Pages 512 - 521
1 Sep 2023
Langenberger B Schrednitzki D Halder AM Busse R Pross CM

Aims. A substantial fraction of patients undergoing knee arthroplasty (KA) or hip arthroplasty (HA) do not achieve an improvement as high as the minimal clinically important difference (MCID), i.e. do not achieve a meaningful improvement. Using three patient-reported outcome measures (PROMs), our aim was: 1) to assess machine learning (ML), the simple pre-surgery PROM score, and logistic-regression (LR)-derived performance in their prediction of whether patients undergoing HA or KA achieve an improvement as high as or higher than a calculated MCID; and 2) to test whether ML is able to outperform LR or pre-surgery PROM scores in predictive performance. Methods. MCIDs were derived using the change difference method in a sample of 1,843 HA and 1,546 KA patients. An artificial neural network, a gradient boosting machine, least absolute shrinkage and selection operator (LASSO) regression, ridge regression, elastic net, random forest, LR, and pre-surgery PROM scores were applied to predict MCID for the following PROMs: EuroQol five-dimension, five-level questionnaire (EQ-5D-5L), EQ visual analogue scale (EQ-VAS), Hip disability and Osteoarthritis Outcome Score-Physical Function Short-form (HOOS-PS), and Knee injury and Osteoarthritis Outcome Score-Physical Function Short-form (KOOS-PS). Results. Predictive performance of the best models per outcome ranged from 0.71 for HOOS-PS to 0.84 for EQ-VAS (HA sample). ML statistically significantly outperformed LR and pre-surgery PROM scores in two out of six cases. Conclusion. MCIDs can be predicted with reasonable performance. ML was able to outperform traditional methods, although only in a minority of cases. Cite this article: Bone Joint Res 2023;12(9):512–521


Bone & Joint Research
Vol. 9, Issue 9 | Pages 623 - 632
5 Sep 2020
Jayadev C Hulley P Swales C Snelling S Collins G Taylor P Price A

Aims. The lack of disease-modifying treatments for osteoarthritis (OA) is linked to a shortage of suitable biomarkers. This study combines multi-molecule synovial fluid analysis with machine learning to produce an accurate diagnostic biomarker model for end-stage knee OA (esOA). Methods. Synovial fluid (SF) from patients with esOA, non-OA knee injury, and inflammatory knee arthritis were analyzed for 35 potential markers using immunoassays. Partial least square discriminant analysis (PLS-DA) was used to derive a biomarker model for cohort classification. The ability of the biomarker model to diagnose esOA was validated by identical wide-spectrum SF analysis of a test cohort of ten patients with esOA. Results. PLS-DA produced a streamlined biomarker model with excellent sensitivity (95%), specificity (98.4%), and reliability (97.4%). The eight-biomarker model produced a fingerprint for esOA comprising type IIA procollagen N-terminal propeptide (PIIANP), tissue inhibitor of metalloproteinase (TIMP)-1, a disintegrin and metalloproteinase with thrombospondin motifs 4 (ADAMTS-4), monocyte chemoattractant protein (MCP)-1, interferon-γ-inducible protein-10 (IP-10), and transforming growth factor (TGF)-β3. Receiver operating characteristic (ROC) analysis demonstrated excellent discriminatory accuracy: area under the curve (AUC) being 0.970 for esOA, 0.957 for knee injury, and 1 for inflammatory arthritis. All ten validation test patients were classified correctly as esOA (accuracy 100%; reliability 100%) by the biomarker model. Conclusion. SF analysis coupled with machine learning produced a partially validated biomarker model with cohort-specific fingerprints that accurately and reliably discriminated esOA from knee injury and inflammatory arthritis with almost 100% efficacy. The presented findings and approach represent a new biomarker concept and potential diagnostic tool to stage disease in therapy trials and monitor the efficacy of such interventions. 
Cite this article: Bone Joint Res 2020;9(9):623–632


The Bone & Joint Journal
Vol. 103-B, Issue 9 | Pages 1442 - 1448
1 Sep 2021
McDonnell JM Evans SR McCarthy L Temperley H Waters C Ahern D Cunniffe G Morris S Synnott K Birch N Butler JS

In recent years, machine learning (ML) and artificial neural networks (ANNs), a particular subset of ML, have been adopted by various areas of healthcare. A number of diagnostic and prognostic algorithms have been designed and implemented across a range of orthopaedic sub-specialties to date, with many positive results. However, the methodology of many of these studies is flawed, and few compare the use of ML with the current approach in clinical practice. Spinal surgery has advanced rapidly over the past three decades, particularly in the areas of implant technology, advanced surgical techniques, biologics, and enhanced recovery protocols. It is therefore regarded as an innovative field. Inevitably, spinal surgeons will wish to incorporate ML into their practice should models prove effective in diagnostic or prognostic terms. The purpose of this article is to review published studies that describe the application of neural networks to spinal surgery and which actively compare ANN models to contemporary clinical standards allowing evaluation of their efficacy, accuracy, and relatability. It also explores some of the limitations of the technology, which act to constrain the widespread adoption of neural networks for diagnostic and prognostic use in spinal care. Finally, it describes the necessary considerations should institutions wish to incorporate ANNs into their practices. In doing so, the aim of this review is to provide a practical approach for spinal surgeons to understand the relevant aspects of neural networks. Cite this article: Bone Joint J 2021;103-B(9):1442–1448


The Bone & Joint Journal
Vol. 104-B, Issue 12 | Pages 1292 - 1303
1 Dec 2022
Polisetty TS Jain S Pang M Karnuta JM Vigdorchik JM Nawabi DH Wyles CC Ramkumar PN

Literature surrounding artificial intelligence (AI)-related applications for hip and knee arthroplasty has proliferated. However, meaningful advances that fundamentally transform the practice and delivery of joint arthroplasty are yet to be realized, despite the broad range of applications as we continue to search for meaningful and appropriate use of AI. AI literature in hip and knee arthroplasty between 2018 and 2021 regarding image-based analyses, value-based care, remote patient monitoring, and augmented reality was reviewed. Concerns surrounding meaningful use and appropriate methodological approaches of AI in joint arthroplasty research are summarized. Of the 233 AI-related orthopaedics articles published, 178 (76%) constituted original research, while the rest consisted of editorials or reviews. A total of 52% of original AI-related research concerns hip and knee arthroplasty (n = 92), and a narrative review is described. Three studies were externally validated. Pitfalls surrounding present-day research include conflating vernacular (“AI/machine learning”), repackaging limited registry data, prematurely releasing internally validated prediction models, appraising model architecture instead of inputted data, withholding code, and evaluating studies using antiquated regression-based guidelines. While AI has been applied to a variety of hip and knee arthroplasty applications with limited clinical impact, the future remains promising if the question is meaningful, the methodology is rigorous and transparent, the data are rich, and the model is externally validated. Simple checkpoints for meaningful AI adoption include ensuring applications focus on: administrative support over clinical evaluation and management; necessity of the advanced model; and the novelty of the question being answered.

Cite this article: Bone Joint J 2022;104-B(12):1292–1303.


Bone & Joint Research
Vol. 13, Issue 4 | Pages 184 - 192
18 Apr 2024
Morita A Iida Y Inaba Y Tezuka T Kobayashi N Choe H Ike H Kawakami E

Aims

This study was designed to develop a model for predicting bone mineral density (BMD) loss of the femur after total hip arthroplasty (THA) using artificial intelligence (AI), and to identify factors that influence the prediction. Additionally, we virtually examined the efficacy of administration of bisphosphonate for cases with severe BMD loss based on the predictive model.

Methods

The study included 538 joints that underwent primary THA. The patients were divided into groups using unsupervised time series clustering for five-year BMD loss of Gruen zone 7 postoperatively, and a machine-learning model to predict the BMD loss was developed. Additionally, the predictor for BMD loss was extracted using SHapley Additive exPlanations (SHAP). The patient-specific efficacy of bisphosphonate, which is the most important categorical predictor for BMD loss, was examined by calculating the change in predictive probability when hypothetically switching between the inclusion and exclusion of bisphosphonate.
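The "hypothetical switching" analysis described in the Methods amounts to predicting the same patient twice, once with and once without bisphosphonate, and taking the difference in predicted probability. The sketch below illustrates this with a stand-in logistic model; the model form, the coefficients, and the feature names are placeholders invented for the example and are not the study's model.

```python
import math


def predict_severe_loss(features, weights, bias):
    """Stand-in model: logistic function of a weighted sum of the features."""
    z = bias + sum(weights[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


def switching_effect(features, weights, bias, key="bisphosphonate"):
    """Change in predicted probability when the treatment flag is set to 1
    versus 0, with every other feature held fixed."""
    with_drug = dict(features, **{key: 1})
    without_drug = dict(features, **{key: 0})
    return (predict_severe_loss(with_drug, weights, bias)
            - predict_severe_loss(without_drug, weights, bias))


# Placeholder coefficients and one hypothetical patient.
weights = {"age": 0.02, "bisphosphonate": -1.0}
bias = -1.0
patient = {"age": 50, "bisphosphonate": 0}
effect = switching_effect(patient, weights, bias)  # negative = lower predicted risk
```

Because the difference is computed per patient, it can vary between patients when the underlying model is non-linear, which is what makes the analysis patient-specific. It describes the model's behaviour and is not by itself evidence of a causal treatment effect.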


Bone & Joint Open
Vol. 4, Issue 3 | Pages 168 - 181
14 Mar 2023
Dijkstra H Oosterhoff JHF van de Kuit A IJpma FFA Schwab JH Poolman RW Sprague S Bzovsky S Bhandari M Swiontkowski M Schemitsch EH Doornberg JN Hendrickx LAM

Aims

To develop prediction models using machine-learning (ML) algorithms for 90-day and one-year mortality prediction in femoral neck fracture (FNF) patients aged 50 years or older based on the Hip fracture Evaluation with Alternatives of Total Hip arthroplasty versus Hemiarthroplasty (HEALTH) and Fixation using Alternative Implants for the Treatment of Hip fractures (FAITH) trials.

Methods

This study included 2,388 patients from the HEALTH and FAITH trials, with 90-day and one-year mortality proportions of 3.0% (71/2,388) and 6.4% (153/2,388), respectively. The mean age was 75.9 years (SD 10.8) and 65.9% of patients (1,574/2,388) were female. The algorithms included patient and injury characteristics. Six algorithms were developed, internally validated and evaluated across discrimination (c-statistic; discriminative ability between those with risk of mortality and those without), calibration (observed outcome compared to the predicted probability), and the Brier score (composite of discrimination and calibration).
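Two of the evaluation measures named above can be written down compactly. The sketch below computes the Brier score and a coarse calibration table (mean predicted probability against observed event rate per bin); the binning scheme, the function names, and the toy predictions are assumptions for illustration only.

```python
def brier_score(y_true, probs):
    """Mean squared difference between predicted probability and outcome (0 or 1)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, probs)) / len(y_true)


def calibration_table(y_true, probs, n_bins=2):
    """Per equal-width probability bin: (mean predicted, observed rate, count)."""
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, probs):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((y, p))
    table = []
    for members in bins:
        if members:
            mean_pred = sum(p for _, p in members) / len(members)
            observed = sum(y for y, _ in members) / len(members)
            table.append((mean_pred, observed, len(members)))
    return table


# Hypothetical outcomes (1 = died) and predicted probabilities.
y_true = [0, 0, 1, 1]
probs = [0.1, 0.2, 0.7, 0.9]
score = brier_score(y_true, probs)
table = calibration_table(y_true, probs)

# Reference point: a model that predicts the event rate for everyone scores
# rate * (1 - rate), e.g. for the 90-day mortality proportion reported above.
rate = 71 / 2388
null_brier = rate * (1 - rate)
```

With rare outcomes such as these, the null Brier score is already small, so a model's Brier score is best read against that reference rather than in isolation.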


The Bone & Joint Journal
Vol. 103-B, Issue 12 | Pages 1754 - 1758
1 Dec 2021
Farrow L Zhong M Ashcroft GP Anderson L Meek RMD

There is increasing popularity in the use of artificial intelligence and machine-learning techniques to provide diagnostic and prognostic models for various aspects of Trauma & Orthopaedic surgery. However, correct interpretation of these models is difficult for those without specific knowledge of computing or health data science methodology. Lack of current reporting standards leads to the potential for significant heterogeneity in the design and quality of published studies. We provide an overview of machine-learning techniques for the lay individual, including key terminology and best practice reporting guidelines.

Cite this article: Bone Joint J 2021;103-B(12):1754–1758.


Bone & Joint Open
Vol. 5, Issue 1 | Pages 9 - 19
16 Jan 2024
Dijkstra H van de Kuit A de Groot TM Canta O Groot OQ Oosterhoff JH Doornberg JN

Aims

Machine-learning (ML) prediction models in orthopaedic trauma hold great promise in assisting clinicians in various tasks, such as personalized risk stratification. However, an overview of current applications and a critical appraisal against peer-reviewed guidelines are lacking. The objectives of this study are to 1) provide an overview of current ML prediction models in orthopaedic trauma; 2) evaluate the completeness of reporting following the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) statement; and 3) assess the risk of bias following the Prediction model Risk Of Bias Assessment Tool (PROBAST).

Methods

A systematic search screening 3,252 studies identified 45 ML-based prediction models in orthopaedic trauma up to January 2023. The TRIPOD statement assessed transparent reporting and the PROBAST tool the risk of bias.


Bone & Joint Research
Vol. 12, Issue 7 | Pages 447 - 454
10 Jul 2023
Lisacek-Kiosoglous AB Powling AS Fontalis A Gabr A Mazomenos E Haddad FS

The use of artificial intelligence (AI) is rapidly growing across many domains, of which the medical field is no exception. AI is an umbrella term defining the practical application of algorithms to generate useful output, without the need of human cognition. Owing to the expanding volume of patient information collected, known as ‘big data’, AI is showing promise as a useful tool in healthcare research and across all aspects of patient care pathways. Practical applications in orthopaedic surgery include: diagnostics, such as fracture recognition and tumour detection; predictive models of clinical and patient-reported outcome measures, such as calculating mortality rates and length of hospital stay; and real-time rehabilitation monitoring and surgical training. However, clinicians should remain cognizant of AI’s limitations, as the development of robust reporting and validation frameworks is of paramount importance to prevent avoidable errors and biases. The aim of this review article is to provide a comprehensive understanding of AI and its subfields, as well as to delineate its existing clinical applications in trauma and orthopaedic surgery. Furthermore, this narrative review expands upon the limitations of AI and future direction.

Cite this article: Bone Joint Res 2023;12(7):447–454.


The Bone & Joint Journal
Vol. 104-B, Issue 8 | Pages 929 - 937
1 Aug 2022
Gurung B Liu P Harris PDR Sagi A Field RE Sochart DH Tucker K Asopa V

Aims

Total hip arthroplasty (THA) and total knee arthroplasty (TKA) are common orthopaedic procedures requiring postoperative radiographs to confirm implant positioning and identify complications. Artificial intelligence (AI)-based image analysis has the potential to automate this postoperative surveillance. The aim of this study was to prepare a scoping review to investigate how AI is being used in the analysis of radiographs following THA and TKA, and how accurate these tools are.

Methods

The Embase, MEDLINE, and PubMed libraries were systematically searched to identify relevant articles. The Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for scoping reviews and Arksey and O’Malley framework were followed. Study quality was assessed using a modified Methodological Index for Non-Randomized Studies tool. AI performance was reported using either the area under the curve (AUC) or accuracy.


Bone & Joint Research
Vol. 13, Issue 2 | Pages 66 - 82
5 Feb 2024
Zhao D Zeng L Liang G Luo M Pan J Dou Y Lin F Huang H Yang W Liu J

Aims

This study aimed to explore the biological and clinical importance of dysregulated key genes in osteoarthritis (OA) patients at the cartilage level to find potential biomarkers and targets for diagnosing and treating OA.

Methods

Six sets of gene expression profiles were obtained from the Gene Expression Omnibus database. Differential expression analysis, weighted gene coexpression network analysis (WGCNA), and multiple machine-learning algorithms were used to screen crucial genes in osteoarthritic cartilage, and genome enrichment and functional annotation analyses were used to decipher the related categories of gene function. Single-sample gene set enrichment analysis was performed to analyze immune cell infiltration. Correlation analysis was used to explore the relationship among the hub genes and immune cells, as well as markers related to articular cartilage degradation and bone mineralization.


Bone & Joint Research
Vol. 12, Issue 4 | Pages 245 - 255
3 Apr 2023
Ryu S So J Ha Y Kuh S Chin D Kim K Cho Y Kim K

Aims. To determine the major risk factors for unplanned reoperations (UROs) following corrective surgery for adult spinal deformity (ASD) and their interactions, using machine learning-based prediction algorithms and game theory. Methods. Patients who underwent surgery for ASD, with a minimum of two-year follow-up, were retrospectively reviewed. In total, 210 patients were included and randomly allocated into training (70% of the sample size) and test (the remaining 30%) sets to develop the machine learning algorithm. Risk factors were included in the analysis, along with clinical characteristics and parameters acquired through diagnostic radiology. Results. Overall, 152 patients without and 58 with a history of surgical revision following surgery for ASD were observed; the mean age was 68.9 years (SD 8.7) and 66.9 years (SD 6.6), respectively. On implementing a random forest model, the classification of URO events resulted in a balanced accuracy of 86.8%. Among machine learning-extracted risk factors, URO, proximal junction failure (PJF), and postoperative distance from the posterosuperior corner of C7 and the vertical axis from the centroid of C2 (SVA) were significant upon Kaplan-Meier survival analysis. Conclusion. The major risk factors for URO following surgery for ASD, i.e. postoperative SVA and PJF, and their interactions were identified using a machine learning algorithm and game theory. Clinical benefits will depend on patient risk profiles. Cite this article: Bone Joint Res 2023;12(4):245–255
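
The 70%/30% random allocation and the balanced accuracy metric mentioned in this abstract can be illustrated with a minimal sketch. The function names, the seed, and the toy labels below are hypothetical and are not taken from the study; the abstract does not say whether the split was stratified, so a plain random split is assumed.

```python
import random

def split_train_test(ids, train_fraction=0.7, seed=0):
    """Randomly allocate identifiers to a training and a test set (assumed unstratified)."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]

def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = URO, 0 = no URO)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2

# 210 patients, as in the abstract, give 147 training and 63 test cases.
train_ids, test_ids = split_train_test(range(210))
```

Balanced accuracy is often preferred over plain accuracy when the classes are unequal in size, as they are here (152 versus 58 patients).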


The Bone & Joint Journal
Vol. 105-B, Issue 6 | Pages 702 - 710
1 Jun 2023
Yeramosu T Ahmad W Bashir A Wait J Bassett J Domson G

Aims. The aim of this study was to identify factors associated with five-year cancer-related mortality in patients with limb and trunk soft-tissue sarcoma (STS) and develop and validate machine learning algorithms in order to predict five-year cancer-related mortality in these patients. Methods. Demographic, clinicopathological, and treatment variables of limb and trunk STS patients in the Surveillance, Epidemiology, and End Results Program (SEER) database from 2004 to 2017 were analyzed. Multivariable logistic regression was used to determine factors significantly associated with five-year cancer-related mortality. Various machine learning models were developed and compared using area under the curve (AUC), calibration, and decision curve analysis. The model that performed best on the SEER testing data was further assessed to determine the variables most important in its predictive capacity. This model was externally validated using our institutional dataset. Results. A total of 13,646 patients with STS from the SEER database were included, of whom 35.9% experienced five-year cancer-related mortality. The random forest model performed the best overall and identified tumour size as the most important variable when predicting mortality in patients with STS, followed by M stage, histological subtype, age, and surgical excision. Each variable was significant in logistic regression. External validation yielded an AUC of 0.752. Conclusion. This study identified clinically important variables associated with five-year cancer-related mortality in patients with limb and trunk STS, and developed a predictive model that demonstrated good accuracy and predictability. Orthopaedic oncologists may use these findings to further risk-stratify their patients and recommend an optimal course of treatment. Cite this article: Bone Joint J 2023;105-B(6):702–710


Bone & Joint Open
Vol. 2, Issue 10 | Pages 879 - 885
20 Oct 2021
Oliveira e Carmo L van den Merkhof A Olczak J Gordon M Jutte PC Jaarsma RL IJpma FFA Doornberg JN Prijs J

Aims

The number of convolutional neural networks (CNN) available for fracture detection and classification is rapidly increasing. External validation of a CNN on a temporally separate (separated by time) or geographically separate (separated by location) dataset is crucial to assess generalizability of the CNN before application to clinical practice in other institutions. We aimed to answer the following questions: are current CNNs for fracture recognition externally valid?; which methods are applied for external validation (EV)?; and, what are reported performances of the EV sets compared to the internal validation (IV) sets of these CNNs?

Methods

The PubMed and Embase databases were systematically searched from January 2010 to October 2020 according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement. The type of EV, characteristics of the external dataset, and diagnostic performance characteristics on the IV and EV datasets were collected and compared. Quality assessment was conducted using a seven-item checklist based on a modified Methodologic Index for NOn-Randomized Studies instrument (MINORS).


The Bone & Joint Journal
Vol. 102-B, Issue 7 Supple B | Pages 99 - 104
1 Jul 2020
Shah RF Bini S Vail T

Aims

Natural Language Processing (NLP) offers an automated method to extract data from unstructured free text fields for arthroplasty registry participation. Our objective was to investigate how accurately NLP can be used to extract structured clinical data from unstructured clinical notes when compared with manual data extraction.

Methods

A group of 1,000 randomly selected clinical and hospital notes from eight different surgeons were collected for patients undergoing primary arthroplasty between 2012 and 2018. In all, 19 preoperative, 17 operative, and two postoperative variables of interest were manually extracted from these notes. A NLP algorithm was created to automatically extract these variables from a training sample of these notes, and the algorithm was tested on a random test sample of notes. Performance of the NLP algorithm was measured in Statistical Analysis System (SAS) by calculating the accuracy of the variables collected, the ability of the algorithm to collect the correct information when it was indeed in the note (sensitivity), and the ability of the algorithm to not collect a certain data element when it was not in the note (specificity).
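
The three performance measures defined in the Methods can be sketched for a single variable as follows. This is an illustration only: the study computed its metrics in SAS, and the function name, the use of None to mean "not in the note" or "not collected", and the toy values are all assumptions.

```python
def extraction_metrics(manual, extracted):
    """Compare algorithm output with manual extraction for one variable.

    Each list holds one entry per note. None means the variable is absent
    from the note (manual) or was not collected (algorithm).
    """
    present = [(m, e) for m, e in zip(manual, extracted) if m is not None]
    absent = [(m, e) for m, e in zip(manual, extracted) if m is None]
    # Correct when the algorithm returns the same value the reviewer found.
    correct_present = sum(1 for m, e in present if e == m)
    # Correct when the algorithm collects nothing for a note lacking the value.
    correct_absent = sum(1 for _, e in absent if e is None)
    return {
        "accuracy": (correct_present + correct_absent) / len(manual),
        "sensitivity": correct_present / len(present) if present else None,
        "specificity": correct_absent / len(absent) if absent else None,
    }

# Hypothetical laterality values from five notes.
manual = ["left", "right", None, None, "left"]
extracted = ["left", None, None, "right", "left"]
metrics = extraction_metrics(manual, extracted)
```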


Bone & Joint Open
Vol. 3, Issue 10 | Pages 767 - 776
5 Oct 2022
Jang SJ Kunze KN Brilliant ZR Henson M Mayman DJ Jerabek SA Vigdorchik JM Sculco PK

Aims

Accurate identification of the ankle joint centre is critical for estimating tibial coronal alignment in total knee arthroplasty (TKA). The purpose of the current study was to leverage artificial intelligence (AI) to determine the accuracy and effect of using different radiological anatomical landmarks to quantify mechanical alignment in relation to a traditionally defined radiological ankle centre.

Methods

Patients with full-limb radiographs from the Osteoarthritis Initiative were included. A sub-cohort of 250 radiographs were annotated for landmarks relevant to knee alignment and used to train a deep learning (U-Net) workflow for angle calculation on the entire database. The radiological ankle centre was defined as the midpoint of the superior talus edge/tibial plafond. Knee alignment (hip-knee-ankle angle) was compared against 1) midpoint of the most prominent malleoli points, 2) midpoint of the soft-tissue overlying malleoli, and 3) midpoint of the soft-tissue sulcus above the malleoli.
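
The comparison of ankle centre definitions comes down to recomputing the hip-knee-ankle angle with a different ankle point each time. A minimal geometric sketch follows, assuming 2D landmark coordinates in pixels and an unsigned angle convention in which 180 degrees means the three centres are collinear. The function names and coordinates are hypothetical and this is not the study's U-Net workflow.

```python
import math

def midpoint(a, b):
    """Midpoint of two 2D landmarks, e.g. the medial and lateral malleoli."""
    return ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)

def hip_knee_ankle_angle(hip, knee, ankle):
    """Unsigned angle in degrees at the knee between the knee-to-hip and
    knee-to-ankle vectors (180 = hip, knee, and ankle centres collinear)."""
    v1 = (hip[0] - knee[0], hip[1] - knee[1])
    v2 = (ankle[0] - knee[0], ankle[1] - knee[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    return math.degrees(math.atan2(abs(cross), dot))

# Hypothetical landmarks: the ankle centre is taken as the malleolar midpoint.
ankle_centre = midpoint((-30, 200), (30, 200))
angle = hip_knee_ankle_angle((0, 0), (0, 100), ankle_centre)
```

Swapping in a different ankle centre definition only changes the third argument, which makes the effect of each landmark choice on alignment directly comparable.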


Bone & Joint Research
Vol. 12, Issue 3 | Pages 165 - 177
1 Mar 2023
Boyer P Burns D Whyne C

Aims

An objective technological solution for tracking adherence to at-home shoulder physiotherapy is important for improving patient engagement and rehabilitation outcomes, but remains a significant challenge. The aim of this research was to evaluate performance of machine-learning (ML) methodologies for detecting and classifying inertial data collected during in-clinic and at-home shoulder physiotherapy exercise.

Methods

A smartwatch was used to collect inertial data from 42 patients performing shoulder physiotherapy exercises for rotator cuff injuries in both in-clinic and at-home settings. A two-stage ML approach was used to detect out-of-distribution (OOD) data (to remove non-exercise data) and subsequently for classification of exercises. We evaluated the performance impact of grouping exercises by motion type, inclusion of non-exercise data for algorithm training, and a patient-specific approach to exercise classification. Algorithm performance was evaluated using both in-clinic and at-home data.


The Bone & Joint Journal
Vol. 106-B, Issue 2 | Pages 203 - 211
1 Feb 2024
Park JH Won J Kim H Kim Y Kim S Han I

Aims

This study aimed to compare the performance of survival prediction models for bone metastases of the extremities (BM-E) with pathological fractures in an Asian cohort, and investigate patient characteristics associated with survival.

Methods

This retrospective cohort study included 469 patients, who underwent surgery for BM-E between January 2009 and March 2022 at a tertiary hospital in South Korea. Postoperative survival was calculated using the PATHFx3.0, SPRING13, OPTIModel, SORG, and IOR models. Model performance was assessed with area under the curve (AUC), calibration curve, Brier score, and decision curve analysis. Cox regression analyses were performed to evaluate the factors contributing to survival.


Orthopaedic Proceedings
Vol. 104-B, Issue SUPP_4 | Pages 29 - 29
1 Apr 2022
Pettit MH Hickman S Malviya A Khanduja V
Full Access

Identification of patients at risk of not achieving minimally clinically important differences (MCID) in patient reported outcome measures (PROMs) is important to ensure principled and informed pre-operative decision making. Machine learning techniques may enable the generation of a predictive model for attainment of MCID in hip arthroscopy. Aims: 1) to determine whether machine learning techniques could predict which patients will achieve MCID in the iHOT-12 PROM 6 months after arthroscopic management of femoroacetabular impingement (FAI), 2) to determine which factors contribute to their predictive power. Data from the UK Non-Arthroplasty Hip Registry database was utilised. We identified 1917 patients who had undergone hip arthroscopy for FAI with both baseline and 6 month follow up iHOT-12 and baseline EQ-5D scores. We trained three established machine learning algorithms on our dataset to predict an outcome of iHOT-12 MCID improvement at 6 months given baseline characteristics including demographic factors, disease characteristics and PROMs. Performance was assessed using area under the receiver operating characteristic (AUROC) statistics with 5-fold cross validation. The three machine learning algorithms showed quite different performance. The linear logistic regression model achieved AUROC = 0.59, the deep neural network achieved AUROC = 0.82, while a random forest model had the best predictive performance with AUROC 0.87. Of demographic factors, we found that BMI and age were key predictors for this model. We also found that removing all features except baseline responses to the iHOT-12 questionnaire had little effect on performance for the random forest model (AUROC = 0.85). Disease characteristics had little effect on model performance. Machine learning models are able to predict with good accuracy 6-month post-operative MCID attainment in patients undergoing arthroscopic management for FAI. 
Baseline scores from the iHOT-12 questionnaire are sufficient to predict with good accuracy whether a patient is likely to reach MCID in post-operative PROMs.
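
The evaluation described above (AUROC with 5-fold cross validation) can be sketched in plain Python. The pairwise formulation of AUROC below is a standard equivalent of the area under the ROC curve. The fold construction is an assumption, since the abstract does not state how folds were formed, and all names are hypothetical.

```python
def auroc(y_true, scores):
    """Probability that a randomly chosen positive case scores higher than a
    randomly chosen negative case; ties count as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def k_fold_indices(n, k=5):
    """Yield (train, test) index lists for k contiguous folds.
    Shuffle the data beforehand if its order is not already random."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size

# Fold sizes for the 1917 patients reported in the abstract.
folds = list(k_fold_indices(1917, k=5))
```

In use, a model would be fitted on each train index list and `auroc` computed on the matching test list, with the five values then averaged.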


The Bone & Joint Journal
Vol. 102-B, Issue 7 Supple B | Pages 11 - 19
1 Jul 2020
Shohat N Goswami K Tan TL Yayac M Soriano A Sousa R Wouthuyzen-Bakker M Parvizi J

Aims. Failure of irrigation and debridement (I&D) for prosthetic joint infection (PJI) is influenced by numerous host, surgical, and pathogen-related factors. We aimed to develop and validate a practical, easy-to-use tool based on machine learning that may accurately predict outcome following I&D surgery taking into account the influence of numerous factors. Methods. This was an international, multicentre retrospective study of 1,174 revision total hip (THA) and knee arthroplasties (TKA) undergoing I&D for PJI between January 2005 and December 2017. PJI was defined using the Musculoskeletal Infection Society (MSIS) criteria. A total of 52 variables including demographics, comorbidities, and clinical and laboratory findings were evaluated using random forest machine learning analysis. The algorithm was then verified through cross-validation. Results. Of the 1,174 patients that were included in the study, 405 patients (34.5%) failed treatment. Using random forest analysis, an algorithm that provides the probability for failure for each specific patient was created. By order of importance, the ten most important variables associated with failure of I&D were serum CRP levels, positive blood cultures, indication for index arthroplasty other than osteoarthritis, not exchanging the modular components, use of immunosuppressive medication, late acute (haematogenous) infections, methicillin-resistant Staphylococcus aureus infection, overlying skin infection, polymicrobial infection, and older age. The algorithm had good discriminatory capability (area under the curve = 0.74). Cross-validation showed similar probabilities comparing predicted and observed failures indicating high accuracy of the model. Conclusion. This is the first study in the orthopaedic literature to use machine learning as a tool for predicting outcomes following I&D surgery. 
The developed algorithm provides the medical profession with a tool that can be employed in clinical decision-making and improve patient care. Future studies should aid in further validating this tool on additional cohorts. Cite this article: Bone Joint J 2020;102-B(7 Supple B):11–19


Orthopaedic Proceedings
Vol. 104-B, Issue SUPP_13 | Pages 60 - 60
1 Dec 2022
Martin RK Wastvedt S Pareek A Persson A Visnes H Fenstad AM Moatshe G Wolfson J Lind M Engebretsen L
Full Access

External validation of machine learning predictive models is achieved through evaluation of model performance on different groups of patients than were used for algorithm development. This important step is uncommonly performed, inhibiting clinical translation of newly developed models. Recently, machine learning was used to develop a tool that can quantify revision risk for a patient undergoing primary anterior cruciate ligament (ACL) reconstruction (https://swastvedt.shinyapps.io/calculator_rev/). The source of data included nearly 25,000 patients with primary ACL reconstruction recorded in the Norwegian Knee Ligament Register (NKLR). The result was a well-calibrated tool capable of predicting revision risk one, two, and five years after primary ACL reconstruction with moderate accuracy. The purpose of this study was to determine the external validity of the NKLR model by assessing algorithm performance when applied to patients from the Danish Knee Ligament Registry (DKLR). The primary outcome measure of the NKLR model was probability of revision ACL reconstruction within 1, 2, and/or 5 years. For the index study, 24 total predictor variables in the NKLR were included and the models eliminated variables which did not significantly improve prediction ability - without sacrificing accuracy. The result was a well calibrated algorithm developed using the Cox Lasso model that only required five variables (out of the original 24) for outcome prediction. For this external validation study, all DKLR patients with complete data for the five variables required for NKLR prediction were included. The five variables were: graft choice, femur fixation device, Knee Injury and Osteoarthritis Outcome Score (KOOS) Quality of Life subscale score at surgery, years from injury to surgery, and age at surgery. Predicted revision probabilities were calculated for all DKLR patients. The model performance was assessed using the same metrics as the NKLR study: concordance and calibration. 
In total, 10,922 DKLR patients were included for analysis. Average follow-up time or time-to-revision was 8.4 (±4.3) years and overall revision rate was 6.9%. Surgical technique trends (i.e., graft choice and fixation devices) and injury characteristics (i.e., concomitant meniscus and cartilage pathology) were dissimilar between registries. The model produced similar concordance when applied to the DKLR population compared to the original NKLR test data (DKLR: 0.68; NKLR: 0.68-0.69). Calibration was poorer for the DKLR population at one and five years post primary surgery but similar to the NKLR at two years. The NKLR machine learning algorithm demonstrated similar performance when applied to patients from the DKLR, suggesting that it is valid for application outside of the initial patient population. This represents the first machine learning model for predicting revision ACL reconstruction that has been externally validated. Clinicians can use this in-clinic calculator to estimate revision risk at a patient specific level when discussing outcome expectations pre-operatively. While encouraging, it should be noted that the performance of the model on patients undergoing ACL reconstruction outside of Scandinavia remains unknown
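
Concordance, the discrimination metric reported for both registries, can be illustrated with a small sketch of Harrell's C for right-censored data. Whether the authors used exactly this estimator is an assumption. The function name and the toy data are hypothetical, and pairs with tied follow-up times are simply skipped.

```python
def concordance_index(times, events, risks):
    """Harrell's C: fraction of comparable patient pairs in which the patient
    revised earlier was also given the higher predicted risk.

    times  -- time to revision or end of follow-up
    events -- 1 if revision was observed, 0 if censored
    risks  -- predicted revision risk (higher means earlier revision expected)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    return concordant / comparable

# Hypothetical follow-up data for four patients.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which places the reported 0.68 in the moderate range.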


Orthopaedic Proceedings
Vol. 105-B, Issue SUPP_3 | Pages 118 - 118
23 Feb 2023
Zhou Y Dowsey M Spelman T Choong P Schilling C
Full Access

Approximately 20% of patients feel unsatisfied 12 months after primary total knee arthroplasty (TKA). Current predictive tools for TKA focus on the clinician as the intended user rather than the patient. The aim of this study is to develop a tool that can be used by patients without clinician assistance, to predict health-related quality of life (HRQoL) outcomes 12 months after total knee arthroplasty (TKA). All patients with primary TKAs for osteoarthritis between 2012 and 2019 at a tertiary institutional registry were analysed. The predictive outcome was improvement in Veterans-RAND 12 utility score at 12 months after surgery. Potential predictors included patient demographics, co-morbidities, and patient reported outcome scores at baseline. Logistic regression and three machine learning algorithms were used. Models were evaluated using both discrimination and calibration metrics. Predictive outcomes were categorised into deciles from 1 being the least likely to improve to 10 being the most likely to improve. 3703 eligible patients were included in the analysis. The logistic regression model performed the best in out-of-sample evaluation for both discrimination (AUC = 0.712) and calibration (gradient = 1.176, intercept = -0.116, Brier score = 0.201) metrics. Machine learning algorithms were not superior to logistic regression in any performance metric. Patients in the lowest decile (1) had a 29% probability for improvement and patients in the highest decile (10) had an 86% probability for improvement. Logistic regression outperformed machine learning algorithms in this study. The final model performed well enough with calibration metrics to accurately predict improvement after TKA using deciles. An ongoing randomised controlled trial (ACTRN12622000072718) is evaluating the effect of this tool on patient willingness for surgery. Full results of this trial are expected to be available by April 2023. A free-to-use online version of the tool is available at smartchoice.org.au.
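
Two of the quantities reported above, the Brier score and the grouping of predictions into deciles, can be sketched as follows. The function names and toy values are hypothetical, and rank-based decile assignment is an assumption, since the abstract does not describe how the deciles were formed.

```python
def brier_score(y_true, probs):
    """Mean squared difference between predicted probability and outcome
    (0 = perfect; lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, probs)) / len(y_true)

def assign_deciles(probs):
    """Rank predictions and label them 1 (least likely to improve)
    to 10 (most likely to improve), with near-equal group sizes."""
    order = sorted(range(len(probs)), key=lambda i: probs[i])
    deciles = [0] * len(probs)
    for rank, i in enumerate(order):
        deciles[i] = rank * 10 // len(probs) + 1
    return deciles

# Hypothetical predicted probabilities for 20 patients.
example_probs = [i / 20 for i in range(20)]
example_deciles = assign_deciles(example_probs)
```

The observed proportion of patients who improved within each decile can then be compared with the mean predicted probability in that decile as a simple calibration check.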


Background. Magnetic resonance imaging (MRI) identifies an end-stage, severely degenerated disc as ‘black’, and a moderately degenerated to non-degenerated disc as ‘white’. MRI is based on signal intensity changes that identify loss of proteoglycans, loss of water, and general radial bulging, but it lacks association with microscopic features such as fissures, endplate damage, and persistent inflammatory catabolism that facilitates proteoglycan loss, leading to ultimate collapse of the annulus with neo-innervation and vascularization, as an indicator of pain. Thus, we propose a novel machine learning-based imaging tool that combines quantifiable microscopic histopathological features with macroscopic signal intensity changes for hybrid assessment of disc degeneration. Methods. A total of 100 disc tissue samples were collected from patients undergoing surgery and from cadaveric controls, with an age range of 35–75 years. MRI Pfirrmann grades were collected in each case, and each disc specimen was processed to identify 1) the region of interest, 2) the analytical imaging vector, 3) the data assimilation, grading, and scoring pattern, 4) the machine learning algorithm, and 5) the predictive learning parameters, to form an interface between hardware and software operating system. Results. The kernel algorithm defines non-linear data in an xy histogram. The X,Y values are scored histological spatial variables that signify loss of proteoglycans, blood vessel ingrowth, and occurrence of tears or fissures in the inner and outer annulus regions, mapped with the dampening and graded series of signal intensity changes. Conclusion. To our knowledge, this study is the first to propose a machine learning method linking microscopic spatial tissue changes and macroscopic signal intensity grades in the intervertebral disc. No conflict of interest declared. Sources of Funding. ICMR/5/4-5/3/42/Neuro/2022-NCD-1, Dr TMA PAI SMU/ 131/ REG/ TMA PURK/ 164/2020.
A part of the above study was presented as an oral paper at the International Society for the Study of Lumbar Spine (ISSLS) meeting held on 1–5 May 2023, Melbourne, Australia.


Bone & Joint 360
Vol. 12, Issue 6 | Pages 46 - 47
1 Dec 2023

The December 2023 Research Roundup360 looks at: Tissue integration and chondroprotective potential of acetabular labral augmentation with autograft tendon: study of a porcine model; The Irish National Orthopaedic Register under cyberattack: what happened, and what were the consequences?; An overview of machine learning in orthopaedic surgery: an educational paper; Beware of the fungus…; New evidence for COVID-19 in patients undergoing joint replacement surgery


Bone & Joint 360
Vol. 12, Issue 3 | Pages 13 - 15
1 Jun 2023

The June 2023 Hip & Pelvis Roundup360 looks at: Machine learning to identify surgical candidates for hip and knee arthroplasty: a viable option?; Poor outcome after debridement and implant retention; Can you cement polyethylene liners into well-fixed acetabular shells in hip revision?; Revision stem in primary arthroplasties: the Exeter 44/0 125 mm stem; Depression and anxiety: could they be linked to infection?; Does where you live affect your outcomes after hip and knee arthroplasties?; Racial disparities in outcomes after total hip arthroplasty and total knee arthroplasty are substantially mediated by socioeconomic disadvantage both in black and white patients


Orthopaedic Proceedings
Vol. 102-B, Issue SUPP_1 | Pages 76 - 76
1 Feb 2020
Roche C Simovitch R Flurin P Wright T Zuckerman J Routman H
Full Access

Introduction. Machine learning is a relatively novel method in orthopaedics which can be used to evaluate complex associations and patterns in outcomes and healthcare data. The purpose of this study is to utilize 3 different supervised machine learning algorithms on a multi-center international database of a single shoulder prosthesis and to evaluate the accuracy of each model in predicting post-operative outcomes of both aTSA and rTSA. Methods. Data from a multi-center international database consisting of 6485 patients who received primary total shoulder arthroplasty using a single shoulder prosthesis (Equinoxe, Exactech, Inc) were analyzed from 19,796 patient visits in this study. Specifically, demographic, comorbidity, implant type and implant size, surgical technique, pre-operative PROMs and ROM measures, post-operative PROMs and ROM measures, pre-operative and post-operative radiographic data, and adverse event and complication data were obtained for 2367 primary aTSA patients from 8042 visits at an average follow-up of 22 months and 4118 primary rTSA patients from 11,754 visits at an average follow-up of 16 months. These data were analyzed to create a predictive model using 3 different supervised machine learning techniques: 1) linear regression, 2) random forest, and 3) XGBoost. Each of these 3 machine learning techniques evaluated the pre-operative parameters and created a predictive model which targeted the post-operative composite score, a 100 point score consisting of 50% post-operative composite outcome score (calculated from 33.3% ASES + 33.3% UCLA + 33.3% Constant) and 50% post-operative composite ROM score (calculated from S curves weighted by 70% active forward flexion + 15% internal rotation score + 15% active external rotation). 
Three additional predictive models were created to control for the time required for patient improvement after surgery. To do this, each primary aTSA and primary rTSA cohort was subdivided to include only follow-up visits >20 months after surgery. This yielded 1317 primary aTSA patients from 2962 visits at an average follow-up of 50 months and 1593 primary rTSA patients from 3144 visits at an average follow-up of 42 months. Each of these 6 predictive models was trained using a random selection of 80% of each cohort. Each model then predicted the outcomes of the remaining 20% of the data based upon the demographic, comorbidity, implant type and implant size, surgical technique, and pre-operative PROMs and ROM measures inputs of each 20% cohort. The error of all 6 predictive models was calculated from the root mean square error (RMSE) between the actual and predicted post-op composite score. The accuracy of each model was determined by subtracting the percent difference of each RMSE value from the average composite score associated with each cohort. Results. For all patient visits, the XGBoost decision tree algorithm was the most accurate model for both aTSA & rTSA patients, with an accuracy of ∼89.5% for both aTSA and rTSA. However, for patients with 20+ month visits only, the random forest decision tree algorithm was the most accurate model for both aTSA & rTSA patients, with an accuracy of ∼89.5% for both aTSA and rTSA. The linear regression model was the least accurate predictive model for each of the cohorts analyzed. However, it should be noted that all 3 machine learning models provided accuracy of ∼85% or better and a RMSE <12 (Table 1). Figures 1 and 2 depict the typical spread and RMSE of the actual vs. predicted total composite score associated with the 3 models for aTSA (Figure 1) and rTSA (Figure 2). Discussion. 
The results of this study demonstrate that multiple different machine learning algorithms can be utilized to create models that predict outcomes with higher accuracy for both aTSA and rTSA, for numerous timepoints after surgery. Future research should test this model on different datasets and using different machine learning methods in order to reduce over- and under-fitting model errors. For any figures or tables, please contact the authors directly
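
The composite score weighting and the RMSE error measure described in the Methods of this abstract can be written out as a short sketch. It assumes that every input has already been rescaled to a 0 to 100 range, because the abstract does not give the normalisation or the S-curve transforms. The function names and example values are hypothetical.

```python
import math

def composite_score(ases, ucla, constant, flexion, internal_rotation, external_rotation):
    """100 point composite: 50% outcome score plus 50% ROM score.
    All six inputs are ASSUMED to be pre-scaled to 0-100."""
    outcome = (ases + ucla + constant) / 3  # 33.3% each
    rom = 0.70 * flexion + 0.15 * internal_rotation + 0.15 * external_rotation
    return 0.5 * outcome + 0.5 * rom

def rmse(actual, predicted):
    """Root mean square error between actual and predicted composite scores."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical pre-scaled inputs for one patient.
example = composite_score(80, 80, 80, 60, 60, 60)
```

The accuracy figure quoted in the Results is derived from the RMSE relative to the average composite score. Its exact formula is not fully specified in the abstract, so it is not reproduced here.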


Bone & Joint 360
Vol. 12, Issue 4 | Pages 13 - 16
1 Aug 2023

The August 2023 Hip & Pelvis Roundup360 looks at: Using machine learning to predict venous thromboembolism and major bleeding events following total joint arthroplasty; Antibiotic length in revision total hip arthroplasty; Preoperative colonization and worse outcomes; Short stem cemented total hip arthroplasty; What are the outcomes of one- versus two-stage revisions in the UK?; To cement or not to cement? The best approach in hemiarthroplasty; Similar re-revisions in cemented and cementless femoral revisions for periprosthetic femoral fractures in total hip arthroplasty; Are hip precautions still needed?