Search | Bone & Joint

The Bone & Joint Journal

Vol. 103-B, Issue 9 | Pages 1442 - 1448

1 Sep 2021

The diagnostic and prognostic value of artificial intelligence and artificial neural networks in spinal surgery

McDonnell JM Evans SR McCarthy L Temperley H Waters C Ahern D Cunniffe G Morris S Synnott K Birch N Butler JS

In recent years, machine learning (ML) and artificial neural networks (ANNs), a particular subset of ML, have been adopted by various areas of healthcare. A number of diagnostic and prognostic algorithms have been designed and implemented across a range of orthopaedic sub-specialties to date, with many positive results. However, the methodology of many of these studies is flawed, and few compare the use of ML with the current approach in clinical practice. Spinal surgery has advanced rapidly over the past three decades, particularly in the areas of implant technology, advanced surgical techniques, biologics, and enhanced recovery protocols. It is therefore regarded as an innovative field. Inevitably, spinal surgeons will wish to incorporate ML into their practice should models prove effective in diagnostic or prognostic terms. The purpose of this article is to review published studies that describe the application of neural networks to spinal surgery and which actively compare ANN models to contemporary clinical standards allowing evaluation of their efficacy, accuracy, and relatability. It also explores some of the limitations of the technology, which act to constrain the widespread adoption of neural networks for diagnostic and prognostic use in spinal care. Finally, it describes the necessary considerations should institutions wish to incorporate ANNs into their practices. In doing so, the aim of this review is to provide a practical approach for spinal surgeons to understand the relevant aspects of neural networks. Cite this article: Bone Joint J 2021;103-B(9):1442–1448

Orthopaedic Proceedings

Vol. 102-B, Issue SUPP_8 | Pages 79 - 79

1 Aug 2020

A CONVOLUTIONAL NEURAL NETWORK (CNN) FOR PREDICTING FRACTURE RISK IN METASTATIC BONE DISEASE (MBD) OF THE PROXIMAL FEMUR

Bozzo A Ghert M Reilly J

Advances in cancer therapy have prolonged patient survival even in the presence of disseminated disease and an increasing number of cancer patients are living with metastatic bone disease (MBD). The proximal femur is the most common long bone involved in MBD and pathologic fractures of the femur are associated with significant morbidity, mortality and loss of quality of life (QoL). Successful prophylactic surgery for an impending fracture of the proximal femur has been shown in multiple cohort studies to result in longer survival, preserved mobility, lower transfusion rates and shorter post-operative hospital stays. However, there is currently no optimal method to predict a pathologic fracture. The best-known tool, Mirel's criteria, was established in 1989 but is of limited use in guiding clinical practice because of its poor specificity and sensitivity. The ideal clinical decision support tool will be of the highest sensitivity and specificity, non-invasive, generalizable to all patients, and not a burden on hospital resources or the patient's time. Our research uses novel machine learning techniques to develop a model to fill this considerable gap in the treatment pathway of MBD of the femur. The goal of our study is to train a convolutional neural network (CNN) to predict fracture risk when metastatic bone disease is present in the proximal femur. Our fracture risk prediction tool was developed by analysis of prospectively collected data of consecutive MBD patients presenting from 2009–2016. Patients with primary bone tumors, pathologic fractures at initial presentation, and hematologic malignancies were excluded. A total of 546 patients comprising 114 pathologic fractures were included. Every patient had at least one anteroposterior (AP) X-ray and clinical data including patient demographics, Mirel's criteria, tumor biology, all previous radiation and chemotherapy received, multiple pain and function scores, medications and time to fracture or time to death.
We have trained a convolutional neural network (CNN) with AP X-ray images of 546 patients with metastatic bone disease of the proximal femur. The digital X-ray data is converted into a matrix representing the color information at each pixel. Our CNN contains five convolutional layers, a fully connected layer of 512 units and a final output layer. As the information passes through successive levels of the network, higher level features are abstracted from the data. The model converges on two fully connected deep neural network layers that output the risk of fracture. This prediction is compared to the true outcome, and any errors are back-propagated through the network to accordingly adjust the weights between connections, until overall prediction accuracy is optimized. Methods to improve learning included using stochastic gradient descent with a learning rate of 0.01 and a momentum rate of 0.9. We used average classification accuracy and the average F1 score across five test sets to measure model performance. We compute F1 = 2 × (precision × recall) / (precision + recall). F1 is a measure of a model's accuracy in binary classification, in our case, whether a lesion would result in pathologic fracture or not. Our model achieved 88.2% accuracy in predicting fracture risk across five-fold cross validation testing. The F1 statistic is 0.87. This is the first reported application of convolutional neural networks, a machine learning algorithm, to this important Orthopaedic problem. Our neural network model was able to achieve reasonable accuracy in classifying fracture risk of metastatic proximal femur lesions from analysis of X-rays and clinical information. Our future work will aim to externally validate this algorithm on an international cohort.
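The F1 formula quoted in the abstract can be sketched in a few lines. This is only an illustration: the function names and the per-fold counts of true positives, false positives and false negatives are hypothetical, and the abstract does not report the underlying confusion matrices.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2 * (precision * recall) / (precision + recall)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return 0.0  # convention used here when no positives are found
    return 2 * precision * recall / (precision + recall)


def mean_f1(folds):
    """Average F1 over test folds, each given as a (tp, fp, fn) tuple."""
    return sum(f1_score(*fold) for fold in folds) / len(folds)


# Hypothetical counts for two folds, not taken from the study.
example_folds = [(8, 2, 2), (9, 1, 3)]
print(round(mean_f1(example_folds), 3))
```

Averaging per-fold F1 scores, as done here, is one common reading of "average F1 score across five test sets"; pooling the counts before computing F1 would be another and can give a slightly different value.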

Orthopaedic Proceedings

Vol. 102-B, Issue SUPP_7 | Pages 96 - 96

1 Jul 2020

A CONVOLUTIONAL NEURAL NETWORK (CNN) FOR PREDICTING FRACTURE RISK IN METASTATIC BONE DISEASE (MBD) OF THE PROXIMAL FEMUR

Bozzo A Ghert M

Advances in cancer therapy have prolonged cancer patient survival even in the presence of disseminated disease and an increasing number of cancer patients are living with metastatic bone disease (MBD). The proximal femur is the most common long bone involved in MBD and pathologic fractures of the femur are associated with significant morbidity, mortality and loss of quality of life (QoL). Successful prophylactic surgery for an impending fracture of the proximal femur has been shown in multiple cohort studies to result in a greater likelihood of walking after surgery, longer survival, lower transfusion rates and shorter post-operative hospital stays. However, there is currently no optimal method to predict a pathologic fracture. The best-known tool, Mirel's criteria, was established in 1989 but is of limited use in guiding clinical practice because of its poor specificity and sensitivity. The goal of our study is to train a convolutional neural network (CNN) to predict fracture risk when metastatic bone disease is present in the proximal femur. Our fracture risk prediction tool was developed by analysis of prospectively collected data for MBD patients (2009–2016) in order to determine which features are most commonly associated with fracture. Patients with primary bone tumors, pathologic fractures at initial presentation, and hematologic malignancies were excluded. A total of 1146 patients comprising 224 pathologic fractures were included. Every patient had at least one anteroposterior (AP) X-ray. The clinical data includes patient demographics, tumor biology, all previous radiation and chemotherapy received, multiple pain and function scores, medications and time to fracture or time to death. Each of Mirel's criteria has been further subdivided and recorded for each lesion. We have trained a convolutional neural network (CNN) with X-ray images of 1146 patients with metastatic bone disease of the proximal femur.
The digital X-ray data is converted into a matrix representing the color information at each pixel. Our CNN contains five convolutional layers, a fully connected layer of 512 units and a final output layer. As the information passes through successive levels of the network, higher level features are abstracted from the data. This model converges on two fully connected deep neural network layers that output the fracture risk. This prediction is compared to the true outcome, and any errors are back-propagated through the network to accordingly adjust the weights between connections. Methods to improve learning included using stochastic gradient descent with a learning rate of 0.01 and a momentum rate of 0.9. We used average classification accuracy and the average F1 score across test sets to measure model performance. We compute F1 = 2 × (precision × recall) / (precision + recall). F1 is a measure of a test's accuracy in binary classification, in our case, whether a lesion would result in pathologic fracture or not. Five-fold cross validation testing of our fully trained model revealed accurate classification for 88.2% of patients with metastatic bone disease of the proximal femur. The F1 statistic is 0.87. This represents a 24% error reduction from using Mirel's criteria alone to classify the risk of fracture in this cohort. This is the first reported application of convolutional neural networks, a machine learning algorithm, to an important Orthopaedic problem. Our neural network model was able to achieve impressive accuracy in classifying fracture risk of metastatic proximal femur lesions from analysis of X-rays and clinical information. Our future work will aim to validate this algorithm on an external cohort.
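The optimiser settings mentioned above (learning rate 0.01, momentum 0.9) can be illustrated with a minimal sketch of one common form of stochastic gradient descent with momentum. The abstract does not say which momentum formulation or framework was used, so the update rule, the function names and the toy quadratic objective below are all assumptions made for illustration.

```python
def sgd_momentum_step(weights, grads, velocity, lr=0.01, momentum=0.9):
    """One update of classical momentum: v <- momentum*v - lr*g; w <- w + v."""
    new_velocity = [momentum * v - lr * g for v, g in zip(velocity, grads)]
    new_weights = [w + v for w, v in zip(weights, new_velocity)]
    return new_weights, new_velocity


def minimise_quadratic(start=0.0, target=3.0, steps=500):
    """Toy objective f(w) = (w - target)^2, standing in for a network loss."""
    w, v = [start], [0.0]
    for _ in range(steps):
        grad = [2.0 * (w[0] - target)]  # derivative of (w - target)^2
        w, v = sgd_momentum_step(w, grad, v)
    return w[0]


print(minimise_quadratic())  # approaches the minimiser, 3.0
```

In a real network the gradient would come from back-propagating the prediction error on a mini-batch rather than from a closed-form derivative.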

Bone & Joint Open

Vol. 2, Issue 10 | Pages 879 - 885

20 Oct 2021

An increasing number of convolutional neural networks for fracture recognition and classification in orthopaedics

Oliveira e Carmo L van den Merkhof A Olczak J Gordon M Jutte PC Jaarsma RL IJpma FFA Doornberg JN Prijs J

Aims. The number of convolutional neural networks (CNN) available for fracture detection and classification is rapidly increasing. External validation of a CNN on a temporally separate (separated by time) or geographically separate (separated by location) dataset is crucial to assess generalizability of the CNN before application to clinical practice in other institutions. We aimed to answer the following questions: are current CNNs for fracture recognition externally valid; which methods are applied for external validation (EV); and what are the reported performances of the EV sets compared to the internal validation (IV) sets of these CNNs? Methods. The PubMed and Embase databases were systematically searched from January 2010 to October 2020 according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement. The type of EV, characteristics of the external dataset, and diagnostic performance characteristics on the IV and EV datasets were collected and compared. Quality assessment was conducted using a seven-item checklist based on a modified Methodologic Index for NOn-Randomized Studies instrument (MINORS). Results. Out of 1,349 studies, 36 reported development of a CNN for fracture detection and/or classification. Of these, only four (11%) reported a form of EV. One study used temporal EV, one conducted both temporal and geographical EV, and two used geographical EV. When comparing the CNN’s performance on the IV set versus the EV set, the following were found: AUCs of 0.967 (IV) versus 0.975 (EV), 0.976 (IV) versus 0.985 to 0.992 (EV), 0.93 to 0.96 (IV) versus 0.80 to 0.89 (EV), and F1-scores of 0.856 to 0.863 (IV) versus 0.757 to 0.840 (EV). Conclusion. The number of externally validated CNNs in orthopaedic trauma for fracture recognition is still scarce. This greatly limits the potential for transfer of these CNNs from the developing institute to another hospital to achieve similar diagnostic performance.
We recommend the use of geographical EV and statements such as the Consolidated Standards of Reporting Trials–Artificial Intelligence (CONSORT-AI), the Standard Protocol Items: Recommendations for Interventional Trials–Artificial Intelligence (SPIRIT-AI) and the Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis–Machine Learning (TRIPOD-ML) to critically appraise performance of CNNs and improve methodological rigor, quality of future models, and facilitate eventual implementation in clinical practice. Cite this article: Bone Jt Open 2021;2(10):879–885
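The distinction between temporal and geographical external validation described above can be made concrete with a small sketch. The record layout (a `site` and a `date` per radiograph), the function name and the example values are hypothetical and are not drawn from any of the reviewed studies.

```python
from datetime import date


def split_external(records, cutoff=None, dev_sites=None):
    """Separate development (internal) records from external validation records.

    A record is treated as external if it was acquired on or after `cutoff`
    (temporal EV) or at a site outside `dev_sites` (geographical EV).
    """
    internal, external = [], []
    for rec in records:
        is_later = cutoff is not None and rec["date"] >= cutoff
        is_elsewhere = dev_sites is not None and rec["site"] not in dev_sites
        (external if (is_later or is_elsewhere) else internal).append(rec)
    return internal, external


# Hypothetical records for illustration only.
records = [
    {"id": 1, "site": "A", "date": date(2017, 5, 1)},
    {"id": 2, "site": "A", "date": date(2019, 3, 9)},
    {"id": 3, "site": "B", "date": date(2017, 8, 2)},
]
internal, temporal_ev = split_external(records, cutoff=date(2019, 1, 1))
_, geographical_ev = split_external(records, dev_sites={"A"})
print([r["id"] for r in temporal_ev], [r["id"] for r in geographical_ev])
```

Passing both `cutoff` and `dev_sites` gives a combined temporal and geographical hold-out, the design reported by one of the four studies.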

Orthopaedic Proceedings

Vol. 90-B, Issue SUPP_I | Pages 35 - 36

1 Mar 2008

BACK SURFACE ASSESSMENT OF SCOLIOSIS SEVERITY BY NEURAL NETWORK.

Jaremko J Hill D Moreau M Zernicke R

Recent studies have shown that scoliotic deformity can be estimated accurately from deformity of the full three hundred and sixty degrees torso shape. However, acquisition of these data requires an expensive multi-scanner system. If it were possible to estimate scoliosis accurately from the back surface shape alone, a single scanner and simplified analysis methods could be used. Here, we estimated the Cobb angle within ten degrees in 84% of forty-six patients from back surface data, compared to 99% within ten degrees for a previous, larger study using the entire torso shape. These results suggested that both back-surface and full-torso models for Cobb angle estimation should be pursued for their potential merits. The surface deformity of scoliosis, often the primary patient complaint, progresses non-linearly with the underlying spinal deformity. If it were possible to reliably estimate the degree of scoliosis from the surface, adolescent patients with non-progressing scoliosis could be spared harmful X-ray radiation. Some of us have previously estimated the scoliotic Cobb angle from three hundred and sixty degrees torso surface deformity. Here, we tested how accurately the Cobb angle could be estimated from back surface data alone, which are easier and less expensive to obtain than full-torso data. A genetic algorithm selected the clinical parameters to be used by a neural network to estimate scoliosis deformity from back surface deformity. We had forty-six consecutive patients with right-thoracic curves (Cobb angles eleven to ninety-seven degrees), in whom fifteen indices were available including age, height, bracing status, scoliometer reading, back surface rotation, and cosmetic score of landmark asymmetry.
Those data were used by a neural network to estimate the Cobb angle within ten degrees in 84% of patients, a 30% improvement over regression-model accuracy, though less accurate than use of the three hundred and sixty degrees torso shape which estimated up to 99% of curves within ten degrees in a previous study. Neural network predictive accuracy was better when using the full three hundred and sixty degrees torso shape, but the simpler and more economical acquisition of back surface data alone also gave promising results. This pilot comparison study suggested that both models (using back surface data alone vs. using three hundred and sixty degrees torso data) should continue to be developed in attempts to optimize surface estimation of scoliosis
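The headline metric in this abstract, the share of patients whose estimated Cobb angle falls within ten degrees of the radiographic value, is simple to compute. The sketch below is illustrative only; the function name and the example angles are hypothetical, not study data.

```python
def fraction_within(estimates, references, tolerance=10.0):
    """Fraction of estimates lying within `tolerance` degrees of the reference."""
    if len(estimates) != len(references) or not estimates:
        raise ValueError("need two non-empty sequences of equal length")
    hits = sum(abs(e - r) <= tolerance for e, r in zip(estimates, references))
    return hits / len(estimates)


# Hypothetical Cobb angles in degrees: network estimate vs radiograph.
estimated = [20.0, 35.0, 50.0, 90.0]
measured = [25.0, 50.0, 45.0, 97.0]
print(fraction_within(estimated, measured))  # 3 of 4 within ten degrees
```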

Orthopaedic Proceedings

Vol. 103-B, Issue SUPP_16 | Pages 76 - 76

1 Dec 2021

PROBABILISTIC NEURAL NETWORK ESTIMATION OF POSTOPERATIVE PATIENT-REPORTED OUTCOME MEASURES AFTER HIP AND KNEE ARTHROPLASTY SURGERIES

de Mello FL Kadirkamanathan V Wilkinson JM

Abstract. Objectives. Conventional approaches (including Tobit) do not accurately account for ceiling effects in PROMs nor give uncertainty estimates. Here, a classifier neural network was used to estimate postoperative PROMs prior to surgery and compared with conventional methods. The Oxford Knee Score (OKS) and the Oxford Hip Score (OHS) were estimated with separate models. Methods. English NJR data from 2009 to 2018 was used, with 278,655 knee and 249,634 hip replacements. For both OKS and OHS estimations, the input variables included age, BMI, surgery date, sex, ASA, thromboprophylaxis, anaesthetic and preoperative PROMs responses. Bearing, fixation, head size and approach were also included for OHS and knee type for OKS estimation. A classifier neural network (NN) was compared with linear or Tobit regression, XGB and regression NN. The performance metrics were the root mean square error (RMSE), mean absolute error (MAE) and area under curve (AUC). 95% confidence intervals were computed using 5-fold cross-validation. Results. The classifier NN and regression NN had the best RMSE, both with the same scores of 8.59±0.04 for knee and 7.88±0.04 for hip. The classifier NN had the best MAE, with 6.73±0.03 for knee and 5.73±0.03 for hip. The Tobit model was second, with 6.86±0.03 for knee and 6.00±0.01 for hip. The classifier NN had the best AUC, with (68.7±0.4)% for knee and (73.9±0.3)% for hip. The regression NN was second, with (67.1±0.3)% for knee and (71.1±0.4)% for hip. The Tobit model had the best AUC among conventional approaches, with (66.8±0.3)% for knee and (71.0±0.4)% for hip. Conclusions. The proposed model resulted in an improvement from the current state-of-the-art. Additionally, it estimates the full probability distribution of the postoperative PROMs, making it possible to know not only the estimated value but also its uncertainty.
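To show how a classifier output can carry both an estimate and its uncertainty, the sketch below summarises a vector of class probabilities over the possible scores. It assumes one class per integer score from 0 to 48 (the Oxford score range); the function name, the 95% coverage and the example distribution are illustrative assumptions, and the abstract does not describe the authors' post-processing.

```python
def summarise_distribution(probs, coverage=0.95):
    """Return (mean, standard deviation, central interval) of a score distribution.

    `probs[s]` is the predicted probability of postoperative score `s`.
    """
    total = sum(probs)
    p = [x / total for x in probs]  # renormalise defensively
    mean = sum(s * q for s, q in enumerate(p))
    var = sum(q * (s - mean) ** 2 for s, q in enumerate(p))
    tail = (1.0 - coverage) / 2.0
    cum, lower, upper = 0.0, None, None
    for s, q in enumerate(p):
        cum += q
        if lower is None and cum >= tail:
            lower = s
        if upper is None and cum >= 1.0 - tail:
            upper = s
    if upper is None:  # guard against floating-point shortfall in the sum
        upper = len(p) - 1
    return mean, var ** 0.5, (lower, upper)


# Hypothetical prediction: a flat distribution over all 49 possible scores.
uniform = [1.0] * 49
print(summarise_distribution(uniform))
```

Because the classes are bounded at 48, a distribution of this kind cannot place probability above the ceiling, which is one way a classifier can respect the ceiling effect the abstract mentions.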

The Bone & Joint Journal

Vol. 103-B, Issue 8 | Pages 1358 - 1366

2 Aug 2021

Artificial neural network prediction of same-day discharge following primary total knee arthroplasty based on preoperative and intraoperative variables

Wei C Quan T Wang KY Gu A Fassihi SC Kahlenberg CA Malahias M Liu J Thakkar S Gonzalez Della Valle A Sculco PK

Aims. This study used an artificial neural network (ANN) model to determine the most important pre- and perioperative variables to predict same-day discharge in patients undergoing total knee arthroplasty (TKA). Methods. Data for this study were collected from the National Surgery Quality Improvement Program (NSQIP) database from the year 2018. Patients who received a primary, elective, unilateral TKA with a diagnosis of primary osteoarthritis were included. Demographic, preoperative, and intraoperative variables were analyzed. The ANN model was compared to a logistic regression model, which is a conventional machine-learning algorithm. Variables collected from 28,742 patients were analyzed based on their contribution to hospital length of stay. Results. The predictability of the ANN model, area under the curve (AUC) = 0.801, was similar to the logistic regression model (AUC = 0.796) and identified certain variables as important factors to predict same-day discharge. The ten most important factors favouring same-day discharge in the ANN model include preoperative sodium, preoperative international normalized ratio, BMI, age, anaesthesia type, operating time, dyspnoea status, functional status, race, anaemia status, and chronic obstructive pulmonary disease (COPD). Six of these variables were also found to be significant on logistic regression analysis. Conclusion. Both ANN modelling and logistic regression analysis revealed clinically important factors in predicting patients who can safely undergo same-day discharge from an outpatient TKA. The ANN model provides a beneficial approach to help determine which perioperative factors can predict same-day discharge as of 2018 perioperative recovery protocols. Cite this article: Bone Joint J 2021;103-B(8):1358–1366
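The AUC values used to compare the two models can be read as the probability that a randomly chosen same-day-discharge patient receives a higher predicted score than a randomly chosen patient who stayed longer. A minimal sketch of that pairwise definition follows; the function name, labels and scores are hypothetical, and this is not the software used in the study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Ties count as half a correct pair. Runs in O(n_pos * n_neg), which is
    fine for a small illustration but slow for tens of thousands of patients.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = same-day discharge) and model scores.
print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```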

Orthopaedic Proceedings

Vol. 102-B, Issue SUPP_1 | Pages 133 - 133

1 Feb 2020

DETECTING MECHANICAL LOOSENING OF TOTAL HIP ARTHROPLASTY USING DEEP CONVOLUTIONAL NEURAL NETWORK

Borjali A Chen A Muratoglu O Varadarajan K

INTRODUCTION. Mechanical loosening of total hip replacement (THR) is primarily diagnosed using radiographs, which are diagnostically challenging and require review by experienced radiologists and orthopaedic surgeons. Automated tools that assist less-experienced clinicians and mitigate human error can reduce the risk of missed or delayed diagnosis. Thus the purposes of this study were to: 1) develop an automated tool to detect mechanical loosening of THR by training a deep convolutional neural network (CNN) using THR x-rays, and 2) visualize the CNN training process to interpret how it functions. METHODS. A retrospective study was conducted using previously collected imaging data at a single institution with IRB approval. Twenty-three patients with cementless primary THR who underwent revision surgery due to mechanical loosening (either with a loose stem and/or a loose acetabular component) had their hip x-rays evaluated immediately prior to their revision surgery (32 “loose” x-rays). A comparison group was comprised of 23 patients who underwent primary cementless THR surgery with x-rays immediately after their primary surgery (31 “not loose” x-rays). Fig. 1 shows examples of “not loose” and “loose” THR x-ray. DenseNet201-CNN was utilized by swapping the top layer with a binary classifier using 90:10 split-validation [1]. Pre-trained CNN on ImageNet [2] and not pre-trained CNN (initial zero weights) were implemented to compare the results. Saliency maps were implemented to indicate the importance of each pixel of a given x-ray on the CNN's performance [3]. RESULTS. Fig. 2 shows the saliency maps for an example x-ray and the corresponding accuracy of the CNN on the entire validation dataset at different stages of the training for both pre-trained (Fig. 2a) and not pre-trained (Fig. 2b) CNNs. Colored regions in the saliency maps, where red denotes higher relative influence than blue, indicate the most influential regions on the CNN's performance. 
The pre-trained CNN achieved higher accuracy (87%) on the validation set x-rays than the not pre-trained CNN (62%) after 10 epochs. The pre-trained CNN's saliency map at 10 epochs identified significant influence of bone-implant interaction regions on the CNN's performance. This indicates that the CNN is ‘looking’ at the clinically relevant features in the x-rays. The saliency maps also demonstrated that the pre-trained CNN quickly learned where to ‘look’, while the not pre-trained CNN struggled. DISCUSSION. An automated tool to detect mechanical loosening of THR was developed that can potentially assist clinicians with accurate diagnosis. By visualizing the influential regions of the x-ray on the CNN performance, this study shed light on the CNN learning process and demonstrated that the CNN is ‘looking’ at the clinically relevant features to classify the x-rays. This visualization is crucial to build trust in the automated system by interpreting how it functions to increase the confidence in the application of artificial intelligence to the field of orthopaedics. This study also demonstrated that pre-training a CNN can accelerate the learning process and achieve high accuracy even on a small dataset. For any figures or tables, please contact the authors directly.

Orthopaedic Proceedings

Vol. 102-B, Issue SUPP_1 | Pages 129 - 129

1 Feb 2020

TOWARDS REAL-TIME SIMULATION: NEURAL NETWORK REPRESENTATION OF TOTAL KNEE ARTHROPLASTY-IMPLANTED LOWER EXTREMITY MODEL

Maag C Langhorn J Rullkoetter P

INTRODUCTION. While computational models have been used for many years to contribute to pre-clinical, design phase iterations of total knee replacement implants, the analysis time required has limited their real-time use, as required for other applications such as patient-specific surgical alignment in the operating room. In this environment, the impact of variation in ligament balance and implant alignment on estimated joint mechanics must be available instantaneously. As neural networks (NN) have shown the ability to appropriately represent dynamic systems, the objective of this preliminary study was to evaluate deep learning to represent the joint level kinetic and kinematic results from a validated finite element lower limb model with varied surgical alignment. METHODS. External hip and ankle boundary conditions were created for a previously-developed finite element lower limb model [1] for step down (SD), deep knee bend (DKB) and gait to best reproduce in-vivo loading conditions as measured on patients with the Innex knee (orthoload.com) (Figure 1). These boundary conditions were subsequently used as inputs for the model with a current fixed-bearing total knee replacement to estimate implant-specific kinetics and kinematics during activities of daily living. Implant alignments were varied, including variation of the hip-knee-ankle angle (±3°), the frontal plane joint line (−7° to +5°), internal-external femoral rotation (±3°), and the tibial posterior slope (5° and 0°). Through varying these parameters a total of 2464 simulations were completed. A NN was created utilizing the NN toolbox in MATLAB. Sequence data inputs were produced from the alignment and the external boundary conditions for each activity cycle. Sequence outputs for the model were the 6 degree of freedom kinetics and kinematics, totaling 12 outputs. All data was normalized across the entire data set.
Ten percent of the simulation runs were removed at random from the training set to be used for validation, leaving 2220 simulations for training and 244 for validation. A nine-layer bidirectional long short-term memory (bi-LSTM) NN was created to take advantage of the bi-LSTM layers' ability to learn from past and future data. Training on the network was undertaken using an RMSprop solver until the root mean square error (RMSE) stopped reducing. Evaluation of NN quality was determined by the RMSE of the validation set. RESULTS. The trained NN was able to effectively estimate the validation data. Average RMSE over the kinetics of the validation data set was 140.7 N/N·m while the average RMSE over the kinematics of the validation data set was 4.47 mm/deg (Figures 2 and 3; DKB and gait shown). It is noted the error may be skewed by the larger magnitude kinetics and kinematics in the DKB activity as the average RMSE for just SD and gait was 85.9 N/N·m and 2.8 mm/deg for the kinetics and kinematics, respectively. DISCUSSION. The accuracy of the generated NN indicates its potential for use in real-time modeling, and further work will explore additional changes in post-operative soft-tissue balance as well as scaling to patient-specific geometry.
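The random hold-out and the RMSE criterion described above can be sketched as follows. The study itself used MATLAB; this Python version, its function names and its fixed seed are illustrative assumptions, and only the run counts (2464 total, 244 held out) come from the abstract.

```python
import math
import random


def holdout_split(n_runs, n_validation, seed=0):
    """Randomly reserve `n_validation` run indices; the rest are for training."""
    indices = list(range(n_runs))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    return sorted(indices[n_validation:]), sorted(indices[:n_validation])


def rmse(predicted, target):
    """Root mean square error between two equal-length sequences."""
    if len(predicted) != len(target) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(p - t) ** 2 for p, t in zip(predicted, target)]
    return math.sqrt(sum(squared) / len(squared))


train_idx, val_idx = holdout_split(2464, 244)
print(len(train_idx), len(val_idx))  # 2220 244
```

For sequence outputs such as the 12 kinetic and kinematic channels, the same `rmse` function could be applied per channel after flattening each simulated activity cycle.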

Bone & Joint Research

Vol. 12, Issue 7 | Pages 447 - 454

10 Jul 2023

Artificial intelligence in orthopaedic surgery

Lisacek-Kiosoglous AB Powling AS Fontalis A Gabr A Mazomenos E Haddad FS

The use of artificial intelligence (AI) is rapidly growing across many domains, of which the medical field is no exception. AI is an umbrella term defining the practical application of algorithms to generate useful output, without the need of human cognition. Owing to the expanding volume of patient information collected, known as ‘big data’, AI is showing promise as a useful tool in healthcare research and across all aspects of patient care pathways. Practical applications in orthopaedic surgery include: diagnostics, such as fracture recognition and tumour detection; predictive models of clinical and patient-reported outcome measures, such as calculating mortality rates and length of hospital stay; and real-time rehabilitation monitoring and surgical training. However, clinicians should remain cognizant of AI’s limitations, as the development of robust reporting and validation frameworks is of paramount importance to prevent avoidable errors and biases. The aim of this review article is to provide a comprehensive understanding of AI and its subfields, as well as to delineate its existing clinical applications in trauma and orthopaedic surgery. Furthermore, this narrative review expands upon the limitations of AI and future direction.

Cite this article: Bone Joint Res 2023;12(7):447–454.

The Bone & Joint Journal

Vol. 104-B, Issue 8 | Pages 911 - 914

1 Aug 2022

Artificial intelligence and computer vision in orthopaedic trauma

Prijs J Liao Z Ashkani-Esfahani S Olczak J Gordon M Jayakumar P Jutte PC Jaarsma RL IJpma FFA Doornberg JN

Artificial intelligence (AI) is, in essence, the concept of ‘computer thinking’, encompassing methods that train computers to perform and learn from executing certain tasks, called machine learning, and methods to build intricate computer models that both learn and adapt, called complex neural networks. Computer vision is a function of AI by which machine learning and complex neural networks can be applied to enable computers to capture, analyze, and interpret information from clinical images and visual inputs. This annotation summarizes key considerations and future perspectives concerning computer vision, questioning the need for this technology (the ‘why’), the current applications (the ‘what’), and the approach to unlocking its full potential (the ‘how’). Cite this article: Bone Joint J 2022;104-B(8):911–914

Orthopaedic Proceedings

Vol. 105-B, Issue SUPP_12 | Pages 85 - 85

23 Jun 2023

USING CLASSIFIER NEURAL NETWORKS TO ESTIMATE PERSONALIZED PATIENT-REPORTED OUTCOME MEASURES AFTER HIP ARTHROPLASTY

de Mello F Kadirkamanathan V Wilkinson JM

Successful estimation of postoperative PROMs prior to a joint replacement surgery is important in deciding the best treatment option for a patient. However, estimation of the outcome is associated with substantial noise around individual prediction. Here, we test whether a classifier neural network can be used to simultaneously estimate postoperative PROMs and uncertainty better than current methods. We perform Oxford hip score (OHS) estimation using data collected by the NJR from 249,634 hip replacement surgeries performed from 2009 to 2018. The root mean square error (RMSE) of each method is compared to the standard deviation of the outcome change distribution to measure the proportion of the total outcome variability that the model can capture. The area under the curve (AUC) for the probability of the change score being above a certain threshold was also plotted. The proposed classifier NN had an RMSE better than or equivalent to that of all other currently used models. The threshold AUC shows similar results for all methods close to a change score of 20 but demonstrates better accuracy of the classifier neural network close to 0 change and greater than 30 change, showing that the full probability distribution performed by the classifier neural network resulted in a significant improvement in estimating the upper and lower quantiles of the change score probability distribution. Consequently, probabilistic estimation as performed by the classifier NN is the most adequate approach to this problem, since the final score has an important component of uncertainty. This study shows the importance of uncertainty estimation to accompany postoperative PROMs prediction and presents a clinically-meaningful method for personalised outcome that includes such uncertainty estimation.

Orthopaedic Proceedings

Vol. 104-B, Issue SUPP_4 | Pages 5 - 5

1 Apr 2022

USING CLASSIFIER NEURAL NETWORKS TO ESTIMATE PERSONALIZED PATIENT-REPORTED OUTCOME MEASURES AFTER HIP ARTHROPLASTY

de Mello F Kadirkamanathan V Wilkinson M

Successful estimation of postoperative patient-reported outcome measures (PROMs) prior to joint replacement surgery is important in deciding the best treatment option for a patient. However, estimation of the outcome is associated with substantial noise around individual predictions. Here, we test whether a classifier neural network (NN) can be used to simultaneously estimate postoperative PROMs and their uncertainty better than current methods. We perform Oxford hip score (OHS) estimation using data collected by the NJR from 249,634 hip replacement surgeries performed from 2009 to 2018. The root mean square error (RMSE) of each method is compared with the standard deviation of the outcome change distribution to measure the proportion of the total outcome variability that the model can capture. The area under the curve (AUC) for the probability of the change score being above a certain threshold was also plotted. The proposed classifier NN had a better or equivalent RMSE than all other currently used models. The standard deviation of the change score for the entire population was 9.93, which can be interpreted as the RMSE that would be achieved by a model that gives the same estimate for all patients regardless of their covariates. However, most of the variation in the postoperative OHS/OKS change score is not captured by the models, confirming the importance of accurate uncertainty estimation. The threshold AUC shows similar results for all methods close to a change score of 20 but demonstrates better accuracy of the classifier NN close to 0 change and above 30 change, showing that estimating the full probability distribution, as the classifier NN does, significantly improved estimation of the upper and lower quantiles of the change score probability distribution. Consequently, probabilistic estimation as performed by the classifier NN is the most suitable approach to this problem, since the final score has an important component of uncertainty.
This study shows the importance of uncertainty estimation to accompany postoperative PROMs prediction and presents a clinically meaningful method for personalised outcome prediction that includes such uncertainty estimation.
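
The baseline used in the abstract above, where the population standard deviation of the change score is read as the RMSE of a model that gives every patient the same estimate, can be checked with a minimal sketch. The change scores and the per-patient distribution below are invented for illustration and are not NJR data; `rmse` and `prob_above` are hypothetical helper names, and the sketch does not reproduce the study's classifier.

```python
import math
import statistics


def rmse(predictions, targets):
    """Root mean square error between paired predictions and targets."""
    squared = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return math.sqrt(sum(squared) / len(squared))


def prob_above(class_probs, threshold):
    """Probability that the change score exceeds `threshold`, given a
    classifier-style output mapping each candidate score to a probability."""
    return sum(p for score, p in class_probs.items() if score > threshold)


# Hypothetical OHS change scores (illustrative only).
changes = [5, 12, 18, 20, 22, 25, 30, 31]

# A model that predicts the population mean for every patient ...
constant_prediction = [statistics.fmean(changes)] * len(changes)
baseline_rmse = rmse(constant_prediction, changes)

# ... has an RMSE equal to the population standard deviation.
population_sd = statistics.pstdev(changes)

# Hypothetical predicted distribution over change scores for one patient.
patient_probs = {0: 0.05, 10: 0.15, 20: 0.50, 30: 0.30}
p_above_15 = prob_above(patient_probs, 15)
```

A model is only informative to the extent that its RMSE falls below this baseline, and a full predicted distribution lets threshold probabilities such as `p_above_15` be read off directly rather than inferred from a point estimate.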

Orthopaedic Proceedings

Vol. 106-B, Issue SUPP_18 | Pages 17 - 17

14 Nov 2024

IMAGE SEGMENTATION OF ECTOPIC BONE FORMATION USING A DEEP CONVOLUTIONAL NETWORK: ANALYSIS PROTOCOL AND MATERIAL

Kjærgaard K Ding M Mansourvar M

Full Access

Introduction. Experimental bone research often generates large amounts of histology and histomorphometry data, and the analysis of these data can be time-consuming and trivial. Machine learning offers a viable alternative to manual analysis for measuring, e.g., bone volume versus total volume. The objective was to develop a neural network for image segmentation, and to assess the accuracy of this network when applied to ectopic bone formation samples compared to a ground truth. Method. Thirteen tissue slides totaling 114 megapixels of ectopic bone formation were selected for model building. Slides were split into training, validation, and test data, with the test data reserved and used only for the final model assessment. We developed a neural network resembling U-Net that takes 512×512 pixel tiles. To improve model robustness, images were augmented online during training. The network was trained for 3 days on an NVIDIA Tesla K80 provided by a free online learning platform, against ground truth masks annotated by an experienced researcher. Result. During training, the validation accuracy improved and stabilised at approx. 95%. The test accuracy was 96.1%. Conclusion. Most experiments using ectopic bone formation will yield an inter-observer or inter-method variance of far more than 5%, so the current approach may be a valid and feasible technique for automated image segmentation for large datasets. More data or a consensus-based ground truth may improve training stability and validation accuracy. The code and data of this project are available upon request and will be available online as part of our publication.
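
As a rough illustration of the tiling and accuracy steps mentioned in the abstract above, the sketch below cuts an image into 512×512 tiles and scores a predicted mask against a ground-truth mask. It assumes non-overlapping tiles and pixel-wise accuracy, neither of which the abstract specifies; `tile_origins` and `pixel_accuracy` are hypothetical names, and the tiny masks are invented.

```python
def tile_origins(height, width, tile=512):
    """Top-left corners of non-overlapping tile x tile windows that fit
    entirely inside an image of the given size."""
    return [
        (row, col)
        for row in range(0, height - tile + 1, tile)
        for col in range(0, width - tile + 1, tile)
    ]


def pixel_accuracy(predicted, truth):
    """Fraction of pixels whose predicted label matches the ground truth."""
    total = 0
    correct = 0
    for pred_row, true_row in zip(predicted, truth):
        for p, t in zip(pred_row, true_row):
            total += 1
            correct += int(p == t)
    return correct / total


# A 1024 x 1536 image yields 2 rows x 3 columns of 512 x 512 tiles.
origins = tile_origins(1024, 1536)

# Toy binary masks (1 = bone, 0 = background); illustrative only.
truth = [[0, 0, 1, 1], [0, 1, 1, 1]]
predicted = [[0, 0, 1, 0], [0, 1, 1, 1]]
accuracy = pixel_accuracy(predicted, truth)  # 7 of 8 pixels agree
```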

The Bone & Joint Journal

Vol. 102-B, Issue 6 Supple A | Pages 101 - 106

1 Jun 2020

Incremental inputs improve the automated detection of implant loosening using machine-learning algorithms

Shah RF Bini SA Martinez AM Pedoia V Vail TP

Access Required

Aims. The aim of this study was to evaluate the ability of a machine-learning algorithm to diagnose prosthetic loosening from preoperative radiographs and to investigate the inputs that might improve its performance. Methods. A group of 697 patients underwent a first-time revision of a total hip (THA) or total knee arthroplasty (TKA) at our institution between 2012 and 2018. Preoperative anteroposterior (AP) and lateral radiographs, and historical and comorbidity information were collected from their electronic records. Each patient was defined as having loose or fixed components based on the operation notes. We trained a series of convolutional neural network (CNN) models to predict a diagnosis of loosening at the time of surgery from the preoperative radiographs. We then added historical data about the patients to the best performing model to create a final model and tested it on an independent dataset. Results. The convolutional neural network we built performed well when detecting loosening from radiographs alone. The first model built de novo with only the radiological image as input had an accuracy of 70%. The final model, which was built by fine-tuning a publicly available model named DenseNet, combining the AP and lateral radiographs, and incorporating information from the patient’s history, had an accuracy, sensitivity, and specificity of 88.3%, 70.2%, and 95.6%, respectively, on the independent test dataset. It performed better for cases of revision THA, with an accuracy of 90.1%, than for cases of revision TKA, with an accuracy of 85.8%. Conclusion. This study showed that machine learning can detect prosthetic loosening from radiographs. Its accuracy is enhanced when using highly trained public algorithms, and when adding clinical data to the algorithm. While this algorithm may not be sufficient in its present state of development as a standalone metric of loosening, it is currently a useful aid to clinical decision-making.
Cite this article: Bone Joint J 2020;102-B(6 Supple A):101–106
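
The three figures reported in the abstract above (accuracy, sensitivity, specificity) are linked through the share of loose cases in the test set, which a short sketch can make explicit. The confusion-matrix counts below are invented; `confusion_metrics` and `implied_prevalence` are hypothetical helpers, and the prevalence derived from the published percentages is only an approximation based on rounded values, not a figure reported by the authors.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, and specificity from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # loose implants correctly flagged
    specificity = tn / (tn + fp)  # fixed implants correctly cleared
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return accuracy, sensitivity, specificity


def implied_prevalence(accuracy, sensitivity, specificity):
    """Solve accuracy = p * sensitivity + (1 - p) * specificity for p,
    the fraction of positive (loose) cases."""
    return (specificity - accuracy) / (specificity - sensitivity)


# Hypothetical counts: 50 loose and 100 fixed implants.
acc, sens, spec = confusion_metrics(tp=40, fp=10, tn=90, fn=10)
toy_prevalence = implied_prevalence(acc, sens, spec)

# Applying the same identity to the rounded percentages in the abstract.
reported_prevalence = implied_prevalence(0.883, 0.702, 0.956)
```

Because accuracy is a prevalence-weighted average of sensitivity and specificity, a high accuracy can coexist with a noticeably lower sensitivity when fixed components are the majority class.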

Bone & Joint Research

Vol. 13, Issue 10 | Pages 588 - 595

17 Oct 2024

Artificial intelligence in traumatology

Breu R Avelar C Bertalan Z Grillari J Redl H Ljuhar R Quadlbauer S Hausner T

Full Access

Aims. The aim of this study was to create artificial intelligence (AI) software with the purpose of providing a second opinion to physicians to support distal radius fracture (DRF) detection, and to compare the accuracy of fracture detection of physicians with and without software support. Methods. The dataset consisted of 26,121 anonymized anterior-posterior (AP) and lateral standard view radiographs of the wrist, with and without DRF. The convolutional neural network (CNN) model was trained to detect the presence of a DRF by comparing the radiographs containing a fracture to the unremarkable ones. A total of 11 physicians (six surgeons in training and five hand surgeons) assessed 200 pairs of randomly selected digital radiographs of the wrist (AP and lateral) for the presence of a DRF. The same images were first evaluated without, and then with, the support of the CNN model, and the diagnostic accuracy of the two methods was compared. Results. At the time of the study, the CNN model showed an area under the receiver operating characteristic curve of 0.97. AI assistance improved the physicians’ sensitivity (correct fracture detection) from 80% to 87%, and their specificity (correct fracture exclusion) from 91% to 95%. The overall error rate (combined false positive and false negative) was reduced from 14% without AI to 9% with AI. Conclusion. The use of a CNN model as a second opinion can improve the diagnostic accuracy of DRF detection in the study setting. Cite this article: Bone Joint Res 2024;13(10):588–595

Orthopaedic Proceedings

Vol. 104-B, Issue SUPP_12 | Pages 90 - 90

1 Dec 2022

MACHINE-LEARNING MODELS CAN ACCURATELY PREDICT ORTHOPAEDIC RESIDENT WORKLOAD AT A LEVEL I TRAUMA CENTRE

Abbas A Toor J Du JT Versteeg A Yee N Finkelstein J Abouali J Nousiainen M Kreder H Hall J Whyne C Larouche J

Full Access

Excessive resident duty hours (RDH) are a recognized issue with implications for physician well-being and patient safety. A major component of the RDH concern is on-call duty. While considerable work has been done to reduce resident call workload, there is a paucity of research in optimizing resident call scheduling. Call coverage is scheduled manually rather than demand-based, which generally leads to over-scheduling to prevent a service gap. Machine learning (ML) has been widely applied in other industries to prevent such issues of a supply-demand mismatch. However, the healthcare field has been slow to adopt these innovations. As such, the aim of this study was to use ML models to 1) predict demand on orthopaedic surgery residents at a level I trauma centre and 2) identify variables key to demand prediction. Daily surgical handover emails over an eight-year period (2012-2019) at a level I trauma centre were collected. The following data were used to calculate demand: spine call coverage, date, and number of operating rooms (ORs), traumas, admissions and consults completed. Various ML models (linear, tree-based and neural networks) were trained to predict the workload, with their results compared to the current scheduling approach. Quality of models was determined by using the area under the receiver operating characteristic curve (AUC) and accuracy of the predictions. The top ten most important variables were extracted from the most successful model. During training, the model with the highest AUC and accuracy was the multivariate adaptive regression splines (MARS) model, with an AUC of 0.78±0.03 and accuracy of 71.7%±3.1%. During testing, the model with the highest AUC and accuracy was the neural network model, with an AUC of 0.81 and accuracy of 73.7%. All models were better than the current approach, which had an AUC of 0.50 and accuracy of 50.1%.
Key variables used by the neural network model were (descending order): spine call duty, year, weekday/weekend, month, and day of the week. This was the first study attempting to use ML to predict the service demand on orthopaedic surgery residents at a major level I trauma centre. Multiple ML models were shown to be more appropriate and accurate at predicting the demand on surgical residents as compared to the current scheduling approach. Future work should look to incorporate predictive models with optimization strategies to match scheduling with demand in order to improve resident well-being and patient care.
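
To show why an AUC of 0.50, as reported for the current scheduling approach in the abstract above, corresponds to no discrimination, the following minimal sketch computes AUC as the probability that a positive case is scored above a negative one. The labels and scores are invented, the abstract does not state how workload was turned into classes, and `auc` is a hypothetical helper rather than the study's evaluation code.

```python
def auc(scores, labels):
    """Area under the ROC curve, computed as the fraction of
    positive/negative pairs in which the positive case receives the
    higher score (ties count as half a win)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical labels: 1 = high-demand day, 0 = low-demand day.
labels = [1, 0, 1, 1, 0, 0]

# A model whose scores mostly rank high-demand days first.
informative_scores = [0.9, 0.2, 0.7, 0.4, 0.5, 0.1]
auc_informative = auc(informative_scores, labels)

# A schedule that assigns the same coverage every day cannot rank days.
constant_scores = [0.5] * len(labels)
auc_constant = auc(constant_scores, labels)
```

Any approach that does not vary with the predictors scores exactly 0.5 under this definition, which is the reference point against which the reported AUC values of 0.78 and 0.81 are read.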

Bone & Joint Open

Vol. 5, Issue 8 | Pages 671 - 680

14 Aug 2024

Is it feasible to develop a supervised learning algorithm incorporating spinopelvic mobility to predict impingement in patients undergoing total hip arthroplasty?

Fontalis A Zhao B Putzeys P Mancino F Zhang S Vanspauwen T Glod F Plastow R Mazomenos E Haddad FS

Full Access

Aims. Precise implant positioning, tailored to individual spinopelvic biomechanics and phenotype, is paramount for stability in total hip arthroplasty (THA). Despite a few studies on instability prediction, there is a notable gap in research utilizing artificial intelligence (AI). The objective of our pilot study was to evaluate the feasibility of developing an AI algorithm tailored to individual spinopelvic mechanics and patient phenotype for predicting impingement. Methods. This international, multicentre prospective cohort study across two centres encompassed 157 adults undergoing primary robotic arm-assisted THA. Impingement during specific flexion and extension stances was identified using the virtual range of motion (ROM) tool of the robotic software. The primary AI model, the Light Gradient-Boosting Machine (LGBM), used tabular data to predict impingement presence, direction (flexion or extension), and type. A secondary model integrating tabular data with plain anteroposterior pelvis radiographs was evaluated to assess for any potential enhancement in prediction accuracy. Results. We identified nine predictors from an analysis of baseline spinopelvic characteristics and surgical planning parameters. Using fivefold cross-validation, the LGBM achieved 70.2% impingement prediction accuracy. With impingement data, the LGBM estimated direction with 85% accuracy, while the support vector machine (SVM) determined impingement type with 72.9% accuracy. After integrating imaging data with a multilayer perceptron (tabular) and a convolutional neural network (radiograph), the LGBM’s prediction was 68.1%. Both combined and LGBM-only had similar impingement direction prediction rates (around 84.5%). Conclusion. This study is a pioneering effort in leveraging AI for impingement prediction in THA, utilizing a comprehensive, real-world clinical dataset. Our machine-learning algorithm demonstrated promising accuracy in predicting impingement, its type, and direction. 
While the addition of imaging data to our deep-learning algorithm did not boost accuracy, the potential for refined annotations, such as landmark markings, offers avenues for future enhancement. Prior to clinical integration, external validation and larger-scale testing of this algorithm are essential. Cite this article: Bone Jt Open 2024;5(8):671–680
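
The fivefold cross-validation mentioned in the abstract above can be sketched in a few lines for a cohort of 157 patients. This is an assumption-laden illustration: the study does not state whether folds were stratified or how they were shuffled, and `k_fold_indices` is a hypothetical helper, not part of the authors' pipeline.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Return k (train, test) index splits; each sample appears in
    exactly one test fold."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed for reproducibility
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


# 157 patients, as in the cohort described above.
splits = k_fold_indices(157, k=5)
fold_sizes = [len(test) for _, test in splits]
```

In each round the model is fitted on the training indices and scored on the held-out fold, and the reported accuracy is typically the average over the five rounds.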

Bone & Joint Research

Vol. 12, Issue 9 | Pages 512 - 521

1 Sep 2023

Predicting whether patients will achieve minimal clinically important differences following hip or knee arthroplasty

Langenberger B Schrednitzki D Halder AM Busse R Pross CM

Full Access

Aims. A substantial fraction of patients undergoing knee arthroplasty (KA) or hip arthroplasty (HA) do not achieve an improvement as high as the minimal clinically important difference (MCID), i.e. do not achieve a meaningful improvement. Using three patient-reported outcome measures (PROMs), our aim was: 1) to assess machine learning (ML), the simple pre-surgery PROM score, and logistic-regression (LR)-derived performance in their prediction of whether patients undergoing HA or KA achieve an improvement as high as or higher than a calculated MCID; and 2) to test whether ML is able to outperform LR or pre-surgery PROM scores in predictive performance. Methods. MCIDs were derived using the change difference method in a sample of 1,843 HA and 1,546 KA patients. An artificial neural network, a gradient boosting machine, least absolute shrinkage and selection operator (LASSO) regression, ridge regression, elastic net, random forest, LR, and pre-surgery PROM scores were applied to predict MCID for the following PROMs: EuroQol five-dimension, five-level questionnaire (EQ-5D-5L), EQ visual analogue scale (EQ-VAS), Hip disability and Osteoarthritis Outcome Score-Physical Function Short-form (HOOS-PS), and Knee injury and Osteoarthritis Outcome Score-Physical Function Short-form (KOOS-PS). Results. Predictive performance of the best models per outcome ranged from 0.71 for HOOS-PS to 0.84 for EQ-VAS (HA sample). ML statistically significantly outperformed LR and pre-surgery PROM scores in two out of six cases. Conclusion. MCIDs can be predicted with reasonable performance. ML was able to outperform traditional methods, although only in a minority of cases. Cite this article: Bone Joint Res 2023;12(9):512–521

The Bone & Joint Journal

Vol. 106-B, Issue 11 | Pages 1348 - 1360

1 Nov 2024

Detection, classification, and characterization of proximal humerus fractures on plain radiographs

Spek RWA Smith WJ Sverdlov M Broos S Zhao Y Liao Z Verjans JW Prijs J To M Åberg H Chiri W IJpma FFA Jadav B White J Bain GI Jutte PC van den Bekerom MPJ Jaarsma RL Doornberg JN

Access Required

Aims. The purpose of this study was to develop a convolutional neural network (CNN) for fracture detection, classification, and identification of greater tuberosity displacement ≥ 1 cm, neck-shaft angle (NSA) ≤ 100°, shaft translation, and articular fracture involvement, on plain radiographs. Methods. The CNN was trained and tested on radiographs sourced from 11 hospitals in Australia and externally validated on radiographs from the Netherlands. Each radiograph was paired with corresponding CT scans to serve as the reference standard based on dual independent evaluation by trained researchers and attending orthopaedic surgeons. Presence of a fracture, classification (non- to minimally displaced; two-part, multipart, and glenohumeral dislocation), and four characteristics were determined on 2D and 3D CT scans and subsequently allocated to each series of radiographs. Fracture characteristics included greater tuberosity displacement ≥ 1 cm, NSA ≤ 100°, shaft translation (0% to < 75%, 75% to 95%, > 95%), and the extent of articular involvement (0% to < 15%, 15% to 35%, or > 35%). Results. For detection and classification, the algorithm was trained on 1,709 radiographs (n = 803), tested on 567 radiographs (n = 244), and subsequently externally validated on 535 radiographs (n = 227). For characterization, healthy shoulders and glenohumeral dislocation were excluded. The overall accuracy for fracture detection was 94% (area under the receiver operating characteristic curve (AUC) = 0.98) and for classification 78% (AUC 0.68 to 0.93). Accuracy to detect greater tuberosity fracture displacement ≥ 1 cm was 35.0% (AUC 0.57). The CNN did not recognize NSAs ≤ 100° (AUC 0.42), nor fractures with ≥ 75% shaft translation (AUC 0.51 to 0.53), or with ≥ 15% articular involvement (AUC 0.48 to 0.49). For all objectives, the model’s performance on the external dataset showed similar accuracy levels. Conclusion. CNNs proficiently rule out proximal humerus fractures on plain radiographs. 
Despite rigorous training methodology based on CT imaging with multi-rater consensus to serve as the reference standard, artificial intelligence-driven classification is insufficient for clinical implementation. The CNN exhibited poor diagnostic ability to detect greater tuberosity displacement ≥ 1 cm and failed to identify NSAs ≤ 100°, shaft translations, or articular fractures. Cite this article: Bone Joint J 2024;106-B(11):1348–1360
