Bone & Joint Research
Vol. 12, Issue 7 | Pages 447 - 454
10 Jul 2023
Lisacek-Kiosoglous AB Powling AS Fontalis A Gabr A Mazomenos E Haddad FS

The use of artificial intelligence (AI) is rapidly growing across many domains, of which the medical field is no exception. AI is an umbrella term defining the practical application of algorithms to generate useful output, without the need of human cognition. Owing to the expanding volume of patient information collected, known as ‘big data’, AI is showing promise as a useful tool in healthcare research and across all aspects of patient care pathways. Practical applications in orthopaedic surgery include: diagnostics, such as fracture recognition and tumour detection; predictive models of clinical and patient-reported outcome measures, such as calculating mortality rates and length of hospital stay; and real-time rehabilitation monitoring and surgical training. However, clinicians should remain cognizant of AI’s limitations, as the development of robust reporting and validation frameworks is of paramount importance to prevent avoidable errors and biases. The aim of this review article is to provide a comprehensive understanding of AI and its subfields, as well as to delineate its existing clinical applications in trauma and orthopaedic surgery. Furthermore, this narrative review expands upon the limitations of AI and future direction. Cite this article: Bone Joint Res 2023;12(7):447–454


The Bone & Joint Journal
Vol. 104-B, Issue 12 | Pages 1292 - 1303
1 Dec 2022
Polisetty TS Jain S Pang M Karnuta JM Vigdorchik JM Nawabi DH Wyles CC Ramkumar PN

Literature surrounding artificial intelligence (AI)-related applications for hip and knee arthroplasty has proliferated. However, meaningful advances that fundamentally transform the practice and delivery of joint arthroplasty are yet to be realized, despite the broad range of applications as we continue to search for meaningful and appropriate use of AI. AI literature in hip and knee arthroplasty between 2018 and 2021 regarding image-based analyses, value-based care, remote patient monitoring, and augmented reality was reviewed. Concerns surrounding meaningful use and appropriate methodological approaches of AI in joint arthroplasty research are summarized. Of the 233 AI-related orthopaedics articles published, 178 (76%) constituted original research, while the rest consisted of editorials or reviews. A total of 52% of original AI-related research concerns hip and knee arthroplasty (n = 92), and a narrative review is described. Three studies were externally validated. Pitfalls surrounding present-day research include conflating vernacular (“AI/machine learning”), repackaging limited registry data, prematurely releasing internally validated prediction models, appraising model architecture instead of inputted data, withholding code, and evaluating studies using antiquated regression-based guidelines. While AI has been applied to a variety of hip and knee arthroplasty applications with limited clinical impact, the future remains promising if the question is meaningful, the methodology is rigorous and transparent, the data are rich, and the model is externally validated. Simple checkpoints for meaningful AI adoption include ensuring applications focus on: administrative support over clinical evaluation and management; necessity of the advanced model; and the novelty of the question being answered. Cite this article: Bone Joint J 2022;104-B(12):1292–1303


The Bone & Joint Journal
Vol. 104-B, Issue 8 | Pages 911 - 914
1 Aug 2022
Prijs J Liao Z Ashkani-Esfahani S Olczak J Gordon M Jayakumar P Jutte PC Jaarsma RL IJpma FFA Doornberg JN

Artificial intelligence (AI) is, in essence, the concept of ‘computer thinking’, encompassing methods that train computers to perform and learn from executing certain tasks, called machine learning, and methods to build intricate computer models that both learn and adapt, called complex neural networks. Computer vision is a function of AI by which machine learning and complex neural networks can be applied to enable computers to capture, analyze, and interpret information from clinical images and visual inputs. This annotation summarizes key considerations and future perspectives concerning computer vision, questioning the need for this technology (the ‘why’), the current applications (the ‘what’), and the approach to unlocking its full potential (the ‘how’). Cite this article: Bone Joint J 2022;104-B(8):911–914


The Bone & Joint Journal
Vol. 104-B, Issue 8 | Pages 929 - 937
1 Aug 2022
Gurung B Liu P Harris PDR Sagi A Field RE Sochart DH Tucker K Asopa V

Aims. Total hip arthroplasty (THA) and total knee arthroplasty (TKA) are common orthopaedic procedures requiring postoperative radiographs to confirm implant positioning and identify complications. Artificial intelligence (AI)-based image analysis has the potential to automate this postoperative surveillance. The aim of this study was to prepare a scoping review to investigate how AI is being used in the analysis of radiographs following THA and TKA, and how accurate these tools are. Methods. The Embase, MEDLINE, and PubMed libraries were systematically searched to identify relevant articles. The Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for scoping reviews and Arksey and O’Malley framework were followed. Study quality was assessed using a modified Methodological Index for Non-Randomized Studies tool. AI performance was reported using either the area under the curve (AUC) or accuracy. Results. Of the 455 studies identified, only 12 were suitable for inclusion. Nine reported implant identification and three described predicting risk of implant failure. Of the 12, three studies compared AI performance with orthopaedic surgeons. AI-based implant identification achieved AUC 0.992 to 1, and most algorithms reported an accuracy > 90%, using 550 to 320,000 training radiographs. AI prediction of dislocation risk post-THA, determined after five-year follow-up, was satisfactory (AUC 76.67; 8,500 training radiographs). Diagnosis of hip implant loosening was good (accuracy 88.3%; 420 training radiographs) and measurement of postoperative acetabular angles was comparable to humans (mean absolute difference 1.35° to 1.39°). However, 11 of the 12 studies had several methodological limitations introducing a high risk of bias. None of the studies were externally validated. Conclusion. These studies show that AI is promising. 
While it already has the ability to analyze images with significant precision, there is currently insufficient high-level evidence to support its widespread clinical use. Further research to design robust studies that follow standard reporting guidelines should be encouraged to develop AI models that could be easily translated into real-world conditions. Cite this article: Bone Joint J 2022;104-B(8):929–937


Bone & Joint Open
Vol. 4, Issue 9 | Pages 696 - 703
11 Sep 2023
Ormond MJ Clement ND Harder BG Farrow L Glester A

Aims. The principles of evidence-based medicine (EBM) are the foundation of modern medical practice. Surgeons are familiar with the commonly used statistical techniques to test hypotheses, summarize findings, and provide answers within a specified range of probability. Based on this knowledge, they are able to critically evaluate research before deciding whether or not to adopt the findings into practice. Recently, there has been an increased use of artificial intelligence (AI) to analyze information and derive findings in orthopaedic research. These techniques use a set of statistical tools that are increasingly complex and may be unfamiliar to the orthopaedic surgeon. It is unclear if this shift towards less familiar techniques is widely accepted in the orthopaedic community. This study aimed to provide an exploration of understanding and acceptance of AI use in research among orthopaedic surgeons. Methods. Semi-structured in-depth interviews were carried out on a sample of 12 orthopaedic surgeons. Inductive thematic analysis was used to identify key themes. Results. The four intersecting themes identified were: 1) validity in traditional research, 2) confusion around the definition of AI, 3) an inability to validate AI research, and 4) cautious optimism about AI research. Underpinning these themes is the notion of a validity heuristic that is strongly rooted in traditional research teaching and embedded in medical and surgical training. Conclusion. Research involving AI sometimes challenges the accepted traditional evidence-based framework. This can give rise to confusion among orthopaedic surgeons, who may be unable to confidently validate findings. In our study, the impact of this was mediated by cautious optimism based on an ingrained validity heuristic that orthopaedic surgeons develop through their medical training. Adding to this, the integration of AI into everyday life works to reduce suspicion and aid acceptance. 
Cite this article: Bone Jt Open 2023;4(9):696–703


Bone & Joint Open
Vol. 3, Issue 1 | Pages 93 - 97
10 Jan 2022
Kunze KN Orr M Krebs V Bhandari M Piuzzi NS

Artificial intelligence and machine-learning analytics have gained extensive popularity in recent years due to their clinically relevant applications. A wide range of proof-of-concept studies have demonstrated the ability of these analyses to personalize risk prediction, detect implant specifics from imaging, and monitor and assess patient movement and recovery. Though these applications are exciting and could potentially influence practice, it is imperative to understand when these analyses are indicated and where the data are derived from, prior to investing resources and confidence into the results and conclusions. In this article, we review the current benefits and potential limitations of machine learning for the orthopaedic surgeon, with a specific emphasis on data quality.


The Bone & Joint Journal
Vol. 101-B, Issue 12 | Pages 1476 - 1478
1 Dec 2019
Bayliss L Jones LD

This annotation briefly reviews the history of artificial intelligence and machine learning in health care and orthopaedics, and considers the role it will have in the future, particularly with reference to statistical analyses involving large datasets. Cite this article: Bone Joint J 2019;101-B:1476–1478


Bone & Joint Research
Vol. 13, Issue 4 | Pages 184 - 192
18 Apr 2024
Morita A Iida Y Inaba Y Tezuka T Kobayashi N Choe H Ike H Kawakami E

Aims. This study was designed to develop a model for predicting bone mineral density (BMD) loss of the femur after total hip arthroplasty (THA) using artificial intelligence (AI), and to identify factors that influence the prediction. Additionally, we virtually examined the efficacy of administration of bisphosphonate for cases with severe BMD loss based on the predictive model. Methods. The study included 538 joints that underwent primary THA. The patients were divided into groups using unsupervised time series clustering for five-year BMD loss of Gruen zone 7 postoperatively, and a machine-learning model to predict the BMD loss was developed. Additionally, the predictor for BMD loss was extracted using SHapley Additive exPlanations (SHAP). The patient-specific efficacy of bisphosphonate, which is the most important categorical predictor for BMD loss, was examined by calculating the change in predictive probability when hypothetically switching between the inclusion and exclusion of bisphosphonate. Results. Time series clustering allowed us to divide the patients into two groups, and the predictive factors were identified including patient- and operation-related factors. The area under the receiver operating characteristic (ROC) curve (AUC) for the BMD loss prediction averaged 0.734. Virtual administration of bisphosphonate showed on average 14% efficacy in preventing BMD loss of zone 7. Additionally, stem types and preoperative triglyceride (TG), creatinine (Cr), estimated glomerular filtration rate (eGFR), and creatine kinase (CK) showed significant association with the estimated patient-specific efficacy of bisphosphonate. Conclusion. Periprosthetic BMD loss after THA is predictable based on patient- and operation-related factors, and optimal prescription of bisphosphonate based on the prediction may prevent BMD loss. Cite this article: Bone Joint Res 2024;13(4):184–192
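
The "virtual administration" step described above, switching one input on and off and reading the change in predicted probability, can be illustrated with a minimal sketch. The abstract does not report the model or its parameters, so a toy logistic model stands in for it here; the feature names, coefficients, and the patient record are all hypothetical.

```python
import math

# Hypothetical coefficients for a toy logistic model. None of these values
# come from the study; they only make the counterfactual step concrete.
WEIGHTS = {"age": 0.03, "bmi": -0.02, "bisphosphonate": -0.9}
INTERCEPT = -1.0


def predict_bmd_loss(patient: dict) -> float:
    """Return the modelled probability of severe BMD loss for one patient."""
    z = INTERCEPT + sum(WEIGHTS[name] * patient[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def counterfactual_effect(patient: dict, feature: str = "bisphosphonate") -> float:
    """Probability with the feature set to 0 minus probability with it set to 1.

    All other inputs are held fixed, so the difference is the model's own
    estimate of the patient-specific effect of that feature.
    """
    without_drug = predict_bmd_loss({**patient, feature: 0})
    with_drug = predict_bmd_loss({**patient, feature: 1})
    return without_drug - with_drug


# Hypothetical patient record.
patient = {"age": 68, "bmi": 24.0, "bisphosphonate": 0}
effect = counterfactual_effect(patient)
print(f"Estimated reduction in predicted probability: {effect:.3f}")
```

A difference computed this way reflects only what the fitted model has learned; it is not evidence of a causal treatment effect.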


Bone & Joint Research
Vol. 12, Issue 8 | Pages 494 - 496
9 Aug 2023
Clement ND Simpson AHRW

Cite this article: Bone Joint Res 2023;12(8):494–496.


The Bone & Joint Journal
Vol. 103-B, Issue 9 | Pages 1442 - 1448
1 Sep 2021
McDonnell JM Evans SR McCarthy L Temperley H Waters C Ahern D Cunniffe G Morris S Synnott K Birch N Butler JS

In recent years, machine learning (ML) and artificial neural networks (ANNs), a particular subset of ML, have been adopted by various areas of healthcare. A number of diagnostic and prognostic algorithms have been designed and implemented across a range of orthopaedic sub-specialties to date, with many positive results. However, the methodology of many of these studies is flawed, and few compare the use of ML with the current approach in clinical practice. Spinal surgery has advanced rapidly over the past three decades, particularly in the areas of implant technology, advanced surgical techniques, biologics, and enhanced recovery protocols. It is therefore regarded as an innovative field. Inevitably, spinal surgeons will wish to incorporate ML into their practice should models prove effective in diagnostic or prognostic terms. The purpose of this article is to review published studies that describe the application of neural networks to spinal surgery and which actively compare ANN models to contemporary clinical standards allowing evaluation of their efficacy, accuracy, and relatability. It also explores some of the limitations of the technology, which act to constrain the widespread adoption of neural networks for diagnostic and prognostic use in spinal care. Finally, it describes the necessary considerations should institutions wish to incorporate ANNs into their practices. In doing so, the aim of this review is to provide a practical approach for spinal surgeons to understand the relevant aspects of neural networks.

Cite this article: Bone Joint J 2021;103-B(9):1442–1448.


Orthopaedic Proceedings
Vol. 106-B, Issue SUPP_1 | Pages 102 - 102
2 Jan 2024
Ambrosio L
Full Access

In recent decades, the use of artificial intelligence (AI) has been increasingly investigated in intervertebral disc degeneration (IDD) and chronic low back pain (LBP) research. To date, several cutting-edge AI-based technologies, such as computer vision, computer-assisted diagnosis, decision support systems, and natural language processing, have been used to optimize LBP prevention, diagnosis, and treatment. This talk will provide an outline of contemporary AI applications in IDD and LBP research, with particular attention to current knowledge gaps and promising innovative tools.


Orthopaedic Proceedings
Vol. 105-B, Issue SUPP_11 | Pages 31 - 31
7 Jun 2023
Asopa V Womersley A Wehbe J Spence C Harris P Sochart D Tucker K Field R
Full Access

Over 8000 total hip arthroplasties (THA) in the UK were revised in 2019, half for aseptic loosening. It is believed that Artificial Intelligence (AI) could identify or predict failing THA and result in early recognition of poorly performing implants and reduce patient suffering. The aim of this study is to investigate whether Artificial Intelligence based machine learning (ML) / Deep Learning (DL) techniques can train an algorithm to identify and/or predict failing uncemented THA. Consent was sought from patients followed up in a single design, uncemented THA implant surveillance study (2010–2021). Oxford hip scores and radiographs were collected at yearly intervals. Radiographs were analysed by 3 observers for presence of markers of implant loosening/failure: periprosthetic lucency, cortical hypertrophy, and pedestal formation. DL using the RGB ResNet 18 model, with images entered chronologically, was trained according to revision status and radiographic features. Data augmentation and cross validation were used to increase the available training data, reduce bias, and improve verification of results. 184 patients consented to inclusion. 6 (3.2%) patients were revised for aseptic loosening. 2097 radiographs were analysed: 21 (11.4%) patients had three radiographic features of failure. 166 patients were used for ML algorithm testing of 3 scenarios to detect those who were revised. 1) The use of revision as an end point was associated with increased variability in accuracy. The area under the curve (AUC) was 23–97%. 2) Using 2/3 radiographic features associated with failure was associated with improved results, AUC: 75–100%. 3) Using 3/3 radiographic features, had less variability, reduced AUC of 73%, but 5/6 patients who had been revised were identified (total 66 identified). 
The best algorithm identified the greatest number of revised hips (5/6), predicting failure 2–8 years before revision, before all radiographic features were visible and before a significant fall in the Oxford Hip Score. True positive: 0.77; false positive: 0.29. ML algorithms can identify failing THA before features are visible on radiographs or before PROM scores deteriorate. This is an important finding that could allow failing THA to be identified early.


Orthopaedic Proceedings
Vol. 103-B, Issue SUPP_3 | Pages 30 - 30
1 Mar 2021
Gerges M Eng H Chhina H Cooper A
Full Access

Bone age is a radiographical assessment used in pediatric medicine due to its relative objectivity in determining biological maturity compared to chronological age and size.1 Currently, Greulich and Pyle (GP) is one of the most common methods used to determine bone age from hand radiographs.2–4 In recent years, new methods were developed to increase the efficiency in bone age analysis like the shorthand bone age (SBA) and the automated artificial intelligence algorithms. The purpose of this study is to evaluate the accuracy and reliability of these two methods and examine if the reduction in analysis time compromises their accuracy. Two hundred thirteen males and 213 females were selected. Each participant had their bone age determined by two separate raters using the GP (M1) and SBA methods (M2). Three weeks later, the two raters repeated the analysis of the radiographs. The raters timed themselves using an online stopwatch while analyzing the radiograph on a computer screen. De-identified radiographs were securely uploaded to an automated algorithm developed by a group of radiologists in Toronto. The gold standard was determined to be the radiology report attached to each radiograph, written by experienced radiologists using GP (M1). For intra-rater variability, intraclass correlation analysis between trial 1 (T1) and trial 2 (T2) for each rater and method was performed. For inter-rater variability, intraclass correlation was performed between rater 1 (R1) and rater 2 (R2) for each method and trial. Intraclass correlation between each method and the gold standard fell within the 0.8–0.9 range, highlighting significant agreement. Most of the comparisons showed a statistically significant difference between the two new methods and the gold standard; however it may not be clinically significant as it ranges between 0.25–0.5 years. 
A bone age is considered clinically abnormal if it falls outside 2 standard deviations of the chronological age; standard deviations are calculated and provided in the GP atlas.6–8 For a 10-year-old female, 2 standard deviations constitute 21.6 months, which far exceeds the difference reported here between SBA, the automated algorithm and the gold standard. The median time for completion using the GP method was 21.83 seconds for rater 1 and 9.30 seconds for rater 2. In comparison, SBA required a median time of 7 seconds for rater 1 and 5 seconds for rater 2. The automated method had no time constraint as bone age was determined immediately upon radiograph upload. The correlation between the two trials in each method and rater (i.e. R1M1T1 vs R1M1T2) was excellent (κ = 0.9–1), confirming the reliability of the two new methods. Similarly, the correlation between the two raters in each method and trial (i.e. R1M1T1 vs R2M1T1) fell within the 0.9–1 range. This indicates limited variability between raters who may use these two methods. The shorthand bone age method and an artificial intelligence automated algorithm produced values that are in agreement with the gold standard Greulich and Pyle, while reducing analysis time and maintaining high inter-rater and intra-rater reliability.
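
The clinical-significance argument above rests on a unit conversion: the reported differences are in years, while the abnormality threshold is in months. A short sketch of that arithmetic follows, using only the figures quoted in the abstract; the function and variable names are illustrative.

```python
MONTHS_PER_YEAR = 12


def years_to_months(years: float) -> float:
    """Convert a duration in years to months."""
    return years * MONTHS_PER_YEAR


# Figures quoted in the abstract.
two_sd_months = 21.6            # 2 SD for a 10-year-old female (GP atlas)
diff_range_years = (0.25, 0.5)  # new methods versus the gold standard

diff_range_months = tuple(years_to_months(y) for y in diff_range_years)
one_sd_months = two_sd_months / 2


def within_clinical_threshold(diff_months: float,
                              threshold_months: float = two_sd_months) -> bool:
    """True when a difference is smaller than the 2 SD abnormality threshold."""
    return diff_months < threshold_months


print(diff_range_months, one_sd_months)
print(all(within_clinical_threshold(d) for d in diff_range_months))
```

On these figures the differences amount to 3 to 6 months, which is below even a single standard deviation for that age group.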


Orthopaedic Proceedings
Vol. 106-B, Issue SUPP_6 | Pages 26 - 26
2 May 2024
Al-Naib M Afzal I Radha S
Full Access

As patient data continues to grow, the importance of efficient and precise analysis cannot be overstated. The employment of Generative Artificial Intelligence (AI), specifically ChatGPT-4, in the realm of medical data interpretation has been on the rise. However, its effectiveness in comparison to manual data analysis has been insufficiently investigated. This quality improvement project aimed to evaluate the accuracy and time-efficiency of Generative AI (GPT-4) against manual data interpretation within extensive datasets pertaining to patients with orthopaedic injuries. A dataset containing details of 6,562 orthopaedic trauma patients admitted to a district general hospital over a span of two years was reviewed. Two researchers operated independently: one utilised GPT-4 for insights via prompts, while the other manually examined the identical dataset employing Microsoft Excel and IBM® SPSS® software. Both were blinded to each other's procedures and outcomes. Each researcher answered 20 questions based on the dataset including injury details, age groups, injury specifics, activity trends and the duration taken to assess the data. Upon comparison, both GPT-4 and the manual researcher achieved consistent results for 19 out of the 20 questions (95% accuracy). After a subsequent review and refined prompts (prompt engineering) to GPT-4, the answer to the final question aligned with the manual researcher's findings. GPT-4 required just 30 minutes, a stark contrast to the manual researcher's 9-hour analytical duration. This quality improvement project emphasises the transformative potential of Generative AI in the domain of medical data analysis. GPT-4 not only paralleled the accuracy of manual analysis but also achieved this in significantly less time. For optimally accurate results, data analysis by AI can be enhanced through human oversight. Adopting AI-driven approaches, particularly in orthopaedic data interpretation, can enhance efficiency and ultimately improve patient care. We recommend future investigations on larger and more varied datasets to reaffirm these outcomes.


The Bone & Joint Journal
Vol. 105-B, Issue 6 | Pages 585 - 586
17 Apr 2023
Leopold SS Haddad FS Sandell LJ Swiontkowski M


Bone & Joint 360
Vol. 12, Issue 4 | Pages 3 - 4
1 Aug 2023
Ollivere B


INTRODUCTION. Quality monitoring is increasingly important to support and assure sustainability of orthopaedic practice. Many surgeons in a non-academic setting lack the resources to accurately monitor quality of care. Widespread use of electronic medical records (EMR) provides easier access to medical information and facilitates its analysis. However, manual review of EMRs is inefficient and costly. Artificial Intelligence (AI) software has allowed for development of automated search algorithms for extracting relevant complications from EMRs. We questioned whether an AI-supported algorithm could be used to provide accurate feedback on the quality of care following Total Hip Arthroplasty (THA) in a high-volume, non-academic setting. METHODS. 532 consecutive patients underwent 613 THA between January 1st and December 31st, 2017. Patients were prospectively followed pre-op and at 6 weeks, 3 months and 1 year. They were seen by the surgeon who created clinical notes and reported every adverse event. A random derivation cohort (100 patients, 115 hips) was used to determine accuracy. The algorithm was compared to manual extraction to validate performance in raw data extraction. The full cohort (532 patients, 613 hips) was used to determine its recall, precision and F-value. RESULTS. The algorithm had an accuracy value of 95.0%, compared to 94.5% for manual review (p=0.69). Recall of 96.0% was achieved with precision of 88.0% and F-measure of 0.85 for all adverse events. Recovery of 80.6% of patients was completely uneventful. Re-intervention was required in 1.3% of cases and 18.1% had a ‘transient’ event such as low back pain. The infection and dislocation rate was 0.3%. CONCLUSION. An AI-supported search algorithm can analyze and interpret large quantities of EMRs at greater speed but with performance comparable to manual review. Using the program, new clinical information surfaced. 18.1% of patients can be expected to have a ‘transient’ problem following a THA procedure.
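
For readers less familiar with the retrieval metrics quoted above, the sketch below shows the standard textbook definitions of accuracy, precision, recall and the F1 measure computed from a confusion matrix. The counts are hypothetical and are not taken from the study, and the abstract does not state how its F-measure was aggregated across adverse-event types, so this is not a reconstruction of the reported figures.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Counts of algorithm-flagged events against the manual reference."""
    tp: int  # adverse events correctly flagged
    fp: int  # events flagged that were not present
    fn: int  # adverse events the algorithm missed
    tn: int  # uneventful records correctly left unflagged


def precision(c: Confusion) -> float:
    return c.tp / (c.tp + c.fp)


def recall(c: Confusion) -> float:
    return c.tp / (c.tp + c.fn)


def f1(c: Confusion) -> float:
    """Harmonic mean of precision and recall."""
    p, r = precision(c), recall(c)
    return 2 * p * r / (p + r)


def accuracy(c: Confusion) -> float:
    return (c.tp + c.tn) / (c.tp + c.fp + c.fn + c.tn)


# Hypothetical counts, chosen only to exercise the formulas.
counts = Confusion(tp=40, fp=10, fn=5, tn=45)
print(f"precision={precision(counts):.3f} recall={recall(counts):.3f} "
      f"F1={f1(counts):.3f} accuracy={accuracy(counts):.3f}")
```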


Bone & Joint Research
Vol. 7, Issue 3 | Pages 223 - 225
1 Mar 2018
Jones LD Golan D Hanna SA Ramachandran M


Orthopaedic Proceedings
Vol. 103-B, Issue SUPP_13 | Pages 125 - 125
1 Nov 2021
Sánchez G Cina A Giorgi P Schiro G Gueorguiev B Alini M Varga P Galbusera F Gallazzi E
Full Access

Introduction and Objective

Up to 30% of thoracolumbar (TL) fractures are missed in the emergency room. Failure to identify these fractures can result in neurological injuries in up to 51% of cases. Obtaining sagittal and anteroposterior radiographs of the TL spine is the first diagnostic step when a traumatic injury is suspected. In most cases, CT and/or MRI are needed to confirm the diagnosis. These are time- and resource-consuming. Thus, reliably detecting vertebral fractures in simple radiographic projections would have a significant impact. We aim to develop and validate a deep learning tool capable of detecting TL fractures on lateral radiographs of the spine. The clinical implementation of this tool is anticipated to reduce the rate of missed vertebral fractures in emergency rooms.

Materials and Methods

We collected sagittal radiographs, CT and MRI scans of the TL spine of 362 patients exhibiting traumatic vertebral fractures. Cases were excluded when CT and/or MRI were not available. The reference standard was set by an expert group of three spine surgeons who conjointly annotated (fracture/no-fracture and AO Classification) the sagittal radiographs of 171 cases. CT and/or MRI were used to confirm the presence and type of the fracture in all cases. 302 cropped vertebral images were labelled “fracture” and 328 “no fracture”. After augmentation, this dataset was used to train, validate, and test deep learning classifiers based on the ResNet18 and VGG16 architectures. To ensure that the model's prediction was based on the correct identification of the fracture zone, an Activation Map analysis was conducted.


Orthopaedic Proceedings
Vol. 105-B, Issue SUPP_2 | Pages 39 - 39
10 Feb 2023
Lutter C Grupp T Mittelmeier W Selig M Grover P Dreischarf M Rose G Bien T
Full Access

Polyethylene wear represents a significant risk factor for the long-term success of knee arthroplasty [1]. This work aimed to develop and validate in vivo an automated algorithm for accurate and precise AI-based wear measurement in knee arthroplasty using clinical AP radiographs for scientifically meaningful multi-centre studies.

Twenty postoperative radiographs (knee joint AP in standing position) after knee arthroplasty were analysed using the novel algorithm. A convolutional neural network-based segmentation is used to localize the implant components on the X-Ray, and a 2D-3D registration of the CAD implant models precisely calculates the three-dimensional position and orientation of the implants in the joint at the time of acquisition. From this, the minimal distance between the involved implant components is determined, and its postoperative change over time enables the determination of wear in the radiographs.

The measured minimum inlay height of 335 unloaded inlays, excluding weight-induced deformation, served as ground truth for validation and was compared to the algorithmically calculated component distances from 20 radiographs.

With an average weight of 94 kg in the studied TKA patient cohort, it was determined that an average inlay height of 6.160 mm is expected in the patient. Based on the radiographs, the algorithm calculated a minimum component distance of 6.158 mm (SD = 81 µm), which deviated by 2 µm in comparison to the expected inlay height.

An automated method was presented that allows accurate and precise determination of the inlay height and subsequently the wear in knee arthroplasty based on a clinical radiograph and the CAD models. Precision and accuracy are comparable to the current gold standard RSA [2], but without relying on special radiographic setups. The developed method can therefore be used to objectively investigate novel implant materials with meaningful clinical cohorts, thus improving the quality of patient care.


Orthopaedic Proceedings
Vol. 106-B, Issue SUPP_6 | Pages 59 - 59
2 May 2024
Adla SR Ameer A Silva MD Unnithan A
Full Access

Arthroplasties are widely performed to improve mobility and quality of life for patients with symptomatic knee or hip osteoarthritis. With increasing rates of total joint replacement in the United Kingdom, predicting length of stay (LOS) is vital for hospitals to control costs, manage resources, and prevent postoperative complications. A longer LOS has been shown to negatively affect the quality of care, outcomes and patient satisfaction. Thus, predicting LOS enables us to make full use of medical resources.

Clinical characteristics were retrospectively collected from 1,303 patients who received TKA and THR. A total of 21 variables were included to develop predictive models for LOS using multiple machine learning (ML) algorithms, including Random Forest Classifier (RFC), K-Nearest Neighbour (KNN), Extreme Gradient Boost (XgBoost), and Naïve Bayes (NB). These models were evaluated by the receiver operating characteristic (ROC) curve for predictive performance. A feature selection approach was used to identify optimal predictive factors. Based on the ROC results on the training set, the XgBoost algorithm was selected for application to the test set.

The areas under the ROC curve (AUCs) of the 4 models ranged from 0.730 to 0.966, where higher AUC values generally indicate better predictive performance. All the ML-based models performed better than conventional statistical methods in ROC curves. The XgBoost algorithm with 21 variables was identified as the best predictive model. The feature selection indicated the top six predictors: Age, Operation Duration, Primary Procedure, BMI, creatinine and Month of Surgery.
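
As a reminder of what the AUC values above measure, the sketch below computes the area under the ROC curve as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one, with ties counted as half. The labels and scores are hypothetical; the abstract does not state how LOS was dichotomised, so "1" here simply stands for whichever class the model was trained to predict.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Hypothetical model outputs for six patients.
labels = [1, 0, 1, 0, 1, 0]
scores = [0.9, 0.3, 0.6, 0.7, 0.8, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

The pairwise form is quadratic in the number of cases and is meant for illustration; library implementations use a sort-based calculation instead.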

By analysing clinical characteristics, it is feasible to develop ML-based models for the preoperative prediction of LOS for patients who received TKA and THR, and the XgBoost algorithm performed the best, in terms of accuracy of predictive performance. As this model was originally crafted at Ashford and St. Peters Hospital, we have naturally named it as THE ASHFORD OUTCOME.


Background

Dislocation is a common complication following total hip arthroplasty (THA), and accounts for a high percentage of subsequent revisions. The purpose of this study was to develop a convolutional neural network (CNN) model to identify patients at high risk for dislocation based on postoperative anteroposterior (AP) pelvis radiographs.

Methods

We retrospectively evaluated radiographs for a cohort of 13,970 primary THAs with 374 dislocations over 5 years of follow-up. Overall, 1,490 radiographs from dislocated and 91,094 from non-dislocated THAs were included in the analysis. A CNN object detection model (YOLO-V3) was trained to crop the images by centering on the femoral head. A ResNet18 classifier was trained to predict subsequent hip dislocation from the cropped imaging. The ResNet18 classifier was initialized with ImageNet weights and trained using FastAI (V1.0) running on PyTorch. The training was run for 15 epochs using ten-fold cross validation, data oversampling and augmentation.
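
With roughly 1,490 radiographs from dislocated hips against 91,094 from non-dislocated hips, the classes are heavily imbalanced, which is presumably why oversampling was combined with cross-validation. The sketch below shows one common way to combine the two: stratified folds, with oversampling applied only to the training portion of each split. Whether the study proceeded this way is an assumption, as is every name here; the toy labels are synthetic, and a real pipeline would also need to keep radiographs from the same patient in the same fold.

```python
import random


def stratified_folds(labels, k, seed=0):
    """Assign indices to k folds so each fold keeps the class proportions."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds


def oversample(indices, labels, seed=0):
    """Resample minority classes with replacement up to the majority size."""
    rng = random.Random(seed)
    by_class = {}
    for i in indices:
        by_class.setdefault(labels[i], []).append(i)
    target = max(len(members) for members in by_class.values())
    balanced = []
    for members in by_class.values():
        balanced.extend(members)
        balanced.extend(rng.choices(members, k=target - len(members)))
    return balanced


# Synthetic labels: 20 positives (dislocation) and 180 negatives.
labels = [1] * 20 + [0] * 180
folds = stratified_folds(labels, k=10)

splits = []
for held_out in range(10):
    val_idx = folds[held_out]
    train_idx = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
    # Oversample the training indices only; the validation fold is untouched.
    splits.append((oversample(train_idx, labels), val_idx))
```

Oversampling after the split keeps duplicated minority cases out of the validation fold, which would otherwise inflate the measured performance.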


Orthopaedic Proceedings
Vol. 90-B, Issue SUPP_I | Pages 100 - 101
1 Mar 2008
Wu H Poncet P Harder J Cheriet F Labelle H Zernicke R Ronsky J
Full Access

The pathogenesis of scoliosis progression remains poorly understood. Seventy-two subject data sets, consisting of four successive values of Cobb angle and lateral deviations at apices for six- and twelve-month intervals in the coronal plane, were used to train and test an artificial neural network (ANN) to predict spinal deformity progression. The accuracies of the trained ANN (3-4-1) for training and testing data were within 3.64° (±2.58°) and 4.40° (±1.86°) of Cobb angles, and within 3.59 (±3.96) mm and 3.98 (±3.41) mm of lateral deviations, respectively. The adapted technique for predicting the scoliosis deformity progression has promising clinical applications.

Scoliosis is a common and poorly understood three-dimensional spinal deformity. The purpose of the study is to predict scoliosis progression at six- and twelve-month intervals in the future using successive spinal indices with an artificial neural network (ANN).

The adapted ANN technique enables earlier detection of scoliosis progression with high accuracy. Improved prediction of scoliosis progression will impact bracing or surgical treatment decisions, and may decrease hazardous X-ray exposure.

Seventy-two data sets from adolescent idiopathic scoliosis subjects recruited at the Alberta Children’s Hospital were used in this study. Data sets composed of four successive values of Cobb angles and lateral deviations at apices for six- and twelve-month intervals (coronal plane) were extracted to train and test a specific ANN for predicting scoliosis progression.

Progression patterns in Cobb angles (n = 10) and lateral deviations (n = 8) were successfully identified. The accuracies of the trained ANN (3-4-1) with the training and testing data sets were 3.64° (±2.58°) and 4.40° (±1.86°) of Cobb angles, 3.59 (±3.96) mm and 3.98 (±3.41) mm of lateral deviations, respectively. These results are in close agreement with those using cubic spline extrapolation techniques (3.49° ± 1.85° and 3.31 ± 4.22 mm) and adaptive neuro-fuzzy inference system (3.92° ±3.53° and 3.37 ±3.95 mm) for the same testing data.

ANN can be a promising technique for prediction of scoliosis progression with substantial improvements in accuracy over current techniques, leading to potentially important implications for scoliosis monitoring and treatment decisions.
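The "3-4-1" label is read here as three inputs, four hidden units, and one output, which is an assumption, since the abstract does not spell out the architecture. Under that reading, a forward pass through such a network can be sketched as follows. The tanh activation, the weights, and the input values are all hypothetical placeholders and not the trained model.

```python
import math


def forward_3_4_1(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass of a 3-input, 4-hidden-unit, 1-output network.

    x:        three input values (e.g. three earlier measurements, assumed)
    w_hidden: 4 rows of 3 weights each
    b_hidden: 4 hidden biases
    w_out:    4 output weights
    b_out:    output bias
    Hidden units use tanh; the output is linear. Both choices are assumptions.
    """
    if len(x) != 3 or len(w_hidden) != 4 or len(b_hidden) != 4 or len(w_out) != 4:
        raise ValueError("expected a 3-4-1 layout")

    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


if __name__ == "__main__":
    # Placeholder weights; a real model would learn these from the 72 data sets.
    w_h = [[0.1, 0.2, 0.3], [0.0, -0.1, 0.2], [0.3, 0.1, 0.0], [-0.2, 0.2, 0.1]]
    b_h = [0.0, 0.1, -0.1, 0.0]
    w_o = [0.5, -0.4, 0.3, 0.2]
    print(forward_3_4_1([0.20, 0.22, 0.25], w_h, b_h, w_o, 0.05))
```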

Funding: AHFMR, CIHR, Fraternal Order of Eagles, NSERC, GEOIDE.


The Bone & Joint Journal
Vol. 106-B, Issue 7 | Pages 688 - 695
1 Jul 2024
Farrow L Zhong M Anderson L

Aims. To examine whether natural language processing (NLP) using a clinically based large language model (LLM) could be used to predict patient selection for total hip or total knee arthroplasty (THA/TKA) from routinely available free-text radiology reports. Methods. Data pre-processing and analyses were conducted according to the Artificial intelligence to Revolutionize the patient Care pathway in Hip and knEe aRthroplastY (ARCHERY) project protocol. This included use of de-identified Scottish regional clinical data of patients referred for consideration of THA/TKA, held in a secure data environment designed for artificial intelligence (AI) inference. Only preoperative radiology reports were included. NLP algorithms were based on the freely available GatorTron model, an LLM trained on over 82 billion words of de-identified clinical text. Two inference tasks were performed: assessment after model fine-tuning (50 epochs and three cycles of k-fold cross-validation), and external validation. Results. For THA, there were 5,558 patient radiology reports included, of which 4,137 were used for model training and testing, and 1,421 for external validation. Following training, model performance demonstrated average (mean across three folds) accuracy, F1 score, and area under the receiver operating curve (AUROC) values of 0.850 (95% confidence interval (CI) 0.833 to 0.867), 0.813 (95% CI 0.785 to 0.841), and 0.847 (95% CI 0.822 to 0.872), respectively. For TKA, 7,457 patient radiology reports were included, with 3,478 used for model training and testing, and 3,152 for external validation. Performance metrics included accuracy, F1 score, and AUROC values of 0.757 (95% CI 0.702 to 0.811), 0.543 (95% CI 0.479 to 0.607), and 0.717 (95% CI 0.657 to 0.778), respectively. There was a notable deterioration in performance on external validation in both cohorts. Conclusion. The use of routinely available preoperative radiology reports provides promising potential to help screen suitable candidates for THA, but not for TKA. The external validation results demonstrate the importance of further model testing and training when confronted with new clinical cohorts. Cite this article: Bone Joint J 2024;106-B(7):688–695
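Accuracy and F1 score can diverge, as the TKA figures above show (accuracy 0.757 against F1 0.543), because F1 ignores true negatives. A minimal sketch of how both are derived from a confusion matrix follows; the counts in the example are invented for illustration and are not the study's data.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("confusion matrix is empty")

    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical counts: many true negatives lift accuracy but not F1.
    print(classification_metrics(tp=30, fp=25, fn=25, tn=120))
```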


Bone & Joint Open
Vol. 5, Issue 8 | Pages 671 - 680
14 Aug 2024
Fontalis A Zhao B Putzeys P Mancino F Zhang S Vanspauwen T Glod F Plastow R Mazomenos E Haddad FS

Aims. Precise implant positioning, tailored to individual spinopelvic biomechanics and phenotype, is paramount for stability in total hip arthroplasty (THA). Despite a few studies on instability prediction, there is a notable gap in research utilizing artificial intelligence (AI). The objective of our pilot study was to evaluate the feasibility of developing an AI algorithm tailored to individual spinopelvic mechanics and patient phenotype for predicting impingement. Methods. This international, multicentre prospective cohort study across two centres encompassed 157 adults undergoing primary robotic arm-assisted THA. Impingement during specific flexion and extension stances was identified using the virtual range of motion (ROM) tool of the robotic software. The primary AI model, the Light Gradient-Boosting Machine (LGBM), used tabular data to predict impingement presence, direction (flexion or extension), and type. A secondary model integrating tabular data with plain anteroposterior pelvis radiographs was evaluated to assess for any potential enhancement in prediction accuracy. Results. We identified nine predictors from an analysis of baseline spinopelvic characteristics and surgical planning parameters. Using fivefold cross-validation, the LGBM achieved 70.2% impingement prediction accuracy. With impingement data, the LGBM estimated direction with 85% accuracy, while the support vector machine (SVM) determined impingement type with 72.9% accuracy. After integrating imaging data with a multilayer perceptron (tabular) and a convolutional neural network (radiograph), the LGBM’s prediction was 68.1%. Both combined and LGBM-only had similar impingement direction prediction rates (around 84.5%). Conclusion. This study is a pioneering effort in leveraging AI for impingement prediction in THA, utilizing a comprehensive, real-world clinical dataset. Our machine-learning algorithm demonstrated promising accuracy in predicting impingement, its type, and direction. 
While the addition of imaging data to our deep-learning algorithm did not boost accuracy, the potential for refined annotations, such as landmark markings, offers avenues for future enhancement. Prior to clinical integration, external validation and larger-scale testing of this algorithm are essential. Cite this article: Bone Jt Open 2024;5(8):671–680
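Fivefold cross-validation, as used above, partitions the cohort into five parts and holds each part out once for testing. The sketch below shows only the index bookkeeping for a cohort of 157 patients. It does not shuffle or stratify, which a real analysis typically would, and the function name is an assumption.

```python
def k_fold_indices(n_samples, k=5):
    """Split range(n_samples) into k (train, test) index pairs of near-equal size."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")

    indices = list(range(n_samples))
    # The first (n_samples % k) folds get one extra sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]

    folds = []
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, test))
        start += size
    return folds


if __name__ == "__main__":
    for fold_number, (train, test) in enumerate(k_fold_indices(157, k=5), start=1):
        print(fold_number, len(train), len(test))
```

The reported accuracy would then be the mean of the per-fold accuracies, each measured on patients the model did not see during training for that fold.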


The Bone & Joint Journal
Vol. 103-B, Issue 12 | Pages 1754 - 1758
1 Dec 2021
Farrow L Zhong M Ashcroft GP Anderson L Meek RMD

There is increasing popularity in the use of artificial intelligence and machine-learning techniques to provide diagnostic and prognostic models for various aspects of Trauma & Orthopaedic surgery. However, correct interpretation of these models is difficult for those without specific knowledge of computing or health data science methodology. Lack of current reporting standards leads to the potential for significant heterogeneity in the design and quality of published studies. We provide an overview of machine-learning techniques for the lay individual, including key terminology and best practice reporting guidelines. Cite this article: Bone Joint J 2021;103-B(12):1754–1758


The Bone & Joint Journal
Vol. 105-B, Issue 6 | Pages 587 - 589
1 Jun 2023
Kunze KN Jang SJ Fullerton MA Vigdorchik JM Haddad FS

The OpenAI chatbot ChatGPT is an artificial intelligence (AI) application that uses state-of-the-art language processing AI. It can perform a vast number of tasks, from writing poetry and explaining complex quantum mechanics, to translating language and writing research articles with a human-like understanding and legitimacy. Since its initial release to the public in November 2022, ChatGPT has garnered considerable attention due to its ability to mimic the patterns of human language, and it has attracted billion-dollar investments from Microsoft and PricewaterhouseCoopers. The scope of ChatGPT and other large language models appears infinite, but there are several important limitations. This editorial provides an introduction to the basic functionality of ChatGPT and other large language models, their current applications and limitations, and the associated implications for clinical practice and research. Cite this article: Bone Joint J 2023;105-B(6):587–589


Bone & Joint Open
Vol. 3, Issue 11 | Pages 877 - 884
14 Nov 2022
Archer H Reine S Alshaikhsalama A Wells J Kohli A Vazquez L Hummer A DiFranco MD Ljuhar R Xi Y Chhabra A

Aims. Hip dysplasia (HD) leads to premature osteoarthritis. Timely detection and correction of HD has been shown to improve pain, functional status, and hip longevity. Several time-consuming radiological measurements are currently used to confirm HD. An artificial intelligence (AI) software named HIPPO automatically locates anatomical landmarks on anteroposterior pelvis radiographs and performs the needed measurements. The primary aim of this study was to assess the reliability of this tool as compared to multi-reader evaluation in clinically proven cases of adult HD. The secondary aims were to assess the time savings achieved and evaluate inter-reader assessment. Methods. A consecutive preoperative sample of 130 HD patients (256 hips) was used. This cohort included 82.3% females (n = 107) and 17.7% males (n = 23) with median patient age of 28.6 years (interquartile range (IQR) 22.5 to 37.2). Three trained readers’ measurements were compared to AI outputs of lateral centre-edge angle (LCEA), caput-collum-diaphyseal (CCD) angle, pelvic obliquity, Tönnis angle, Sharp’s angle, and femoral head coverage. Intraclass correlation coefficients (ICC) and Bland-Altman analyses were obtained. Results. Among 256 hips with AI outputs, all six hip AI measurements were successfully obtained. The AI-reader correlations were generally good (ICC 0.60 to 0.74) to excellent (ICC > 0.75). There was lower agreement for CCD angle measurement. Most widely used measurements for HD diagnosis (LCEA and Tönnis angle) demonstrated good to excellent inter-method reliability (ICC 0.71 to 0.86 and 0.82 to 0.90, respectively). The median reading time for the three readers and AI was 212 (IQR 197 to 230), 131 (IQR 126 to 147), 734 (IQR 690 to 786), and 41 (IQR 38 to 44) seconds, respectively. Conclusion. This study showed that AI-based software demonstrated reliable radiological assessment of patients with HD with significant interpretation-related time savings. 
Cite this article: Bone Jt Open 2022;3(11):877–884
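Bland-Altman analysis, mentioned in the Methods above, summarizes agreement between two measurement methods by the mean difference (bias) and the 95% limits of agreement, conventionally bias ± 1.96 standard deviations of the differences. A minimal sketch follows; the paired angle values are made up for illustration.

```python
import statistics


def bland_altman_limits(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements.

    Limits of agreement are bias +/- 1.96 * sample SD of the paired differences.
    """
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need at least two paired measurements")

    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


if __name__ == "__main__":
    # Hypothetical angle measurements in degrees: software versus a human reader.
    software = [18.2, 21.5, 15.0, 24.1, 19.8]
    reader = [17.5, 22.0, 14.2, 25.0, 19.1]
    print(bland_altman_limits(software, reader))
```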


Bone & Joint Open
Vol. 3, Issue 10 | Pages 767 - 776
5 Oct 2022
Jang SJ Kunze KN Brilliant ZR Henson M Mayman DJ Jerabek SA Vigdorchik JM Sculco PK

Aims. Accurate identification of the ankle joint centre is critical for estimating tibial coronal alignment in total knee arthroplasty (TKA). The purpose of the current study was to leverage artificial intelligence (AI) to determine the accuracy and effect of using different radiological anatomical landmarks to quantify mechanical alignment in relation to a traditionally defined radiological ankle centre. Methods. Patients with full-limb radiographs from the Osteoarthritis Initiative were included. A sub-cohort of 250 radiographs were annotated for landmarks relevant to knee alignment and used to train a deep learning (U-Net) workflow for angle calculation on the entire database. The radiological ankle centre was defined as the midpoint of the superior talus edge/tibial plafond. Knee alignment (hip-knee-ankle angle) was compared against 1) midpoint of the most prominent malleoli points, 2) midpoint of the soft-tissue overlying malleoli, and 3) midpoint of the soft-tissue sulcus above the malleoli. Results. A total of 932 bilateral full-limb radiographs (1,864 knees) were measured at a rate of 20.63 seconds/image. The knee alignment using the radiological ankle centre was accurate against ground truth radiologist measurements (inter-class correlation coefficient (ICC) = 0.99 (0.98 to 0.99)). Compared to the radiological ankle centre, the mean midpoint of the malleoli was 2.3 mm (SD 1.3) lateral and 5.2 mm (SD 2.4) distal, shifting alignment by 0.34° (SD 2.4°) valgus, whereas the midpoint of the soft-tissue sulcus was 4.69 mm (SD 3.55) lateral and 32.4 mm (SD 12.4) proximal, shifting alignment by 0.65° (SD 0.55°) valgus. On the intermalleolar line, measuring a point at 46% (SD 2%) of the intermalleolar width from the medial malleoli (2.38 mm medial adjustment from midpoint) resulted in knee alignment identical to using the radiological ankle centre. Conclusion. The current study leveraged AI to create a consistent and objective model that can estimate patient-specific adjustments necessary for optimal landmark usage in extramedullary and computer-guided navigation for tibial coronal alignment to match radiological planning. Cite this article: Bone Jt Open 2022;3(10):767–776
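To see why a millimetre-scale shift in the ankle landmark changes the measured alignment by a fraction of a degree, the hip-knee-ankle geometry can be sketched in two dimensions. The coordinates below, including the 400 mm segment lengths, are hypothetical, and the sign of the result depends on image orientation and limb side, so only the magnitude is meaningful here. This is not the study's U-Net workflow.

```python
import math


def hka_deviation_deg(hip, knee, ankle):
    """Signed angle in degrees between the femoral axis (hip to knee) and the
    tibial axis (knee to ankle). Zero means the three centres are collinear.

    Points are (x, y) pairs in millimetres in the coronal plane.
    """
    fx, fy = knee[0] - hip[0], knee[1] - hip[1]
    tx, ty = ankle[0] - knee[0], ankle[1] - knee[1]
    cross = fx * ty - fy * tx
    dot = fx * tx + fy * ty
    return math.degrees(math.atan2(cross, dot))


if __name__ == "__main__":
    hip, knee = (0.0, 0.0), (0.0, 400.0)   # hypothetical landmark positions
    straight = hka_deviation_deg(hip, knee, (0.0, 800.0))
    shifted = hka_deviation_deg(hip, knee, (2.3, 800.0))  # ankle point moved 2.3 mm sideways
    print(straight, abs(shifted))
```

With these assumed lengths, a 2.3 mm sideways shift of the ankle point changes the angle by about 0.33°, which is of the same order as the shifts reported above.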


The Bone & Joint Journal
Vol. 102-B, Issue 11 | Pages 1574 - 1581
2 Nov 2020
Zhang S Sun J Liu C Fang J Xie H Ning B

Aims. The diagnosis of developmental dysplasia of the hip (DDH) is challenging owing to extensive variation in paediatric pelvic anatomy. Artificial intelligence (AI) may represent an effective diagnostic tool for DDH. Here, we aimed to develop an anteroposterior pelvic radiograph deep learning system for diagnosing DDH in children and analyze the feasibility of its application. Methods. In total, 10,219 anteroposterior pelvic radiographs were retrospectively collected from April 2014 to December 2018. Clinicians labelled each radiograph using a uniform standard method. Radiographs were grouped according to age and into ‘dislocation’ (dislocation and subluxation) and ‘non-dislocation’ (normal cases and those with dysplasia of the acetabulum) groups based on clinical diagnosis. The deep learning system was trained and optimized using 9,081 radiographs; 1,138 test radiographs were then used to compare the diagnoses made by the deep learning system and clinicians. The accuracy of the deep learning system was determined using a receiver operating characteristic curve, and the consistency of acetabular index measurements was evaluated using Bland-Altman plots. Results. In all, 1,138 patients (242 males; 896 females; mean age 1.5 years (SD 1.79; 0 to 10)) were included in this study. The area under the receiver operating characteristic curve, sensitivity, and specificity of the deep learning system for diagnosing hip dislocation were 0.975, 276/289 (95.5%), and 1,978/1,987 (99.5%), respectively. Compared with clinical diagnoses, the Bland-Altman 95% limits of agreement for acetabular index, as determined by the deep learning system from the radiographs of non-dislocated and dislocated hips, were -3.27° to 2.94° and -7.36° to 5.36°, respectively (p < 0.001). Conclusion. The deep learning system was highly consistent, more convenient, and more effective for diagnosing DDH compared with clinician-led diagnoses. 
Deep learning systems should be considered for analysis of anteroposterior pelvic radiographs when diagnosing DDH. The deep learning system will improve the current artificially complicated screening referral process. Cite this article: Bone Joint J 2020;102-B(11):1574–1581
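The sensitivity and specificity fractions quoted in the Results can be reproduced directly from the counts given there. The sketch below does so; the false-negative and false-positive counts are derived by subtraction from the quoted denominators, and the function name is an assumption.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # 276 of 289 dislocated hips detected; 1,978 of 1,987 non-dislocated hips cleared.
    sens, spec = sensitivity_specificity(tp=276, fn=289 - 276, tn=1978, fp=1987 - 1978)
    print(f"sensitivity {sens:.1%}, specificity {spec:.1%}")
```

The two denominators sum to 2,276, twice the 1,138 test patients, which suggests the counts are per hip and not per patient.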


Bone & Joint Open
Vol. 2, Issue 10 | Pages 879 - 885
20 Oct 2021
Oliveira e Carmo L van den Merkhof A Olczak J Gordon M Jutte PC Jaarsma RL IJpma FFA Doornberg JN Prijs J

Aims

The number of convolutional neural networks (CNN) available for fracture detection and classification is rapidly increasing. External validation of a CNN on a temporally separate (separated by time) or geographically separate (separated by location) dataset is crucial to assess generalizability of the CNN before application to clinical practice in other institutions. We aimed to answer the following questions: are current CNNs for fracture recognition externally valid?; which methods are applied for external validation (EV)?; and, what are reported performances of the EV sets compared to the internal validation (IV) sets of these CNNs?

Methods

The PubMed and Embase databases were systematically searched from January 2010 to October 2020 according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement. The type of EV, characteristics of the external dataset, and diagnostic performance characteristics on the IV and EV datasets were collected and compared. Quality assessment was conducted using a seven-item checklist based on a modified Methodologic Index for NOn-Randomized Studies instrument (MINORS).


The Bone & Joint Journal
Vol. 104-B, Issue 8 | Pages 909 - 910
1 Aug 2022
Vigdorchik JM Jang SJ Taunton MJ Haddad FS


Orthopaedic Proceedings
Vol. 105-B, Issue SUPP_12 | Pages 59 - 59
23 Jun 2023
Hernigou P
Full Access

The variables involved in a robotic THA can exceed 52. There are many parameters, such as pelvic orientation with CT scan, templating, offset and leg length, acetabular reaming, femoral osteotomy, mapping the anatomy, predefining safe zones, robotic execution, femoral head size, and thickness of PE, each with several variables. This familiar number is the number of cards in a standard deck. The number of possible ways to shuffle the cards (and maybe to perform a THA), 52 factorial (52! ≈ 8 × 10^67), is greater than the number of atoms on earth! Thinking that artificial intelligence and robotics can solve these problems, some surgeons and implant manufacturers have turned to artificial intelligence and robotics. We asked two questions: 1) can a robot with artificial intelligence really process 52 variables that represent more than 10^67 combinations? 2) is the technology safe? Safety was ascertained by interrogating the Food and Drug Administration (FDA) database about software-related recalls in computer-assisted and robotic arthroplasty [1] between 2017 and 2022. 1) The best computers can only calculate around 100 thousand billion combinations (10^14), and with difficulty: it takes more than 100 days to arrive at this number of digits (10^14) after the decimal point for the number π (pi). We can, therefore, expect the robot to be imperfect. 2) For the FDA software-related recalls, 4,634 units were involved. The FDA-determined root causes were: software design (66.6%), design change (22.2%), manufacturing deployment (5.6%), and design manufacturing process (5.6%). Among the manufacturers’ reasons for recalls, a specific error was declared in 88.9% and a coding error in 43.8%. Of the software-related recalls, 94.4% were classified as class 2. Return of the device was the main action taken by firms (44.4%), followed by software update (38.9%). 3) In the same period, no robot complained about its surgeon!
Hip surgeon is as intelligent as a robot and almost twice as safe
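The size of 52 factorial quoted in the abstract is easy to check exactly, since Python integers have arbitrary precision. The short sketch below computes it and reports its order of magnitude; the helper names are illustrative.

```python
import math


def deck_orderings(n_cards=52):
    """Number of distinct orderings of n_cards distinguishable items (n factorial)."""
    return math.factorial(n_cards)


def order_of_magnitude(n):
    """Exponent of the leading power of ten of a positive integer."""
    return len(str(n)) - 1


if __name__ == "__main__":
    orderings = deck_orderings()
    print(orderings)
    print("order of magnitude: 10^%d" % order_of_magnitude(orderings))
```

The exact value has 68 digits and begins 8065..., that is, roughly 8 × 10^67.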


Bone & Joint 360
Vol. 12, Issue 4 | Pages 41 - 42
1 Aug 2023

The August 2023 Research Roundup360 looks at: Can artificial intelligence improve the readability of patient education materials?; What is the value of radiology input during a multidisciplinary orthopaedic oncology conference?; Periprosthetic joint infection in patients with multiple arthroplasties; Orthopedic Surgery and Anesthesiology Surgical Improvement Strategies Project - Phase III outcomes; Knot tying in arthroplasty and arthroscopy causes lesions to surgical gloves: a potential risk of infection; Vascular calcification of the ankle in plain radiographs equals diabetes mellitus?


Bone & Joint 360
Vol. 12, Issue 4 | Pages 16 - 20
1 Aug 2023

The August 2023 Knee Roundup360 looks at: Curettage and cementation of giant cell tumour of bone: is arthritis a given?; Anterior knee pain following total knee arthroplasty: does the patellar cement-bone interface affect postoperative anterior knee pain?; Nickel allergy and total knee arthroplasty; The use of artificial intelligence for the prediction of periprosthetic joint infection following aseptic revision total knee arthroplasty; Ambulatory unicompartmental knee arthroplasty: development of a patient selection tool using machine learning; Femoral asymmetry: a missing piece in knee alignment; Needle arthroscopy – a benefit to patients in the outpatient setting; Can lateral unicompartmental knees be done in a day-case setting?


Bone & Joint 360
Vol. 13, Issue 3 | Pages 28 - 31
3 Jun 2024

The June 2024 Wrist & Hand Roundup360 looks at: One-year outcomes of the anatomical front and back reconstruction for scapholunate dissociation; Limited intercarpal fusion versus proximal row carpectomy in the treatment of SLAC or SNAC wrist: results after 3.5 years; Prognostic factors for clinical outcomes after arthroscopic treatment of traumatic central tears of the triangular fibrocartilage complex; The rate of nonunion in the MRI-detected occult scaphoid fracture: a multicentre cohort study; Does correction of carpal malalignment influence the union rate of scaphoid nonunion surgery?; Provision of a home-based video-assisted therapy programme in thumb carpometacarpal arthroplasty; Is replantation associated with better hand function after traumatic hand amputation than after revision amputation?; Diagnostic performance of artificial intelligence for detection of scaphoid and distal radius fractures: a systematic review


Bone & Joint 360
Vol. 12, Issue 5 | Pages 27 - 30
1 Oct 2023

The October 2023 Wrist & Hand Roundup360 looks at: Distal radius fracture management: surgeon factors markedly influence decision-making; Fracture-dislocation of the radiocarpal joint: bony and capsuloligamentar management, outcomes, and long-term complications; Exploring the role of artificial intelligence chatbot in the management of scaphoid fractures; Role of ultrasonography for evaluation of nerve recovery in repaired median nerve lacerations; Four weeks versus six weeks of immobilization in a cast following closed reduction for displaced distal radial fractures in adult patients: a multicentre randomized controlled trial; Rehabilitation following flexor tendon injury in Zone 2: a randomized controlled study; On the road again: return to driving following minor hand surgery; Open versus single- or dual-portal endoscopic carpal tunnel release: a meta-analysis of randomized controlled trials


Bone & Joint Open
Vol. 4, Issue 3 | Pages 168 - 181
14 Mar 2023
Dijkstra H Oosterhoff JHF van de Kuit A IJpma FFA Schwab JH Poolman RW Sprague S Bzovsky S Bhandari M Swiontkowski M Schemitsch EH Doornberg JN Hendrickx LAM

Aims

To develop prediction models using machine-learning (ML) algorithms for 90-day and one-year mortality prediction in femoral neck fracture (FNF) patients aged 50 years or older based on the Hip fracture Evaluation with Alternatives of Total Hip arthroplasty versus Hemiarthroplasty (HEALTH) and Fixation using Alternative Implants for the Treatment of Hip fractures (FAITH) trials.

Methods

This study included 2,388 patients from the HEALTH and FAITH trials, with 90-day and one-year mortality proportions of 3.0% (71/2,388) and 6.4% (153/2,388), respectively. The mean age was 75.9 years (SD 10.8) and 65.9% of patients (1,574/2,388) were female. The algorithms included patient and injury characteristics. Six algorithms were developed, internally validated and evaluated across discrimination (c-statistic; discriminative ability between those with risk of mortality and those without), calibration (observed outcome compared to the predicted probability), and the Brier score (composite of discrimination and calibration).
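The Brier score mentioned above is the mean squared difference between predicted probabilities and observed 0/1 outcomes, so lower is better. The sketch below computes it, together with the score of a null model that predicts the event rate for everyone, which is a common reference point when outcomes are as rare as the 3.0% and 6.4% mortality proportions here. The example predictions are invented.

```python
def brier_score(probabilities, outcomes):
    """Mean squared error between predicted probabilities and binary outcomes."""
    if len(probabilities) != len(outcomes) or not outcomes:
        raise ValueError("need equally sized, non-empty inputs")
    return sum((p - o) ** 2 for p, o in zip(probabilities, outcomes)) / len(outcomes)


def null_model_brier(event_rate):
    """Brier score obtained by predicting the event rate for every patient."""
    return event_rate * (1.0 - event_rate)


if __name__ == "__main__":
    predicted = [0.02, 0.10, 0.01, 0.40, 0.05]   # hypothetical mortality risks
    observed = [0, 0, 0, 1, 0]
    print(brier_score(predicted, observed))
    print(null_model_brier(71 / 2388))            # 90-day mortality proportion
```

A model is usually only considered informative if its Brier score is below that of the null model for the same cohort.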


Bone & Joint Research
Vol. 12, Issue 9 | Pages 590 - 597
20 Sep 2023
Uemura K Otake Y Takashima K Hamada H Imagama T Takao M Sakai T Sato Y Okada S Sugano N

Aims

This study aimed to develop and validate a fully automated system that quantifies proximal femoral bone mineral density (BMD) from CT images.

Methods

The study analyzed 978 pairs of hip CT and dual-energy X-ray absorptiometry (DXA) measurements of the proximal femur (DXA-BMD) collected from three institutions. From the CT images, the femur and a calibration phantom were automatically segmented using previously trained deep-learning models. The Hounsfield units of each voxel were converted into density (mg/cm3). Then, a deep-learning model trained by manual landmark selection of 315 cases was developed to select the landmarks at the proximal femur to rotate the CT volume to the neutral position. Finally, the CT volume of the femur was projected onto the coronal plane, and the areal BMD of the proximal femur (CT-aBMD) was quantified. CT-aBMD correlated to DXA-BMD, and a receiver operating characteristic (ROC) analysis quantified the accuracy in diagnosing osteoporosis.


Bone & Joint Open
Vol. 5, Issue 2 | Pages 139 - 146
15 Feb 2024
Wright BM Bodnar MS Moore AD Maseda MC Kucharik MP Diaz CC Schmidt CM Mir HR

Aims. While internet search engines have been the primary information source for patients’ questions, artificial intelligence large language models like ChatGPT are trending towards becoming the new primary source. The purpose of this study was to determine if ChatGPT can answer patient questions about total hip (THA) and knee arthroplasty (TKA) with consistent accuracy, comprehensiveness, and easy readability. Methods. We posed the 20 most Google-searched questions about THA and TKA, plus ten additional postoperative questions, to ChatGPT. Each question was asked twice to evaluate for consistency in quality. Following each response, we responded with, “Please explain so it is easier to understand,” to evaluate ChatGPT’s ability to reduce response reading grade level, measured as Flesch-Kincaid Grade Level (FKGL). Five resident physicians rated the 120 responses on 1 to 5 accuracy and comprehensiveness scales. Additionally, they answered a “yes” or “no” question regarding acceptability. Mean scores were calculated for each question, and responses were deemed acceptable if ≥ four raters answered “yes.” Results. The mean accuracy and comprehensiveness scores were 4.26 (95% confidence interval (CI) 4.19 to 4.33) and 3.79 (95% CI 3.69 to 3.89), respectively. Out of all the responses, 59.2% (71/120; 95% CI 50.0% to 67.7%) were acceptable. ChatGPT was consistent when asked the same question twice, giving no significant difference in accuracy (t = 0.821; p = 0.415), comprehensiveness (t = 1.387; p = 0.171), acceptability (χ² = 1.832; p = 0.176), and FKGL (t = 0.264; p = 0.793). There was a significantly lower FKGL (t = 2.204; p = 0.029) for easier responses (11.14; 95% CI 10.57 to 11.71) than original responses (12.15; 95% CI 11.45 to 12.85). Conclusion. ChatGPT answered THA and TKA patient questions with accuracy comparable to previous reports of websites, with adequate comprehensiveness, but with limited acceptability as the sole information source. 
ChatGPT has potential for answering patient questions about THA and TKA, but needs improvement. Cite this article: Bone Jt Open 2024;5(2):139–146
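The Flesch-Kincaid Grade Level used above is a fixed formula over word, sentence, and syllable counts. The sketch below applies the standard formula to counts supplied by the caller. How the study counted syllables is not stated, so counting is deliberately left out, and the example counts are hypothetical.

```python
def flesch_kincaid_grade(total_words, total_sentences, total_syllables):
    """Flesch-Kincaid Grade Level:
    0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    """
    if total_words <= 0 or total_sentences <= 0:
        raise ValueError("word and sentence counts must be positive")
    words_per_sentence = total_words / total_sentences
    syllables_per_word = total_syllables / total_words
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59


if __name__ == "__main__":
    # Hypothetical response: 100 words, 5 sentences, 150 syllables.
    print(round(flesch_kincaid_grade(100, 5, 150), 2))
```

Shorter sentences and shorter words both lower the grade level, which is the direction of change the "easier to understand" prompt was testing.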


Bone & Joint Open
Vol. 4, Issue 11 | Pages 825 - 831
1 Nov 2023
Joseph PJS Khattak M Masudi ST Minta L Perry DC

Aims. Hip disease is common in children with cerebral palsy (CP) and can decrease quality of life and function. Surveillance programmes exist to improve outcomes by treating hip disease at an early stage using radiological surveillance. However, studies and surveillance programmes report different radiological outcomes, making it difficult to compare. We aimed to identify the most important radiological measurements and develop a core measurement set (CMS) for clinical practice, research, and surveillance programmes. Methods. A systematic review identified a list of measurements previously used in studies reporting radiological hip outcomes in children with CP. These measurements informed a two-round Delphi study, conducted among orthopaedic surgeons and specialist physiotherapists. Participants rated each measurement on a nine-point Likert scale (‘not important’ to ‘critically important’). A consensus meeting was held to finalize the CMS. Results. Overall, 14 distinct measurements were identified in the systematic review, with Reimer’s migration percentage being the most frequently reported. These measurements were presented over the two rounds of the Delphi process, along with two additional measurements that were suggested by participants. Ultimately, two measurements, Reimer’s migration percentage and femoral head-shaft angle, were included in the CMS. Conclusion. This use of a minimum standardized set of measurements has the potential to encourage uniformity across hip surveillance programmes, and may streamline the development of tools, such as artificial intelligence systems to automate the analysis in surveillance programmes. This core set should be the minimum requirement in clinical studies, allowing clinicians to add to this as needed, which will facilitate comparisons to be drawn between studies and future meta-analyses. Cite this article: Bone Jt Open 2023;4(11):825–831


Orthopaedic Proceedings
Vol. 105-B, Issue SUPP_13 | Pages 80 - 80
7 Aug 2023
Liu A Qian K Dorzi R Alabdullah M Anand S Maher N Kingsbury S Conaghan P Xie S
Full Access

Abstract. Introduction. Knee braces are limited to providing passive support. There is currently no brace available providing both continuous monitoring and active robot-assisted movements of the knee joint. This project aimed to develop a wearable intelligent motorised robotic knee brace to support and monitor rehabilitation for a range of knee conditions including post-surgical rehabilitation. This brace can be used at home providing ambulatory continuous passive movement, obviating the need for hospital admissions. Methodology. A wearable sensing system monitoring knee range of motion was developed to provide remote feedback to clinicians and real-time guidance for patients. A prototype of an exoskeleton providing dynamic motion assistance was developed to help patients complete their exercise goals and strengthen their muscles. The accuracy and reliability of those functions were validated in human participants during exercises including knee flexion/extension (FE) in bed and in chair, sit-to-stand and stand-to-sit. Results. The knee FE measurement from the sensing system showed high accuracy (correlation coefficient of 0.99) in human participants. The real-time FE data during exercises showed that the desired exoskeleton rotation fitted well with the participant's knee rotation. This indicated the exoskeleton could coordinate with the participant's knee motion by providing consistent motion assistance. The development of user interfaces to provide feedback is currently underway. Conclusion. A wearable robotic knee brace to monitor and support knee rehabilitation exercises was successfully developed. Further development of this device with the use of artificial intelligence has the potential to aid patient rehabilitation in a variety of knee conditions.


Orthopaedic Proceedings
Vol. 106-B, Issue SUPP_1 | Pages 140 - 140
2 Jan 2024
van der Weegen W Warren T Agricola R Das D Siebelt M
Full Access

Artificial Intelligence (AI) is becoming more powerful but is barely used to counter the growing healthcare burden. AI applications to increase efficiency in orthopedics are rare. We asked whether (1) we could train machine learning (ML) algorithms, based on answers from digitalized history-taking questionnaires, to predict the treatment of hip osteoarthritis (either conservative or surgical); and (2) such an algorithm could streamline clinical consultation. Multiple ML models were trained on 600 annotated (80% training, 20% test) digital history-taking questionnaires, acquired before consultation. The best-performing models, based on balanced accuracy and optimized by automated hyperparameter tuning, were built into our daily clinical orthopedic practice. Fifty patients with hip complaints (aged >45 years) had their treatment prospectively predicted and were scheduled (partly blinded, partly unblinded) for consultation with the physician assistant (conservative) or orthopedic surgeon (operative). Tailored patient information based on the prediction was automatically sent to a smartphone app. Level of evidence: IV. Random Forest and BernoulliNB were the most accurate ML models (0.75 balanced accuracy). Treatment prediction was correct in 45 out of 50 consultations (90%), p<0.0001 (sign and binomial test). Specialized consultations, in which patients predicted to need conservative treatment were seen by the physician assistant and those predicted to need surgery by the orthopedic surgeon, were highly appreciated and effective. The treatment strategy for hip osteoarthritis was accurately predicted from answers to digital history-taking questionnaires before patients entered the hospital. This can make outpatient consultation scheduling more efficient and tailor pre-consultation patient education.
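The two metrics quoted in this abstract can be illustrated with a short sketch. Balanced accuracy is the mean of the per-class recalls. The binomial test is shown here as a one-sided exact tail probability against a chance rate of 0.5, which is an assumption because the abstract does not state the null hypothesis. The function names and the toy labels are hypothetical.

```python
from math import comb

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, which is less sensitive to class imbalance."""
    recalls = []
    for c in sorted(set(y_true)):
        n_c = sum(1 for t in y_true if t == c)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        recalls.append(hits / n_c)
    return sum(recalls) / len(recalls)

def binomial_tail(successes, n, p0=0.5):
    """One-sided exact P(X >= successes) for X ~ Binomial(n, p0)."""
    return sum(
        comb(n, k) * p0 ** k * (1 - p0) ** (n - k)
        for k in range(successes, n + 1)
    )

# 45 correct predictions out of 50, tested against an assumed chance rate of 0.5.
p_one_sided = binomial_tail(45, 50)

# Toy labels ("s" = surgical, "c" = conservative), not study data.
toy_score = balanced_accuracy(["s", "s", "c", "c"], ["s", "c", "c", "c"])
```

Under these assumptions the tail probability is far below 0.0001, which is consistent with the p-value reported in the abstract.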


Orthopaedic Proceedings
Vol. 106-B, Issue SUPP_1 | Pages 141 - 141
2 Jan 2024
Wendlandt R Volpert T Schroeter J Schulz A Paech A
Full Access

Gait analysis is an indispensable tool for the scientific assessment and treatment of individuals whose ability to walk is impaired. The high cost of installation and operation is a major limitation for widespread use in clinical routine. Advances in Artificial Intelligence (AI) could significantly reduce the required instrumentation. A mobile phone could be all the equipment necessary for 3D gait analysis. MediaPipe Pose, provided by Google Research, is one such machine learning approach: it tracks the human body in monocular RGB video frames and detects 3D landmarks of the body. The aim of this study was to analyze the accuracy of gait phase detection based on the joint landmarks identified by the AI system. Motion data from 10 healthy volunteers walking on a treadmill at a fixed speed of 4.5 km/h (Callis, Sprintex, Germany) were recorded with a mobile phone (iPhone SE 2nd Generation, Apple). The video was processed with MediaPipe Pose (Version 0.9.1.0) using custom Python software. Gait phases (Initial Contact - IC and Toe Off - TO) were detected from the angular velocities of the lower legs. For the determination of ground truth, the movement was simultaneously recorded with the AS-200 System (LaiTronic GmbH, Innsbruck, Austria). The number of detected strides and the errors in IC detection and stance phase duration were calculated. In total, the reference system detected 1692 strides during the trials, of which the AI system identified 679. The absolute mean error (AME) in IC detection was 39.3 ± 36.6 ms, while the AME for stance duration was 187.6 ± 140 ms. Landmark detection is a challenging task for the AI system, as can clearly be seen by the rate of only 40% detected strides. As mentioned by Fadillioglu et al., the error in TO detection is higher than in IC detection.
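The abstract states that gait events were detected from the angular velocities of the lower legs but does not describe the detection rule. The sketch below therefore covers only the preceding step, under stated assumptions: the shank angle is taken in the image plane from knee and ankle landmark coordinates, and angular velocity is estimated by finite differences. All function names are hypothetical, and this is not the authors' software.

```python
import math

def shank_angles(knees, ankles):
    """Shank angle from vertical (rad) per frame.

    knees, ankles: lists of (x, y) landmark coordinates, with y pointing
    downwards as in image coordinates (an assumption of this sketch).
    """
    return [
        math.atan2(ax - kx, ay - ky)
        for (kx, ky), (ax, ay) in zip(knees, ankles)
    ]

def angular_velocity(angles, fps):
    """Central-difference angular velocity (rad/s); one-sided at the ends."""
    n = len(angles)
    if n < 2:
        raise ValueError("need at least two frames")
    dt = 1.0 / fps
    velocity = []
    for i in range(n):
        lo, hi = max(i - 1, 0), min(i + 1, n - 1)
        velocity.append((angles[hi] - angles[lo]) / ((hi - lo) * dt))
    return velocity

def mean_absolute_error(detected_ms, reference_ms):
    """Mean absolute timing error between matched events, in milliseconds."""
    pairs = list(zip(detected_ms, reference_ms))
    return sum(abs(d - r) for d, r in pairs) / len(pairs)

# Stride detection rate from the counts reported in the abstract.
detection_rate = 679 / 1692
```

Event detection would then search the angular velocity signal for characteristic peaks or zero crossings, and the choice of rule and thresholds strongly affects both the number of detected strides and the timing error.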


Orthopaedic Proceedings
Vol. 106-B, Issue SUPP_1 | Pages 116 - 116
2 Jan 2024
Belcastro L Zubkovs V Markocic M Sajjadi S Peez C Tognato R Boghossian AA Cattaneo S Grad S Basoli V
Full Access

Osteoarthritis (OA) is a degenerative joint disease affecting millions worldwide. Early detection of OA and monitoring its progression is essential for effective treatment and for preventing irreversible damage. Although sensors have emerged as a promising tool for monitoring analytes in patients, their application for monitoring the state of pathology is currently restricted to specific fields (such as diabetes). In this study, we present the development of an optical sensor system for real-time monitoring of inflammation based on the measurement of nitric oxide (NO), a molecule highly produced in tissues during inflammation. Single-walled carbon nanotubes (SWCNT) were functionalized with a single-stranded DNA (ssDNA) wrapping designed using an artificial intelligence approach and tested using S-nitroso-N-acetyl penicillamine (SNAP) as a standard released-NO marker. An optical SWIR reader with LED excitation at 650 nm, 730 nm and detecting emission above 1000 nm was developed to read the fluorescence signal from the SWCNTs. Finally, the SWCNT was embedded in GelMa to prove the feasibility of monitoring the release of NO in bovine chondrocyte and osteochondral inflamed cultures (1–10 ng/ml IL1β) monitored over 48 hours. The stability of the inflammation model and NO release was indirectly validated using the Griess and DAF-FM methods. A microfabricated sensor tag was developed to explore the possibility of using ssDNA-SWCNT in an ex vivo anatomic set-up for surgical feasibility, the limit of detection, and the stability under dynamic flexion. The SWCNT sensor was sensitive to NO in both in silico and in vitro conditions during the inflammatory response from chondrocyte and osteochondral plug cultures. The fluorescence signal decreased in the inflamed group compared to control, indicating increased NO concentration. The micro-tag was suitable and stable in joints showing a readable signal at a depth of up to 6 mm under the skin. 
The ssDNA-SWCNT technology showed the possibility of monitoring inflammation continuously in an in vitro set-up and good stability inside the joint. However, further in vivo studies are needed to prove the possibility of monitoring disease progression and treatment efficacy. Acknowledgments: The project was co-financed by Innosuisse (grant no. 56034.1 IP-LS).


Orthopaedic Proceedings
Vol. 105-B, Issue SUPP_9 | Pages 20 - 20
17 Apr 2023
Reimers N Huynh T Schulz A
Full Access

The objectives of this study were to evaluate the impact of the COVID-19 pandemic on the development of relevant emerging digital healthcare trends and to explore which digital healthcare trend the health industry needs most to support health care professionals (HCPs). A web survey of 39 questions using five-point Likert scales was conducted from 1 August 2020 to 31 October 2020. Of 260 participants invited, 90 answered the questionnaire. Of the participants, 11.9% were in the hospital/HCP sector, 22.2% in other healthcare sectors, 11.1% in the pharmaceutical sector, and 43.3% in the medical device and equipment industry. The five-point Likert scales ranged in all cases from 1 (strongly disagree) to 5 (strongly agree). As the three digital healthcare trends most strongly impacted by COVID-19, respondents named: remote management of patients by telemedicine, mean answer 4.44; shared data governance under patient control, mean answer 3.80; and new virtual interaction between HCPs and the medical industry, mean answer 3.76. Asked what level of readiness the healthcare system currently possesses to cope with the trends impacted by COVID-19, respondents rated: digital and efficient healthcare logistics, mean answer 1.54; integrated health care, mean answer 1.73; and use of big data and artificial intelligence, mean answer 2.03. Asked whether collaborative research in the form of digital data platforms for research data sharing and increasing collaboration with multi-centric consortia would have a positive impact on the healthcare sector, respondents showed high agreement, with a mean of 4.10 on the scale. We conclude that COVID-19 has produced high agreement on the need for advances in digitalization in the health care sector and in the collaboration of HCPs with the health care industry. Health care professionals are unsure how far the national health care sector is capable of transformation in healthcare logistics and integrated health care.


Orthopaedic Proceedings
Vol. 104-B, Issue SUPP_13 | Pages 42 - 42
1 Dec 2022
Abbas A Toor J Lex J Finkelstein J Larouche J Whyne C Lewis S
Full Access

Single level discectomy (SLD) is one of the most commonly performed spinal surgery procedures. Two key drivers of its cost of care are duration of surgery (DOS) and postoperative length of stay (LOS). Therefore, the ability to preoperatively predict SLD DOS and LOS has substantial implications for both hospital and healthcare system finances, scheduling, and resource allocation. As such, the goal of this study was to predict DOS and LOS for SLD using machine learning models (MLMs) constructed on preoperative factors using a large North American database. The American College of Surgeons (ACS) National Surgical Quality Improvement Program (NSQIP) database was queried for SLD procedures from 2014 to 2019. The dataset was split in a 60/20/20 ratio of training/validation/testing based on year. Various MLMs (traditional regression models, tree-based models, and multilayer perceptron neural networks) were used and evaluated according to 1) mean squared error (MSE), 2) buffer accuracy (the number of times the predicted target was within a predesignated buffer), and 3) classification accuracy (the number of times the correct class was predicted by the models). To ensure real-world applicability, the results of the models were compared to a mean regressor model. A total of 11,525 patients were included in this study. During validation, the neural network model (NNM) had the best MSEs for DOS (0.99) and LOS (0.67). During testing, the NNM had the best MSEs for DOS (0.89) and LOS (0.65). The NNM yielded the best 30-minute buffer accuracy for DOS (70.9%) and the best ≤120 min vs >120 min classification accuracy (86.8%). The NNM had the best 1-day buffer accuracy for LOS (84.5%) and the best ≤2 days vs >2 days classification accuracy (94.6%). All models were more accurate than the mean regressors for both DOS and LOS predictions. We successfully demonstrated that MLMs can be used to accurately predict the DOS and LOS of SLD based on preoperative factors. 
This big-data application has significant practical implications with respect to surgical scheduling and inpatient bedflow, as well as major implications for both private and publicly funded healthcare systems. Incorporating this artificial intelligence technique in real-time hospital operations would be enhanced by including institution-specific operational factors such as surgical team and operating room workflow
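The three evaluation metrics and the mean-regressor baseline named in this abstract can be sketched as follows. This is a minimal illustration, not the study's code. Treating the buffer as inclusive and expressing the accuracies as fractions are assumptions, and the function names and duration values are made up.

```python
def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def buffer_accuracy(y_true, y_pred, buffer):
    """Fraction of predictions within +/- buffer of the true value (inclusive)."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if abs(t - p) <= buffer)
    return hits / len(y_true)

def threshold_accuracy(y_true, y_pred, threshold):
    """Fraction of cases where truth and prediction fall in the same class,
    with classes defined as <= threshold versus > threshold."""
    hits = sum(
        1 for t, p in zip(y_true, y_pred)
        if (t <= threshold) == (p <= threshold)
    )
    return hits / len(y_true)

def mean_regressor(train_targets, n_cases):
    """Baseline that predicts the training mean for every case."""
    mean = sum(train_targets) / len(train_targets)
    return [mean] * n_cases

# Made-up durations of surgery in minutes, for illustration only.
dos_true = [60, 95, 130, 150]
dos_pred = [75, 90, 95, 170]
baseline = mean_regressor(dos_true, len(dos_true))

model_mse = mse(dos_true, dos_pred)
baseline_mse = mse(dos_true, baseline)
within_30_min = buffer_accuracy(dos_true, dos_pred, buffer=30)
class_acc_120 = threshold_accuracy(dos_true, dos_pred, threshold=120)
```

Comparing each model against the mean regressor shows whether it has learned anything beyond the average case, which is the role the baseline plays in the abstract.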


Orthopaedic Proceedings
Vol. 103-B, Issue SUPP_4 | Pages 94 - 94
1 Mar 2021
Gallo J Kudelka M Radvansky M Kriegova E
Full Access

Precision medicine, which tailors the patient pathway based on risk, prognosis, and treatment response, may benefit patients. Identifying risk factors that contribute to early treatment failure (development of events of interest) and, where possible, changing the prognosis by modifying these factors may improve the outcome and/or lower the risk of complications. There is an emerging goal to identify such parameters in total knee arthroplasty (TKA) and thus lower the risk of revision surgery. The goal of this study was to identify factors explaining the risk of early revision of TKA using an artificial intelligence method appropriate for this task. We applied a patient similarity network (PSN) to identify risk factors associated with early reoperations (n=109, 5.8%) in patients with TKA (n=1885). Next, an algorithm based on formal concept analysis was developed to support patient decisions on how to change modifiable personal characteristics with respect to the estimated probability of reoperation. Early reoperations were less frequent in women (4.4%, median time to reoperation 4.5 mo) than in men (8.2%, 10 mo), reaching the highest incidence in younger men (10.9%).


Orthopaedic Proceedings
Vol. 102-B, Issue SUPP_9 | Pages 31 - 31
1 Oct 2020
Jayakumar P Furlough K Uhler L Grogan-Moore M Gliklich R Rathouz P Bozic KJ
Full Access

Introduction. The application of artificial intelligence (A.I) using patient reported outcomes (PROs) to predict benefits, risks, and likelihood of improvement following surgery presents a new frontier in shared decision-making. The purpose of this study was to assess the impact of an A.I-enabled decision aid versus patient education alone on decision quality in patients with knee OA considering total knee replacement (TKR). Secondarily, we assessed the impact on shared decision-making, patient satisfaction, functional outcomes, consultation time, TKR rates, and treatment concordance. Methods. We performed a randomized controlled trial involving 130 new adult patients with OA-related knee pain. Patients were randomized to receive the decision aid (intervention group, n=65) or educational material only (control group, n=65) along with usual care. Both cohorts completed patient surveys including PROs at baseline and between 6–12 weeks following initial evaluation or TKR. Statistical analysis included linear mixed effect models, Mann-Whitney U tests to assess for differences between groups, and Fisher's exact test to evaluate variations in surgical rates and treatment concordance. Results. The intervention group showed greater decision quality (K-DQI, mean difference = 20%, p<0.0001), collaboration in decision-making (CollaboRATE, 12% (intervention group), 47% (control group) below median, p<0.0001), satisfaction with consultations (NRS-C, 14% (intervention group), 33% (control group) below median, p=0.008), and improvement in functional outcomes from baseline up to 12-week follow-up (KOOSJR, 4.9 pts higher (intervention group), p=0.029), without significantly impacting consultation time. No differences were observed in TKR rates or treatment concordance. Conclusion. 
A.I-enabled decision aids incorporating PROs in predictive algorithms can improve decision quality, level of shared decision-making, satisfaction with patient-provider consultations, and functional outcomes, without extending consultation times. The combination of advanced predictive technologies and patient reported data to forecast surgical outcomes presents a paradigm shift in shared decision making and the delivery of high value care for patients with knee OA


Orthopaedic Proceedings
Vol. 102-B, Issue SUPP_1 | Pages 4 - 4
1 Feb 2020
Oni J Yi P Wei J Kim T Sair H Fritz J Hager G
Full Access

Introduction. Automated identification of arthroplasty implants could aid in pre-operative planning and is a task which could be facilitated through artificial intelligence (AI) and deep learning. The purpose of this study was to develop and test the performance of a deep learning system (DLS) for automated identification and classification of knee arthroplasty (KA) on radiographs. Methods. We collected 237 AP knee radiographs with equal proportions of native knees, total KA (TKA), and unicompartmental KA (UKA), as well as 274 radiographs with equal proportions of Smith & Nephew Journey and Zimmer NexGen TKAs. Data augmentation was used to increase the number of images available for DLS development. These images were used to train, validate, and test deep convolutional neural networks (DCNN) to 1) detect the presence of TKA; 2) differentiate between TKA and UKA; and 3) differentiate between the 2 TKA models. Receiver operating characteristic (ROC) curves were generated with area under the curve (AUC) calculated to assess test performance. Results. The DCNNs trained to detect KA and to distinguish between TKA and UKA both achieved AUC of 1. In both cases, heatmap analysis demonstrated appropriate emphasis of the KA components in decision-making. The DCNN trained to distinguish between the 2 TKA models also achieved AUC of 1. Heatmap analysis of this DCNN showed emphasis of specific unique features of the TKA model designs for decision making, such as the anterior flange shape of the Zimmer NexGen TKA (Figure 1) and the tibial baseplate/stem shape of the Smith & Nephew Journey TKA (Figure 2). Conclusion. DCNNs can accurately identify presence of TKA and distinguish between specific designs. The proof-of-concept of these DCNNs may set the foundation for DCNNs to identify other prosthesis models and prosthesis-related complications. For any figures or tables, please contact the authors directly
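The area under the ROC curve used to assess test performance in this abstract can be computed without plotting the curve. The sketch below uses the rank-based definition, in which AUC is the probability that a randomly chosen positive case scores higher than a randomly chosen negative one, with ties counted as one half. The function name `roc_auc` and the scores are illustrative assumptions, not the study's data or code.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores.

    labels: 1 for the positive class (for example, TKA present), 0 otherwise.
    scores: classifier outputs, where higher means more likely positive.
    """
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))

# Perfectly separated toy scores: every positive outranks every negative.
auc_perfect = roc_auc([0, 0, 1, 1], [0.1, 0.2, 0.8, 0.9])

# One positive scores below one negative, so AUC drops below 1.
auc_imperfect = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

An AUC of 1 means that every positive image scored above every negative image in the evaluated set.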