Results 1 - 11 of 11
Bone & Joint Research
Vol. 13, Issue 9 | Pages 507 - 512
18 Sep 2024
Farrow L Meek D Leontidis G Campbell M Harrison E Anderson L

Despite the vast quantities of published artificial intelligence (AI) algorithms that target trauma and orthopaedic applications, very few progress to inform clinical practice. One key reason for this is the lack of a clear pathway from development to deployment. To assist with this process, we have developed the Clinical Practice Integration of Artificial Intelligence (CPI-AI) framework – a five-stage approach to the clinical practice adoption of AI in the setting of trauma and orthopaedics, based on the IDEAL principles (https://www.ideal-collaboration.net/). Adherence to the framework would provide a robust evidence-based mechanism for developing trust in AI applications, where the underlying algorithms are unlikely to be fully understood by clinical teams. Cite this article: Bone Joint Res 2024;13(9):507–512


Bone & Joint Open
Vol. 4, Issue 9 | Pages 696 - 703
11 Sep 2023
Ormond MJ Clement ND Harder BG Farrow L Glester A

Aims. The principles of evidence-based medicine (EBM) are the foundation of modern medical practice. Surgeons are familiar with the commonly used statistical techniques to test hypotheses, summarize findings, and provide answers within a specified range of probability. Based on this knowledge, they are able to critically evaluate research before deciding whether or not to adopt the findings into practice. Recently, there has been an increased use of artificial intelligence (AI) to analyze information and derive findings in orthopaedic research. These techniques use a set of statistical tools that are increasingly complex and may be unfamiliar to the orthopaedic surgeon. It is unclear if this shift towards less familiar techniques is widely accepted in the orthopaedic community. This study aimed to provide an exploration of understanding and acceptance of AI use in research among orthopaedic surgeons. Methods. Semi-structured in-depth interviews were carried out on a sample of 12 orthopaedic surgeons. Inductive thematic analysis was used to identify key themes. Results. The four intersecting themes identified were: 1) validity in traditional research, 2) confusion around the definition of AI, 3) an inability to validate AI research, and 4) cautious optimism about AI research. Underpinning these themes is the notion of a validity heuristic that is strongly rooted in traditional research teaching and embedded in medical and surgical training. Conclusion. Research involving AI sometimes challenges the accepted traditional evidence-based framework. This can give rise to confusion among orthopaedic surgeons, who may be unable to confidently validate findings. In our study, the impact of this was mediated by cautious optimism based on an ingrained validity heuristic that orthopaedic surgeons develop through their medical training. Adding to this, the integration of AI into everyday life works to reduce suspicion and aid acceptance. 
Cite this article: Bone Jt Open 2023;4(9):696–703


Orthopaedic Proceedings
Vol. 103-B, Issue SUPP_3 | Pages 30 - 30
1 Mar 2021
Gerges M Eng H Chhina H Cooper A

Bone age is a radiographical assessment used in pediatric medicine due to its relative objectivity in determining biological maturity compared to chronological age and size [1]. Currently, Greulich and Pyle (GP) is one of the most common methods used to determine bone age from hand radiographs [2–4]. In recent years, new methods have been developed to increase the efficiency of bone age analysis, such as the shorthand bone age (SBA) method and automated artificial intelligence algorithms. The purpose of this study is to evaluate the accuracy and reliability of these two methods and examine whether the reduction in analysis time compromises their accuracy. Two hundred thirteen males and 213 females were selected. Each participant had their bone age determined by two separate raters using the GP (M1) and SBA (M2) methods. Three weeks later, the two raters repeated the analysis of the radiographs. The raters timed themselves using an online stopwatch while analyzing the radiograph on a computer screen. De-identified radiographs were securely uploaded to an automated algorithm developed by a group of radiologists in Toronto. The gold standard was determined to be the radiology report attached to each radiograph, written by experienced radiologists using GP (M1). For intra-rater variability, intraclass correlation analysis between trial 1 (T1) and trial 2 (T2) for each rater and method was performed. For inter-rater variability, intraclass correlation was performed between rater 1 (R1) and rater 2 (R2) for each method and trial. Intraclass correlation between each method and the gold standard fell within the 0.8–0.9 range, highlighting significant agreement. Most of the comparisons showed a statistically significant difference between the two new methods and the gold standard; however, this difference may not be clinically significant, as it ranges between 0.25 and 0.5 years.
A bone age is considered clinically abnormal if it falls outside 2 standard deviations of the chronological age; standard deviations are calculated and provided in the GP atlas [6–8]. For a 10-year-old female, 2 standard deviations constitute 21.6 months, which far exceeds the difference reported here between SBA, the automated algorithm, and the gold standard. The median time for completion using the GP method was 21.83 seconds for rater 1 and 9.30 seconds for rater 2. In comparison, SBA required a median time of 7 seconds for rater 1 and 5 seconds for rater 2. The automated method had no time constraint, as bone age was determined immediately upon radiograph upload. The correlation between the two trials in each method and rater (e.g. R1M1T1 vs R1M1T2) was excellent (κ = 0.9–1), confirming the reliability of the two new methods. Similarly, the correlation between the two raters in each method and trial (e.g. R1M1T1 vs R2M1T1) fell within the 0.9–1 range. This indicates limited variability between raters who may use these two methods. The shorthand bone age method and an artificial intelligence automated algorithm produced values that are in agreement with the gold standard Greulich and Pyle, while reducing analysis time and maintaining high inter-rater and intra-rater reliability.
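The comparison of scale in the abstract above (a reported difference of 0.25 to 0.5 years against a 2 standard deviation threshold of 21.6 months for a 10-year-old female) can be checked with a few lines of arithmetic. This is only an illustrative sketch: the two figures come from the abstract, while the variable names are assumptions made here.

```python
# Figures quoted in the abstract; variable names are illustrative assumptions.
MONTHS_PER_YEAR = 12
two_sd_months = 21.6                     # 2 SD for a 10-year-old female (GP atlas)
reported_difference_years = (0.25, 0.5)  # range of differences vs the gold standard

# Convert the reported range from years to months.
difference_months = tuple(y * MONTHS_PER_YEAR for y in reported_difference_years)

# Share of the clinical threshold taken up by the largest reported difference.
largest_share = max(difference_months) / two_sd_months
within_threshold = max(difference_months) < two_sd_months

print(difference_months, round(largest_share, 2), within_threshold)
```

Under these figures, the largest reported difference is 6 months, well under a third of the 21.6-month threshold.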


Orthopaedic Proceedings
Vol. 105-B, Issue SUPP_2 | Pages 39 - 39
10 Feb 2023
Lutter C Grupp T Mittelmeier W Selig M Grover P Dreischarf M Rose G Bien T

Polyethylene wear represents a significant risk factor for the long-term success of knee arthroplasty [1]. This work aimed to develop and validate in vivo an automated algorithm for accurate and precise AI-based wear measurement in knee arthroplasty using clinical AP radiographs for scientifically meaningful multi-centre studies.

Twenty postoperative radiographs (knee joint AP in standing position) after knee arthroplasty were analysed using the novel algorithm. A convolutional neural network-based segmentation is used to localize the implant components on the X-Ray, and a 2D-3D registration of the CAD implant models precisely calculates the three-dimensional position and orientation of the implants in the joint at the time of acquisition. From this, the minimal distance between the involved implant components is determined, and its postoperative change over time enables the determination of wear in the radiographs.

The measured minimum inlay height of 335 unloaded inlays, excluding the weight-induced deformation, served as ground truth for validation and was compared to the algorithmically calculated component distances from 20 radiographs.

With an average weight of 94 kg in the studied TKA patient cohort, it was determined that an average inlay height of 6.160 mm is expected in the patient. Based on the radiographs, the algorithm calculated a minimum component distance of 6.158 mm (SD = 81 µm), which deviated by 2 µm in comparison to the expected inlay height.

An automated method was presented that allows accurate and precise determination of the inlay height and subsequently the wear in knee arthroplasty based on a clinical radiograph and the CAD models. Precision and accuracy are comparable to the current gold standard RSA [2], but without relying on special radiographic setups. The developed method can therefore be used to objectively investigate novel implant materials with meaningful clinical cohorts, thus improving the quality of patient care.


Orthopaedic Proceedings
Vol. 104-B, Issue SUPP_13 | Pages 42 - 42
1 Dec 2022
Abbas A Toor J Lex J Finkelstein J Larouche J Whyne C Lewis S

Single level discectomy (SLD) is one of the most commonly performed spinal surgery procedures. Two key drivers of its cost of care are duration of surgery (DOS) and postoperative length of stay (LOS). Therefore, the ability to preoperatively predict SLD DOS and LOS has substantial implications for both hospital and healthcare system finances, scheduling and resource allocation. As such, the goal of this study was to predict DOS and LOS for SLD using machine learning models (MLMs) constructed on preoperative factors using a large North American database. The American College of Surgeons (ACS) National Surgical Quality Improvement Program (NSQIP) database was queried for SLD procedures from 2014 to 2019. The dataset was split in a 60/20/20 ratio of training/validation/testing based on year. Various MLMs (traditional regression models, tree-based models, and multilayer perceptron neural networks) were used and evaluated according to 1) mean squared error (MSE), 2) buffer accuracy (the number of times the predicted target was within a predesignated buffer), and 3) classification accuracy (the number of times the correct class was predicted by the models). To ensure real world applicability, the results of the models were compared to a mean regressor model. A total of 11,525 patients were included in this study. During validation, the neural network model (NNM) had the best MSEs for DOS (0.99) and LOS (0.67). During testing, the NNM had the best MSEs for DOS (0.89) and LOS (0.65). The NNM yielded the best 30-minute buffer accuracy for DOS (70.9%) and ≤120 min, >120 min classification accuracy (86.8%). The NNM had the best 1-day buffer accuracy for LOS (84.5%) and ≤2 days, >2 days classification accuracy (94.6%). All models were more accurate than the mean regressors for both DOS and LOS predictions. We successfully demonstrated that MLMs can be used to accurately predict the DOS and LOS of SLD based on preoperative factors.
This big-data application has significant practical implications with respect to surgical scheduling and inpatient bed flow, as well as major implications for both private and publicly funded healthcare systems. Incorporating this artificial intelligence technique in real-time hospital operations would be enhanced by including institution-specific operational factors such as surgical team and operating room workflow.
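The three evaluation measures named in the abstract above (mean squared error, buffer accuracy, and classification accuracy), together with the mean regressor baseline, can be illustrated with a minimal sketch. The function names and the toy durations below are assumptions made for illustration; they are not the study's code or data, and the accuracies are expressed as fractions rather than counts.

```python
from statistics import mean

def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    return mean((a - p) ** 2 for a, p in zip(actual, predicted))

def buffer_accuracy(actual, predicted, buffer):
    """Fraction of predictions that fall within +/- buffer of the actual value."""
    hits = sum(1 for a, p in zip(actual, predicted) if abs(a - p) <= buffer)
    return hits / len(actual)

def classification_accuracy(actual, predicted, threshold):
    """Fraction of cases where actual and predicted fall on the same side of threshold."""
    hits = sum(1 for a, p in zip(actual, predicted)
               if (a <= threshold) == (p <= threshold))
    return hits / len(actual)

def mean_regressor(training_targets, n):
    """Baseline that always predicts the mean of the training targets."""
    return [mean(training_targets)] * n

# Hypothetical durations of surgery in minutes (not from the study).
train_dos = [90, 110, 130, 150]
actual_dos = [100, 125, 80, 160]
model_pred = [95, 140, 115, 150]
baseline_pred = mean_regressor(train_dos, len(actual_dos))

for name, pred in (("model", model_pred), ("mean regressor", baseline_pred)):
    print(name,
          mean_squared_error(actual_dos, pred),
          buffer_accuracy(actual_dos, pred, buffer=30),
          classification_accuracy(actual_dos, pred, threshold=120))
```

Comparing every model against the mean regressor, as the abstract describes, shows whether a model adds anything beyond predicting the average case.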


Orthopaedic Proceedings
Vol. 102-B, Issue SUPP_1 | Pages 4 - 4
1 Feb 2020
Oni J Yi P Wei J Kim T Sair H Fritz J Hager G

Introduction. Automated identification of arthroplasty implants could aid in pre-operative planning and is a task which could be facilitated through artificial intelligence (AI) and deep learning. The purpose of this study was to develop and test the performance of a deep learning system (DLS) for automated identification and classification of knee arthroplasty (KA) on radiographs. Methods. We collected 237 AP knee radiographs with equal proportions of native knees, total KA (TKA), and unicompartmental KA (UKA), as well as 274 radiographs with equal proportions of Smith & Nephew Journey and Zimmer NexGen TKAs. Data augmentation was used to increase the number of images available for DLS development. These images were used to train, validate, and test deep convolutional neural networks (DCNN) to 1) detect the presence of TKA; 2) differentiate between TKA and UKA; and 3) differentiate between the 2 TKA models. Receiver operating characteristic (ROC) curves were generated with area under the curve (AUC) calculated to assess test performance. Results. The DCNNs trained to detect KA and to distinguish between TKA and UKA both achieved AUC of 1. In both cases, heatmap analysis demonstrated appropriate emphasis of the KA components in decision-making. The DCNN trained to distinguish between the 2 TKA models also achieved AUC of 1. Heatmap analysis of this DCNN showed emphasis of specific unique features of the TKA model designs for decision making, such as the anterior flange shape of the Zimmer NexGen TKA (Figure 1) and the tibial baseplate/stem shape of the Smith & Nephew Journey TKA (Figure 2). Conclusion. DCNNs can accurately identify presence of TKA and distinguish between specific designs. The proof-of-concept of these DCNNs may set the foundation for DCNNs to identify other prosthesis models and prosthesis-related complications. For any figures or tables, please contact the authors directly
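The area under the ROC curve reported in the abstract above equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes it directly from that pairwise definition. The function name, labels, and scores are hypothetical and serve only to show what an AUC of 1 means; this is not the authors' evaluation code.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))

# Hypothetical classifier outputs: 1 = TKA present, 0 = native knee.
labels = [0, 0, 1, 1]
perfect_scores = [0.1, 0.4, 0.8, 0.9]     # every positive outranks every negative
imperfect_scores = [0.1, 0.4, 0.35, 0.8]  # one positive scored below a negative

print(roc_auc(labels, perfect_scores))
print(roc_auc(labels, imperfect_scores))
```

An AUC of 1 therefore means that, on the test set, no image without the target class was scored above any image with it.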


Orthopaedic Proceedings
Vol. 102-B, Issue SUPP_2 | Pages 84 - 84
1 Feb 2020
Deckx J Jacobs M Dupraz I Utz M

INTRODUCTION. Statistical shape models (SSM) have become a common tool to create reference models for design input and verification of total joint implants. In a recent discussion paper around Artificial Intelligence and Machine Learning, the FDA emphasizes the importance of independent test data [1]. A leave-one-out test is a standard way to evaluate the generalization ability of an SSM [2]; however, this test does not fulfill the independence requirement of the FDA. In this study, we constructed an SSM of the knee (femur and tibia). In addition to the standard leave-one-out validation, we used an independent test set of patients from a different geographical region than the patients used to build the SSM. We assessed the ability of the SSM to predict the shapes of knees in this independent test set. METHODS. A dataset of 82 computed tomography (CT) scans of Caucasian patients (42 male, 40 female) from 11 different geographic locations in France, Germany, Austria, Italy and Australia was used as the training set to build an SSM of the femur and tibia. A leave-one-out test was performed to assess the ability of the SSM to predict shapes within the training set. A test dataset of 4 CT scans of Caucasian patients from Russia was used for the validation. The SSM was fitted onto each of the femur and tibia shapes and the root mean square error (RMSE) was measured. RESULTS. The leave-one-out tests showed that the femur and tibia SSMs were able to predict patients in the input population with an RMSE of 0.59 ± 0.1 mm (average ± standard deviation) for the femur and 0.70 ± 0.1 mm for the tibia. The validation test showed that the femur and tibia SSMs were able to predict the shapes of the Russian patients with an RMSE of 0.62 ± 0.1 mm for the femur and 0.71 ± 0.1 mm for the tibia. DISCUSSION.
There were no significant differences in the ability of the SSM to predict femur and tibia shapes of patients in a new geographic region compared to the ability of the SSM to predict shapes within the training set. CONCLUSIONS. Based on this study, 11 different geographic locations in France, Germany, Austria, Italy and Australia provide a complete sample of the Caucasian population. Using an independent set of CT scans is a valuable tool to further validate the generalization ability of an SSM. For any figures or tables, please contact authors directly
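The two validation steps in the abstract above, a leave-one-out test and an RMSE between fitted and actual shapes, can be sketched as follows. To stay short, this sketch replaces the statistical shape model with a plain mean shape, which is a deliberate simplification and not the authors' method. It also assumes shapes are lists of corresponding 3D points, and all names and coordinates are hypothetical.

```python
import math

def rmse(fitted_points, target_points):
    """Root mean square of point-to-point distances between corresponding points."""
    squared = [math.dist(f, t) ** 2 for f, t in zip(fitted_points, target_points)]
    return math.sqrt(sum(squared) / len(squared))

def mean_shape(shapes):
    """Average each corresponding point across shapes (stand-in for an SSM)."""
    return [tuple(sum(coord) / len(shapes) for coord in zip(*points))
            for points in zip(*shapes)]

def leave_one_out_rmse(shapes):
    """Hold out each shape in turn and predict it from the remaining shapes."""
    errors = []
    for i, held_out in enumerate(shapes):
        others = shapes[:i] + shapes[i + 1:]
        errors.append(rmse(mean_shape(others), held_out))
    return errors

# Hypothetical shapes with two corresponding points each (units: mm).
training_shapes = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 0.0, 3.0)],
]
print(leave_one_out_rmse(training_shapes))

# An independent test shape is never part of the model that predicts it.
independent_shape = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
print(rmse(mean_shape(training_shapes), independent_shape))
```

The difference between the two tests is where the evaluated shape comes from: the leave-one-out shapes share a source population with the training data, while the independent shape does not.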


Orthopaedic Proceedings
Vol. 102-B, Issue SUPP_1 | Pages 133 - 133
1 Feb 2020
Borjali A Chen A Muratoglu O Varadarajan K

INTRODUCTION. Mechanical loosening of total hip replacement (THR) is primarily diagnosed using radiographs, which are diagnostically challenging and require review by experienced radiologists and orthopaedic surgeons. Automated tools that assist less-experienced clinicians and mitigate human error can reduce the risk of missed or delayed diagnosis. Thus the purposes of this study were to: 1) develop an automated tool to detect mechanical loosening of THR by training a deep convolutional neural network (CNN) using THR x-rays, and 2) visualize the CNN training process to interpret how it functions. METHODS. A retrospective study was conducted using previously collected imaging data at a single institution with IRB approval. Twenty-three patients with cementless primary THR who underwent revision surgery due to mechanical loosening (with a loose stem and/or a loose acetabular component) had their hip x-rays evaluated immediately prior to their revision surgery (32 “loose” x-rays). A comparison group comprised 23 patients who underwent primary cementless THR surgery with x-rays immediately after their primary surgery (31 “not loose” x-rays). Fig. 1 shows examples of “not loose” and “loose” THR x-rays. DenseNet201-CNN was utilized by swapping the top layer with a binary classifier using 90:10 split-validation [1]. A CNN pre-trained on ImageNet [2] and a CNN that was not pre-trained (initial zero weights) were implemented to compare the results. Saliency maps were implemented to indicate the importance of each pixel of a given x-ray on the CNN's performance [3]. RESULTS. Fig. 2 shows the saliency maps for an example x-ray and the corresponding accuracy of the CNN on the entire validation dataset at different stages of the training for both pre-trained (Fig. 2a) and not pre-trained (Fig. 2b) CNNs. Colored regions in the saliency maps, where red denotes higher relative influence than blue, indicate the most influential regions on the CNN's performance.
The pre-trained CNN achieved higher accuracy (87%) on the validation set x-rays than the not pre-trained CNN (62%) after 10 epochs. The pre-trained CNN's saliency map at 10 epochs identified significant influence of bone-implant interaction regions on the CNN's performance. This indicates that the CNN is ‘looking’ at the clinically relevant features in the x-rays. The saliency maps also demonstrated that the pre-trained CNN quickly learned where to ‘look’, while the not pre-trained CNN struggled. DISCUSSION. An automated tool to detect mechanical loosening of THR was developed that can potentially assist clinicians with accurate diagnosis. By visualizing the influential regions of the x-ray on the CNN performance, this study shed light on the CNN learning process and demonstrated that the CNN is ‘looking’ at the clinically relevant features to classify the x-rays. This visualization is crucial for building trust in the automated system: by showing how it functions, it increases confidence in the application of artificial intelligence to the field of orthopaedics. This study also demonstrated that pre-training a CNN can accelerate the learning process and achieve high accuracy even on a small dataset. For any figures or tables, please contact the authors directly.


Bone & Joint Open
Vol. 4, Issue 4 | Pages 250 - 261
7 Apr 2023
Sharma VJ Adegoke JA Afara IO Stok K Poon E Gordon CL Wood BR Raman J

Aims

Disorders of bone integrity carry a high global disease burden, frequently requiring intervention, but there is a paucity of methods capable of noninvasive real-time assessment. Here we show that miniaturized handheld near-infrared spectroscopy (NIRS) scans, operated via a smartphone, can assess structural human bone properties in under three seconds.

Methods

A hand-held NIR spectrometer was used to scan bone samples from 20 patients and predict: bone volume fraction (BV/TV); and trabecular (Tb) and cortical (Ct) thickness (Th), porosity (Po), and spacing (Sp).


Bone & Joint Open
Vol. 2, Issue 2 | Pages 111 - 118
8 Feb 2021
Pettit M Shukla S Zhang J Sunil Kumar KH Khanduja V

Aims

The ongoing COVID-19 pandemic has disrupted and delayed medical and surgical examinations where attendance is required in person. Our article aims to outline the validity of online assessment, the range of benefits to both candidate and assessor, and the challenges to its implementation. In addition, we propose pragmatic suggestions for its introduction into medical assessment.

Methods

We reviewed the literature concerning the present status of online medical and surgical assessment to establish the perceived benefits, limitations, and potential problems with this method of assessment.


Bone & Joint Open
Vol. 1, Issue 6 | Pages 236 - 244
11 Jun 2020
Verstraete MA Moore RE Roche M Conditt MA

Aims

The use of technology to assess balance and alignment during total knee surgery can provide an overload of numerical data to the surgeon. Meanwhile, this quantification holds the potential to clarify and guide the surgeon through the surgical decision process when selecting the appropriate bone recut or soft tissue adjustment when balancing a total knee. Therefore, this paper evaluates the potential of deploying supervised machine learning (ML) models to select a surgical correction based on patient-specific intra-operative assessments.

Methods

Based on a clinical series of 479 primary total knees and 1,305 associated surgical decisions, various ML models were developed. These models identified the indicated surgical decision based on the available intra-operative alignment and tibiofemoral load data.