External validation of machine learning predictive models is achieved by evaluating model performance on groups of patients different from those used for algorithm development. This important step is rarely performed, which inhibits clinical translation of newly developed models. Recently, machine learning was used to develop a tool that can quantify revision risk for a patient undergoing primary anterior cruciate ligament (ACL) reconstruction (https://swastvedt.shinyapps.io/calculator_rev/). The source data comprised nearly 25,000 patients with primary ACL reconstruction recorded in the Norwegian Knee Ligament Register (NKLR). The result was a well-calibrated tool capable of predicting revision risk one, two, and five years after primary ACL reconstruction with moderate accuracy. The purpose of this study was to determine the external validity of the NKLR model by assessing algorithm performance when applied to patients from the Danish Knee Ligament Registry (DKLR). The primary outcome measure of the NKLR model was probability of revision ACL reconstruction within 1, 2, and/or 5 years. For the index study, 24 predictor variables from the NKLR were included, and the models eliminated variables that did not significantly improve predictive ability, without sacrificing accuracy. The result was a well-calibrated algorithm developed using the Cox Lasso model that required only five variables (out of the original 24) for outcome prediction. For this external validation study, all DKLR patients with complete data for the five variables required for NKLR prediction were included. The five variables were: graft choice, femur fixation device, Knee Injury and Osteoarthritis Outcome Score (KOOS) Quality of Life subscale score at surgery, years from injury to surgery, and age at surgery. Predicted revision probabilities were calculated for all DKLR patients. Model performance was assessed using the same metrics as the NKLR study: concordance and calibration.
In total, 10,922 DKLR patients were included for analysis. Average follow-up time or time-to-revision was 8.4 (±4.3) years and the overall revision rate was 6.9%. Surgical technique trends (i.e., graft choice and fixation devices) and injury characteristics (i.e., concomitant meniscus and cartilage pathology) were dissimilar between registries. The model produced similar concordance when applied to the DKLR population compared to the original NKLR test data (DKLR: 0.68; NKLR: 0.68-0.69). Calibration was poorer for the DKLR population at one and five years after primary surgery but similar to the NKLR at two years. The NKLR machine learning algorithm demonstrated similar performance when applied to patients from the DKLR, suggesting that it is valid for application outside of the initial patient population. This represents the first machine learning model for predicting revision ACL reconstruction that has been externally validated. Clinicians can use this in-clinic calculator to estimate revision risk at a patient-specific level when discussing outcome expectations pre-operatively. While encouraging, it should be noted that the performance of the model on patients undergoing ACL reconstruction outside of Scandinavia remains unknown.
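To make the concordance figures above concrete, the following is a minimal sketch of Harrell's concordance index for right-censored follow-up data. It is not the authors' code: the function name `concordance_index`, the five toy patients, and their predicted risks are hypothetical, and tied follow-up times are simply skipped for brevity.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C for right-censored data (illustrative sketch).

    times       -- follow-up time or time to revision for each patient
    events      -- 1 if revision was observed, 0 if the patient was censored
    risk_scores -- predicted revision risk (higher means earlier revision expected)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            # Comparable pair: patient i was revised strictly before
            # patient j's observed time (ties in time are skipped here).
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs in the data")
    return concordant / comparable


# Hypothetical toy data, not registry values.
toy_times = [1.0, 2.0, 3.0, 4.0, 5.0]
toy_events = [1, 0, 1, 0, 0]
toy_risks = [0.30, 0.10, 0.05, 0.20, 0.02]
toy_c = concordance_index(toy_times, toy_events, toy_risks)
print(round(toy_c, 3))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which is the scale on which the reported 0.68 should be read.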
We test the clinical validity and financial implications of the proposed Choosing Wisely statement: “Using ultrasound as a screening test for shoulder instability is inappropriate in people under 30 years of age, unless there is clinical suspicion of a rotator cuff tear.” A retrospective chart review from a specialist shoulder surgeon's practice over a two-year period identified 124 patients under the age of 30 referred with shoulder instability. Of these, 41 had already had ultrasound scans performed prior to specialist review. The scan results and patient files were reviewed to determine the reported findings on the scans and whether these findings were clinically relevant to diagnosis and decision-making. Comparison was made with subsequent MRI scan results. Data obtained from the Accident Compensation Corporation (ACC) recorded the number of cases and costs incurred for ultrasound scans of the shoulder in patients under 30 years old over a 10-year period. There were no cases where the ultrasound scan was considered useful in decision-making. No patient had a full thickness rotator cuff tear. Thirty-nine of the 41 patients subsequently had MRI scans. The cost to the ACC for funding ultrasound scans in patients under 30 has increased over the last decade and exceeded one million dollars in the 2020/2021 financial year. In addition, patients pay a surcharge for this test. The proposed Choosing Wisely statement is valid. This evidence supports the view that ultrasound is an unnecessary investigation for patients with shoulder instability unless there is clinical suspicion of a rotator cuff tear. Ultrasound also incurs costs to the insurer (ACC) and the patient. We recommend x-rays and, if further imaging is indicated, High Tech Imaging with MRI and sometimes CT scans in these patients.
Augmented reality simulators offer opportunities for practice of orthopaedic procedures outside of theatre environments. We developed an augmented reality simulator that allows trainees to practice pinning of paediatric supracondylar humeral fractures (SCHF) in a radiation-free environment at no extra risk to patients. The simulator is composed of a tangible child's elbow model and simulated fluoroscopy on a tablet device. The treatment of these fractures is likely one of the first procedures involving X-ray guided wire insertion that trainee orthopaedic surgeons will encounter. This study aims to examine the extent of improvement simulator training provides to real-world operating theatre performance. This multi-centre study will involve four cohorts of New Zealand orthopaedic trainees in their SET1 year. Trainees with no simulator exposure in 2019 to 2021 will form the comparator cohort. Trainees in 2022 will receive additional, regular simulator training as the intervention cohort. The comparator cohort's performance in paediatric SCHF surgery will be retrospectively audited using routinely collected operative outcomes and parameters over a six-month period. The performance of the intervention cohort will be collected in the same way over a comparable period. The data collected for both groups will be used to examine whether additional training with an augmented reality simulator shows improved real-world surgical outcomes compared to traditional surgical training. This protocol has been approved by the University of Otago Health Ethics committee, and the study is due for completion in 2024. This study is the first nation-wide transfer validity study of a surgical simulator in New Zealand. As of September 2022, all trainees in the intervention cohort and eight trainees in the retrospective comparator cohort have been recruited via email. We present this protocol to maintain transparency of the prespecified research plans and ensure robust scientific methods.
This protocol may also assist other researchers conducting similar studies within small populations.
There are concerns that patient-reported outcome measures (PROMs) currently used for adults requiring, undergoing, or having undergone lower limb reconstruction (LLR) are not adequately capturing the range of experiences important to these patients. The ‘Patient-Reported Outcome Measure for Lower Limb Reconstruction’ (PROLLIT) study developed a conceptual framework of outcomes identified as important and relevant by adult LLR patients. This review explored whether existing PROMs address these outcomes and exhibit content validity in this population. A range of key PROMs was selected (n=32). Systematic and hand-searches were employed to find studies assessing content validity of these PROMs in the adult LLR population, along with PROM content and development information. A systematic review of content validity of the measures was carried out following ‘COnsensus-based Standards for the selection of health Measurement Instruments’ (COSMIN) guidance, alongside conceptual mapping of the content of the PROMs against the PROLLIT conceptual framework.
Patients undergoing limb reconstruction surgery often face a challenging and lengthy process to complete their treatment journey. The majority of existing outcome measures do not adequately capture the patient-reported outcomes relevant to this patient group in a single measure. Following a previous systematic review, the Stanmore Limb Reconstruction Score (SLRS) was designed with the intent to address this need for an effective instrument to measure patient-reported outcomes in limb reconstruction patients. The SLRS was designed following structured interviews with a group of patients who have undergone limb reconstruction surgery, limb reconstruction surgeons, specialist nurses and physiotherapists. It then underwent further adjustment for language and clarity. The score was then trialled on 10 patients who had been through the process of limb reconstruction surgery, with subsequent structured questioning to understand its perceived suitability.
Objective evaluations of resident performance can be difficult to simulate. A novel competency-based surgical OSCE was developed to evaluate surgical skill. The goal of this study was to test its construct validity by comparison with previously validated Ottawa scores (O-scores) and Orthopaedic in-training evaluation scores (OITE). An OSCE designed to simulate typical general orthopaedic surgical cases was developed to evaluate resident surgical performance. Post-graduate year (PGY) 3–5 trainees have an encounter (interview and physical exam) with a standardized patient and perform a correlating surgery on a cadaver. Examiners evaluate all components of the treatment plan, provide an overall score on the OSCE, and also provide an O-score on overall surgical performance. Convergent and divergent validity were assessed by comparing OSCE scores to O-scores and OITE scores. SPSS was used for statistical analysis. ANOVA was used to compare PGY averages, and Pearson correlation coefficients were calculated to compare OSCE versus O-score and OITE scores. A total of 96 simulated surgical cases were evaluated over a 3-year period for 24 trainees. There was a significant difference in OSCE scores based on year of training (PGY3: 6.06/15, PGY4: 8.16/15, PGY5: 11.14/15; p < 0.001). OSCE and O-scores demonstrated a strong positive correlation of 0.89, while OSCE and OITE scores demonstrated a moderate positive correlation of 0.68. OSCE scores demonstrated strong convergent and moderate divergent correlation. A positive trajectory based on level of training and stronger correlations with established, validated scores support the construct validity of the novel surgical OSCE.
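The Pearson correlation coefficients reported above were computed in SPSS. As an illustration only, the same statistic can be sketched in a few lines of Python. The function name `pearson_r` and the paired scores below are hypothetical and are not study data.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two paired score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long lists with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired ratings for five simulated cases (not study data).
toy_osce = [6, 8, 11, 7, 12]     # OSCE score out of 15
toy_o_score = [2, 3, 4, 3, 5]    # O-score on an assumed 1-5 scale
toy_r = pearson_r(toy_osce, toy_o_score)
print(round(toy_r, 2))
```

Values near +1 indicate that cases rated highly on one instrument are also rated highly on the other, which is the sense in which 0.89 is described as strong and 0.68 as moderate.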
Accurate and reproducible radiological assessment of shoulder replacement prostheses over time is important for identifying failure or providing reassurance. A number of clearly defined radiological parameters have been described to help standardise the radiological assessment of prostheses. To our knowledge, this is the first study conducted to test the reproducibility and reliability of these measurements. The aim of this work was to test intraobserver reproducibility and interobserver reliability in the measurement of humeral component orientation (HCO), humeral head offset (HHO), humeral head size (HHS), humeral head height (HHH), and acromiohumeral distance (AHD).
The NDI is a simple 10-item questionnaire used to assess patients with neck pain. The original validation was performed on 52 patients with neck pain and the test-retest on 17 whiplash patients with a 2-day interval. The SF36 measures functional ability, wellbeing and the overall health of patients. It is used in health economics to assess the health utility, gain and economic impact of medical interventions. The objectives were to independently validate the NDI in patients with neck pain and to draw a comparison between the NDI and SF36. A total of 160 patients with neck pain attending the spinal clinic completed self-assessment questionnaires. A second questionnaire was completed by 34 patients after a period of 1 to 2 weeks. The internal consistency of the NDI and SF36 was calculated using Cronbach's alpha. The test-retest reliability was assessed using the Bland and Altman method, and the concurrent validity between the two questionnaires was assessed using Pearson correlation. Both questionnaires showed robust internal consistency: SF36 alpha = 0.878 (se=0.014, 95%CI=0.843 to 0.906) and NDI alpha = 0.864 (se=0.017, 95%CI=0.825 to 0.894). The NDI correlated significantly with all eight domains of the SF36 (p<0.001). The individual scores for each of the ten items correlated significantly with the total disability score (p<0.001). The test-retest reliability of the NDI was acceptable. We have shown irrefutably that the NDI has good reliability and validity and that it stands up well to the SF36.
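The two reliability statistics named above, Cronbach's alpha and the Bland and Altman limits of agreement, can be sketched as follows. This is an illustrative sketch rather than the authors' analysis: the function names, the three-item toy questionnaire, and the repeat scores are all hypothetical, and the 1.96 multiplier assumes approximately normally distributed differences.

```python
from statistics import mean, stdev, variance


def cronbach_alpha(responses):
    """Cronbach's alpha; `responses` holds one list of item scores per patient."""
    k = len(responses[0])                      # number of questionnaire items
    item_columns = list(zip(*responses))       # regroup the scores by item
    sum_item_var = sum(variance(col) for col in item_columns)
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum_item_var / total_var)


def bland_altman_limits(first, second):
    """Mean test-retest difference (bias) and 95% limits of agreement."""
    diffs = [a - b for a, b in zip(first, second)]
    bias = mean(diffs)
    half_width = 1.96 * stdev(diffs)
    return bias, bias - half_width, bias + half_width


# Hypothetical answers from five patients to a three-item questionnaire.
toy_responses = [[3, 4, 3], [1, 2, 1], [4, 4, 5], [2, 2, 3], [5, 4, 5]]
toy_alpha = cronbach_alpha(toy_responses)

# Hypothetical total scores at the first and second completion.
toy_first = [10, 12, 14]
toy_second = [11, 11, 15]
toy_bias, toy_lower, toy_upper = bland_altman_limits(toy_first, toy_second)
print(round(toy_alpha, 3), round(toy_bias, 3))
```

Alpha rises as the items vary together across patients, while the limits of agreement describe how far a repeat score would be expected to fall from the first score for most patients.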
The Nerve Root Sedimentation Sign in transverse magnetic resonance imaging has been shown to discriminate well between selected patients with and without lumbar spinal stenosis (LSS), but the performance of this new test, when used in a broad patient population, is not yet known (Barz et al. 2010). We conducted a retrospective study of consecutive patients with suspected LSS from 2004–2006, before the sign had been described, to assess its association with health outcomes. Based on clinical and radiological diagnostics, patients had been treated with decompression surgery or conservative treatment (physical therapy, oral pain medication). Changes in the Oswestry Disability Index (ODI) from baseline to 24-month follow-up were compared between Sedimentation Sign positives and negatives in both treatment arms. Of the 146 included patients (52% female, mean age 59 years), 71 underwent surgery. Baseline ODI in this treatment arm was 52%; the sign was positive in 44 patients (mean ODI improvement 25 points) and negative in 27 (ODI improvement 24), with no significant difference between groups. In the 75 patients of the conservative treatment arm, baseline ODI was 44%; the sign was negative in 45 (ODI improvement 17) and positive in 30 (ODI improvement 5). Here a positive sign was associated with a smaller ODI improvement compared with sign negatives (t-test, p=0.003). Because the sign had not yet been described when these patients were treated, it could not have influenced treatment selection, which allowed an unbiased clinical validation of the Sedimentation Sign. In the conservative treatment arm a positive sign identifies a group of patients who are less likely to benefit. In these cases, surgery might be effective; however, this needs confirmation in prospective studies.
Patients undergoing limb reconstruction surgery often face a challenging and lengthy process to complete their treatment journey. The majority of existing outcome measures do not adequately capture the patient-reported outcomes relevant to this patient group in a single measure. Following a previous systematic review, the Stanmore Limb Reconstruction Score (SLRS) was designed with the intent to address this need for an effective instrument to measure patient-reported outcomes in limb reconstruction patients. We aim to assess the face validity of this score in a pilot study. The SLRS was designed following structured interviews with several groups including patients who have undergone limb reconstruction surgery, limb reconstruction surgeons, specialist nurses, and physiotherapists. This has subsequently undergone further adjustment for language and clarity. The score was then trialled on ten patients who had undergone limb reconstruction surgery, with subsequent structured questioning to understand the perceived suitability of the score.
The principles of evidence-based medicine (EBM) are the foundation of modern medical practice. Surgeons are familiar with the commonly used statistical techniques to test hypotheses, summarize findings, and provide answers within a specified range of probability. Based on this knowledge, they are able to critically evaluate research before deciding whether or not to adopt the findings into practice. Recently, there has been an increased use of artificial intelligence (AI) to analyze information and derive findings in orthopaedic research. These techniques use a set of statistical tools that are increasingly complex and may be unfamiliar to the orthopaedic surgeon. It is unclear if this shift towards less familiar techniques is widely accepted in the orthopaedic community. This study aimed to explore the understanding and acceptance of AI use in research among orthopaedic surgeons. Semi-structured in-depth interviews were carried out with a sample of 12 orthopaedic surgeons. Inductive thematic analysis was used to identify key themes.
Introduction. The Constant Score (CS) and the Oxford Shoulder Score (OSS) are shoulder scoring systems routinely used in the UK. Patients with Acromio-Clavicular Joint (ACJ) and Sterno-Clavicular Joint (SCJ) injuries and those with clavicle fractures tend to be younger and more active than those with other shoulder pathologies. While the CS takes into account the recreational outcomes for such patients, the weighting is very small. We developed the Nottingham Clavicle Score (NCS) specifically for this group of patients. Methods. We recruited 70 patients into a cohort study in which pre-operative and 6-month post-operative outcomes were evaluated using the CS, the OSS, the Imatani Score (IS) and the EQ-5D, and these scores were compared with the NCS. Reliability was assessed using Cronbach's alpha. Reproducibility of the NCS was assessed using the test/re-test method. Each of the 10 items of the NCS was evaluated for its sensitivity and contribution to the total score of 100.
Aim. To construct and validate a simple patient-related outcome measure scheme to quantify the disability caused by Dupuytren's Disease, thus enabling prioritisation of treatment, allowing reliable audit of surgical outcome, and supporting future research. Methods. The Southampton Dupuytren's Scoring System (SDSS) was developed in a staged fashion according to the recommendations of the Derby Outcomes Conference: item generation from a questionnaire filled in by 20 patients; item reduction to create a 20-question proforma; internal consistency (Cronbach's alpha); test-retest reliability (testing at a 3-week interval on 61 patients); field testing to assess the user-friendliness of the scoring system; sensitivity to change (standardised response mean); and construct validity, the ability of the SDSS to measure what it is supposed to measure, assessed by comparing the SDSS with QuickDASH (Disability of Arm, Shoulder and Hand). Results. Internal consistency: Cronbach's alpha was 0.87 (a Cronbach's alpha of 0.8–0.9 indicates acceptable reliability). Test-retest reliability: the test-retest correlation coefficient was 0.79 between SDSS scores at a three-week interval (high reliability). Field testing: the SDSS ratings were found to be higher than the QuickDASH ratings given by the patients who answered both questionnaires. Sensitivity to change: the standardised response mean indicated greater sensitivity to change for the SDSS than for QuickDASH (−1.76 vs −1.19, p>0.05). Construct
This study utilised NJR primary hip data from the 6th Annual Report to determine the rate of and indication for revision across cemented, uncemented, hybrid and resurfacing prosthetic groups. Regression analysis was performed to identify the influence of gender and ASA grade on these revision rates.