The Revision Hip Complexity Classification (RHCC) was developed through a modified Delphi process in 2022 to provide a comprehensive, reproducible framework for the multidisciplinary discussion of complex revision hip surgery. The aim of this study was to assess the validity, intra-rater reliability and inter-rater reliability of the RHCC. Radiographs and clinical vignettes of 20 consecutive patients who had undergone revision of Total Hip Arthroplasty (THA) at our unit during the previous 12-month period were provided to observers. Five observers, comprising 3 revision hip consultants, 1 hip fellow and 1 ST3-8 registrar, were familiarised with the RHCC. Each revision THA case was classified on two separate occasions by each observer, with a mean time between assessments of 42.6 days (24–57). Inter-observer reliability was assessed using the Fleiss' Kappa statistic and percentage agreement. Intra-observer reliability was assessed using the Cohen's Kappa statistic.
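The two agreement statistics named above can be illustrated with a minimal sketch. It assumes categorical ratings with no missing values; the category labels "A", "B" and "C" and the small data sets are hypothetical and are not RHCC data. Both functions divide by zero if every rating falls in a single category.

```python
from collections import Counter


def cohen_kappa(first, second):
    """Cohen's kappa for two sets of categorical ratings of the same cases
    (e.g. one observer on two occasions)."""
    n = len(first)
    observed = sum(a == b for a, b in zip(first, second)) / n
    c1, c2 = Counter(first), Counter(second)
    # Chance agreement from the marginal category frequencies.
    expected = sum(c1[k] * c2[k] for k in c1) / (n * n)
    return (observed - expected) / (1 - expected)


def fleiss_kappa(ratings):
    """Fleiss' kappa; ratings[i] lists the category each rater gave case i.
    Every case must be rated by the same number of raters."""
    n_cases = len(ratings)
    n_raters = len(ratings[0])
    totals = Counter()
    agreement = 0.0
    for case in ratings:
        counts = Counter(case)
        totals.update(counts)
        # Proportion of agreeing rater pairs for this case.
        agreement += (sum(c * c for c in counts.values()) - n_raters) / (
            n_raters * (n_raters - 1)
        )
    p_bar = agreement / n_cases
    p_e = sum((t / (n_cases * n_raters)) ** 2 for t in totals.values())
    return (p_bar - p_e) / (1 - p_e)


# Hypothetical example: one observer, four cases, two occasions.
occasion_1 = ["A", "B", "C", "A"]
occasion_2 = ["A", "B", "C", "B"]
print(cohen_kappa(occasion_1, occasion_2))

# Hypothetical example: three observers rating two cases.
print(fleiss_kappa([["A", "A", "B"], ["A", "B", "B"]]))
```

Cohen's kappa compares two sets of ratings, which suits the test-retest comparison for one observer, while Fleiss' kappa handles any fixed number of raters per case.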
Introduction. There are concerns that patient-reported outcome measures (PROMs) currently used for adults requiring, undergoing or after undergoing lower limb reconstruction (LLR) are not adequately capturing the range of experiences important to these patients. The ‘Patient-Reported Outcome Measure for Lower Limb Reconstruction’ (PROLLIT) study developed a conceptual framework of outcomes identified as important and relevant by adult LLR patients. This review explored whether existing PROMs address these outcomes, and exhibit content validity in this population. Materials & Methods. A range of key PROMs was selected (n=32). Systematic and hand-searches were employed to find studies assessing content validity of these PROMs in the adult LLR population, along with PROM content and development information. A systematic review of content validity of the measures was carried out following ‘COnsensus-based Standards for the selection of health Measurement Instruments’ (COSMIN) guidance, alongside conceptual mapping of the content of the PROMs against the PROLLIT conceptual framework.
Aims. The Manchester-Oxford Foot Questionnaire (MOxFQ) is an anatomically specific patient-reported outcome measure (PROM) currently used to assess a wide variety of foot and ankle pathology. It consists of 16 items across three subscales measuring distinct but related traits: walking/standing ability, pain, and social interaction. It is the most used foot and ankle PROM in the UK. Initial MOxFQ validation involved analysis of 100 individuals undergoing hallux valgus surgery. This project aimed to establish whether an individual’s response to the MOxFQ varies with anatomical region of disease (measurement invariance), and to explore structural validity of the factor structure (subscale items) of the MOxFQ. Methods. This was a single-centre, prospective cohort study involving 6,637 patients (mean age 52 years (SD 17.79)) presenting with a wide range of foot and ankle pathologies between January 2013 and December 2021. To assess whether the MOxFQ responses vary by anatomical region of foot and ankle disease, we performed multigroup confirmatory factor analysis. To assess the structural validity of the subscale items, exploratory and confirmatory factor analyses were performed.
Aims. Children with spinal dysraphism can develop various musculoskeletal deformities, necessitating a range of orthopaedic interventions, causing significant morbidity, and making considerable demands on resources. This systematic review aimed to identify what outcome measures have been reported in the literature for children with spinal dysraphism who undergo orthopaedic interventions involving the lower limbs. Methods. A PROSPERO-registered systematic literature review was performed following PRISMA guidelines. All relevant studies published until January 2023 were identified. Individual outcomes and outcome measurement tools were extracted verbatim. The measurement tools were assessed for reliability and validity, and all outcomes were grouped according to the Outcome Measures Recommended for use in Randomized Clinical Trials (OMERACT) filters.
Aims. The aim of this study was to evaluate the reliability and validity of a patient-specific algorithm which we developed for predicting changes in sagittal pelvic tilt after total hip arthroplasty (THA). Methods. This retrospective study included 143 patients who underwent 171 THAs between April 2019 and October 2020 and had full-body lateral radiographs preoperatively and at one year postoperatively. We measured the pelvic incidence (PI), the sagittal vertical axis (SVA), pelvic tilt, sacral slope (SS), lumbar lordosis (LL), and thoracic kyphosis to classify patients into types A, B1, B2, B3, and C. The change of pelvic tilt was predicted according to the normal range of SVA (0 mm to 50 mm) for types A, B1, B2, and B3, and based on the absolute value of one-third of the PI-LL mismatch for type C patients. The reliability of the classification of the patients and the prediction of the change of pelvic tilt were assessed using kappa values and intraclass correlation coefficients (ICCs), respectively.
Technological advancements in orthopaedic surgery have mainly focused on increasing precision during the operation; however, there have been few developments in post-operative physiotherapy. We have developed a computer vision program using machine learning that can virtually measure the range of movement of a joint to track progress after surgery. These data can be used by physiotherapists to adjust patients’ exercise regimes more objectively and to help patients visualise the progress that they have made. In this study, we tested our program's reliability and validity to find a benchmark for future use on patients. We compared 150 shoulder joint angles measured using a goniometer with those calculated by our program, called ArmTracking, in a group of 10 participants (5 males and 5 females). Reliability was tested using adjusted R squared and validity was tested using 95% limits of agreement. Our clinically acceptable limit of agreement was ± 10° for ArmTracking to be used interchangeably with goniometry. ArmTracking showed excellent overall reliability of 97.1% when all shoulder movements were combined, but there were lower scores for some movements, such as shoulder extension at 75.8%. There was moderate validity when all shoulder movements were combined, with limits of agreement of 9.6° overestimation and 18.3° underestimation. Computer vision programs have great potential to be used in telerehabilitation to collect useful information as patients carry out prescribed exercises at home. However, they need to be trained well for precise joint detection to reduce the range of errors in readings.
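The 95% limits of agreement used for validity can be sketched as follows. This is a minimal Bland–Altman-style calculation, assuming paired readings and approximately normally distributed differences; the function names and the four example angles are hypothetical and are not the study's data.

```python
from statistics import mean, stdev


def limits_of_agreement(reference, test):
    """Return (lower limit, bias, upper limit) for test minus reference.
    The limits are bias +/- 1.96 sample standard deviations of the differences."""
    diffs = [t - r for r, t in zip(reference, test)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias - spread, bias, bias + spread


def interchangeable(lower, upper, tolerance=10.0):
    """True when both limits fall inside the clinically acceptable band."""
    return lower >= -tolerance and upper <= tolerance


# Hypothetical paired shoulder angles in degrees.
goniometer = [90.0, 120.0, 45.0, 160.0]
program = [92.0, 118.0, 47.0, 158.0]

lower, bias, upper = limits_of_agreement(goniometer, program)
print(lower, bias, upper, interchangeable(lower, upper))
```

Applied to the limits reported above, `interchangeable(-18.3, 9.6)` is false, because the underestimation limit lies outside the ± 10° band.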
Aims. The principles of evidence-based medicine (EBM) are the foundation of modern medical practice. Surgeons are familiar with the commonly used statistical techniques to test hypotheses, summarize findings, and provide answers within a specified range of probability. Based on this knowledge, they are able to critically evaluate research before deciding whether or not to adopt the findings into practice. Recently, there has been an increased use of artificial intelligence (AI) to analyze information and derive findings in orthopaedic research. These techniques use a set of statistical tools that are increasingly complex and may be unfamiliar to the orthopaedic surgeon. It is unclear if this shift towards less familiar techniques is widely accepted in the orthopaedic community. This study aimed to provide an exploration of understanding and acceptance of AI use in research among orthopaedic surgeons. Methods. Semi-structured in-depth interviews were carried out on a sample of 12 orthopaedic surgeons. Inductive thematic analysis was used to identify key themes.
Aims. The purpose of this study was to assess the reliability and responsiveness to hip surgery of a four-point modified Care and Comfort Hypertonicity Questionnaire (mCCHQ) scoring tool in children with cerebral palsy (CP) in Gross Motor Function Classification System (GMFCS) levels IV and V. Methods. This was a population-based cohort study in children with CP from a national surveillance programme. Reliability was assessed from 20 caregivers who completed the mCCHQ questionnaire on two occasions three weeks apart. Test-retest reliability of the mCCHQ was calculated, and responsiveness before and after surgery for a displaced hip was evaluated in a cohort of children.
Introduction. Challenges in surgical training have led to the exploration of technologies such as augmented reality (AR), which present novel approaches to teaching orthopaedic procedures to medical students. The aim of this double-blinded randomised controlled trial was to compare the validity and training effect of AR to traditional teaching on medical students’ understanding of total knee arthroplasty (TKA). Methodology. Twenty medical students from 7 UK universities were randomised equally to either intervention or control groups. The control group received a consultant-led teaching session and the intervention group received training via Microsoft HoloLens, where surgeons were able to project virtual information over physical objects. Participants completed written knowledge and practical exams which were assessed by 2 orthopaedic consultants. Training superiority was established via 4 quantitative outcome measures: OSATS scores, a checklist of TKA-specific steps, procedural time, and written exam scores. Qualitative feedback was evaluated using a 5-point Likert scale.
Aims. Reimers migration percentage (MP) is a key measure to inform decision-making around the management of hip displacement in cerebral palsy (CP). The aim of this study is to assess validity and inter- and intra-rater reliability of a novel method of measuring MP using a smartphone app (HipScreen (HS) app). Methods. A total of 20 pelvis radiographs (40 hips) were used to measure MP by using the HS app. Measurements were performed by five different members of the multidisciplinary team, with varying levels of expertise in MP measurement. The same measurements were repeated two weeks later. A senior orthopaedic surgeon measured the MP on picture archiving and communication system (PACS) as the gold standard and repeated the measurements using the HS app. Pearson’s correlation coefficient (r) was used to compare PACS measurements and all HS app measurements and assess validity. Intraclass correlation coefficient (ICC) was used to assess intra- and inter-rater reliability.
Aims. We aimed to assess the reliability and validity of OpenPose, a posture estimation algorithm, for measurement of knee range of motion after total knee arthroplasty (TKA), in comparison to radiography and goniometry. Methods. In this prospective observational study, we analyzed 35 primary TKAs (24 patients) for knee osteoarthritis. We measured the knee angles in flexion and extension using OpenPose, radiography, and goniometry. We assessed the test-retest reliability of each method using intraclass correlation coefficient (1,1). We evaluated the ability to estimate other measurement values from the OpenPose value using linear regression analysis. We used intraclass correlation coefficients (2,1) and Bland–Altman analyses to evaluate the agreement and error between radiography and the other measurements.
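The two intraclass correlation forms mentioned, ICC(1,1) and ICC(2,1), can be computed from ANOVA mean squares as in the minimal sketch below. It assumes a complete table with one row per subject and one column per measurement; the function name is hypothetical, and the example table is the standard six-subject, four-rater teaching data set from the ICC literature, not data from this study.

```python
def icc_single_measure(scores):
    """Return (ICC(1,1), ICC(2,1)) for a complete subjects x measurements table."""
    n = len(scores)        # subjects
    k = len(scores[0])     # measurements (raters or sessions) per subject
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)

    ms_rows = ss_rows / (n - 1)                            # between subjects
    ms_cols = ss_cols / (k - 1)                            # between measurements
    ms_within = (ss_total - ss_rows) / (n * (k - 1))       # within subjects
    ms_error = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))

    # One-way random effects, single measure.
    icc_1_1 = (ms_rows - ms_within) / (ms_rows + (k - 1) * ms_within)
    # Two-way random effects, absolute agreement, single measure.
    icc_2_1 = (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )
    return icc_1_1, icc_2_1


example = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
print(icc_single_measure(example))
```

ICC(1,1) treats all disagreement within a subject as error, whereas ICC(2,1) separates out systematic differences between measurement methods or raters, which is why the latter is the usual choice for agreement between different methods.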
For clinical movement analysis, optical marker-based motion capture is the gold standard. With the advancement of AI-driven computer vision, markerless motion capture (MMC) has emerged.
The use of inertial sensors for remote monitoring of patients at home is increasing. One of the most important characteristics of these sensors is the sampling rate. A higher sampling rate results in higher resolution of the sampled signal and less noise. However, a higher sampling frequency comes at a cost. The main aim of our study was to determine the validity of measurements performed by low sampling frequency (12.5 Hz) accelerometers (SENS) in patients with knee osteoarthritis compared to a standard sensor-based motion capture system (Xsens). We also determined the test-retest reliability of SENS accelerometers. Participants were patients with unilateral knee osteoarthritis. Gait analysis was performed simultaneously using Xsens and SENS sensors during two repetitions of over-ground walking at a self-selected speed. Gait data from Xsens were used as an input for AnyBody musculoskeletal modeling software to measure the accelerations at the exact location of two defined virtual sensors in the model (VirtualSENS). After preprocessing, the signals from SENS and VirtualSENS were compared in different coordinate axes in the time and frequency domains. ICCs for SENS data from the first and second trials were calculated to assess the repeatability of the measurements. We included 32 patients (18 females) with median age 70.1 years [48.1–85.4]. Mean height and weight of the patients were 173.2 ± 9.6 cm and 84.2 ± 14.7 kg respectively. The correlation between accelerations in the time domain measured by SENS and VirtualSENS in different axes was r = 0.94 in the y-axis (anteroposterior), r = 0.91 in the x-axis (vertical), r = 0.83 in the z-axis (mediolateral), and r = 0.89 for the magnitude vector. In the frequency domain, the value and the power of the fundamental frequencies (F0) of the SENS and VirtualSENS signals demonstrated strong correlation (r = 0.98 and r = 0.99 respectively).
The test-retest evaluation showed excellent repeatability for acceleration measurement by SENS sensors. ICC ranged from 0.89 to 0.94 for the different coordinate axes. Low sampling frequency accelerometers can provide valid and reliable measurements, especially for home monitoring of patients, in which handling big data, sensor cost and battery lifetime are important issues.
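A frequency-domain comparison of this kind needs an estimate of the fundamental frequency and its power for each signal. The abstract does not say how F0 was extracted, so the minimal sketch below makes an assumption: F0 is taken as the largest non-zero peak of a plain discrete Fourier transform. The function name and the synthetic 1.75 Hz test signal are hypothetical.

```python
import cmath
import math


def fundamental_frequency(signal, sampling_rate):
    """Return (frequency in Hz, power) of the dominant non-zero DFT bin.
    A direct O(n^2) transform, adequate for short low-rate recordings."""
    n = len(signal)
    offset = sum(signal) / n
    centred = [x - offset for x in signal]  # remove the constant (gravity) term
    best_bin, best_power = 0, -1.0
    for k in range(1, n // 2 + 1):          # bins up to the Nyquist frequency
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(centred)
        )
        power = abs(coeff) ** 2
        if power > best_power:
            best_bin, best_power = k, power
    return best_bin * sampling_rate / n, best_power


# Synthetic 8-second signal sampled at 12.5 Hz with a 1.75 Hz oscillation.
rate = 12.5
samples = [math.sin(2 * math.pi * 1.75 * i / rate) for i in range(100)]
print(fundamental_frequency(samples, rate))
```

At 12.5 Hz the highest resolvable frequency is 6.25 Hz, so this approach only captures the low-frequency content of a signal.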
The Severity Scoring System (SSS) is a guide to interpreting clinical, functional, and radiological findings, used by qualified, specially trained physiotherapists in the advanced practice role in order to provide consistency in determining the severity of the patient's condition and need for surgical consultation. The system has been utilized for over 14 years as a part of standardized assessment and management care and was incorporated into virtual care in 2020 following the pandemic restrictions. The present study examined the validity of the modified SSS in virtual care. Patients who were referred to the Rapid Access Clinic (RAC) were contacted via phone by two experienced advanced practice practitioners (APPs) from May to July 2020, when in-person care was halted due to the pandemic. The virtual interview included taking a history, completing self-reported measures for pain and functional ability, and reviewing the radiological reports. A total of 63 patients were interviewed (mean age 68, SD=9), of whom 34 (54%) were female. Of the 63 patients, 33 (52%) were considered candidates for total knee arthroplasty (TKA). Men and women were comparable in age, P4 and LEFS scores. The TKA candidates had significantly higher SSS (p<0.0001) and pain scores (p=0.024). The proportions of variability in the total SSS score explained by the functional, clinical and radiological components of the tool were 55%, 48% and 4% respectively, highlighting the more important role of the patient's clinical history and disability in the total SSS. The virtual SSS is a valid tool in directing patients for surgical management when used by highly trained advanced practice physiotherapists. A large component of the SSS is based on clinical data, patient disability and the APP's skillset rather than severity of pathology found on imaging.
Aims. The metabolic equivalent of task (MET) score examines patient performance in relation to energy expenditure before and after knee arthroplasty. This study assesses its use in a knee arthroplasty population in comparison with the widely used Oxford Knee Score (OKS) and EuroQol five-dimension index (EQ-5D), which are reported to be limited by ceiling effects. Methods. A total of 116 patients with OKS, EQ-5D, and MET scores before, and at least six months following, unilateral primary knee arthroplasty were identified from a database. Procedures were performed by a single surgeon between 2014 and 2019 consecutively. Scores were analyzed for normality, skewness, kurtosis, and the presence of ceiling/floor effects. Concurrent validity between the MET score, OKS, and EQ-5D was assessed using Spearman’s rank.
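Checking a score for ceiling and floor effects can be sketched as below. The abstract does not state the criterion used, so the minimal example assumes a commonly quoted rule of thumb: an effect is flagged when more than 15% of patients record the best or worst possible score. The function name and the ten example scores are hypothetical; the 0 to 48 range is that of the OKS.

```python
def ceiling_floor(scores, minimum, maximum, threshold=0.15):
    """Proportion of scores at each extreme and whether each exceeds the threshold.
    The 15% default is an assumed convention, not a value from the study."""
    n = len(scores)
    floor = sum(s == minimum for s in scores) / n
    ceiling = sum(s == maximum for s in scores) / n
    return {
        "floor": floor,
        "ceiling": ceiling,
        "floor_effect": floor > threshold,
        "ceiling_effect": ceiling > threshold,
    }


# Hypothetical postoperative OKS values (0 = worst, 48 = best).
oks = [48, 48, 47, 40, 35, 30, 48, 22, 45, 41]
print(ceiling_floor(oks, minimum=0, maximum=48))
```

A score with a ceiling effect cannot distinguish between patients who are doing well and those who are doing very well, which is the limitation the MET score is being tested against.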
Augmented reality simulators offer opportunities for practice of orthopaedic procedures outside of theatre environments. We developed an augmented reality simulator that allows trainees to practice pinning of paediatric supracondylar humeral fractures (SCHF) in a radiation-free environment at no extra risk to patients. The simulator is composed of a tangible child's elbow model, and simulated fluoroscopy on a tablet device. The treatment of these fractures is likely one of the first procedures involving X-ray guided wire insertion that trainee orthopaedic surgeons will encounter. This study aims to examine the extent of improvement simulator training provides to real-world operating theatre performance. This multi-centre study will involve four cohorts of New Zealand orthopaedic trainees in their SET1 year. Trainees with no simulator exposure in 2019–2021 will form the comparator cohort. Trainees in 2022 will receive additional, regular simulator training as the intervention cohort. The comparator cohort's performance in paediatric SCHF surgery will be retrospectively audited using routinely collected operative outcomes and parameters over a six-month period. The performance of the intervention cohort will be collected in the same way over a comparable period. The data collected for both groups will be used to examine whether additional training with an augmented reality simulator shows improved real-world surgical outcomes compared to traditional surgical training. This protocol has been approved by the University of Otago Health Ethics committee, and the study is due for completion in 2024. This study is the first nation-wide transfer validity study of a surgical simulator in New Zealand. As of September 2022, all trainees in the intervention cohort have been recruited along with eight retrospective trainees via email. We present this protocol to maintain transparency of the prespecified research plans and ensure robust scientific methods. This protocol may also assist other researchers conducting similar studies within small populations.
We test the clinical validity and financial implications of the proposed Choosing Wisely statement: “Using ultrasound as a screening test for shoulder instability is inappropriate in people under 30 years of age, unless there is clinical suspicion of a rotator cuff tear.” A retrospective chart review from a specialist shoulder surgeon's practice over a two-year period recorded 124 patients under the age of 30 referred with shoulder instability. Of these, 41 had already had ultrasound scans performed prior to specialist review. The scan results and patient files were reviewed to determine the reported findings on the scans and whether these findings were clinically relevant to diagnosis and decision-making. Comparison was made with subsequent MRI scan results. Data obtained from the Accident Compensation Corporation (ACC) recorded the number of cases and costs incurred for ultrasound scans of the shoulder in patients under 30 years old over a 10-year period. There were no cases where the ultrasound scan was considered useful in decision-making. No patient had a full thickness rotator cuff tear. Thirty-nine of the 41 patients subsequently had MRI scans. The cost to the ACC for funding ultrasound scans in patients under 30 has increased over the last decade and exceeded one million dollars in the 2020/2021 financial year. In addition, patients pay a surcharge for this test. The proposed Choosing Wisely statement is valid. This evidence supports that ultrasound is an unnecessary investigation for patients with shoulder instability unless there is clinical suspicion of a rotator cuff tear. Ultrasound also incurs costs to the insurer (ACC) and the patient. We recommend x-rays and, if further imaging is indicated, High Tech Imaging with MRI and sometimes CT scans in these patients.
External validation of machine learning predictive models is achieved through evaluation of model performance on different groups of patients than were used for algorithm development. This important step is uncommonly performed, inhibiting clinical translation of newly developed models. Recently, machine learning was used to develop a tool that can quantify revision risk for a patient undergoing primary anterior cruciate ligament (ACL) reconstruction (https://swastvedt.shinyapps.io/calculator_rev/). The source of data included nearly 25,000 patients with primary ACL reconstruction recorded in the Norwegian Knee Ligament Register (NKLR). The result was a well-calibrated tool capable of predicting revision risk one, two, and five years after primary ACL reconstruction with moderate accuracy. The purpose of this study was to determine the external validity of the NKLR model by assessing algorithm performance when applied to patients from the Danish Knee Ligament Registry (DKLR). The primary outcome measure of the NKLR model was probability of revision ACL reconstruction within 1, 2, and/or 5 years. For the index study, 24 predictor variables in the NKLR were included, and the models eliminated variables that did not significantly improve predictive ability, without sacrificing accuracy. The result was a well-calibrated algorithm developed using the Cox Lasso model that only required five variables (out of the original 24) for outcome prediction. For this external validation study, all DKLR patients with complete data for the five variables required for NKLR prediction were included. The five variables were: graft choice, femur fixation device, Knee Injury and Osteoarthritis Outcome Score (KOOS) Quality of Life subscale score at surgery, years from injury to surgery, and age at surgery. Predicted revision probabilities were calculated for all DKLR patients. The model performance was assessed using the same metrics as the NKLR study: concordance and calibration.
In total, 10,922 DKLR patients were included for analysis. Average follow-up time or time-to-revision was 8.4 (±4.3) years and overall revision rate was 6.9%. Surgical technique trends (i.e., graft choice and fixation devices) and injury characteristics (i.e., concomitant meniscus and cartilage pathology) were dissimilar between registries. The model produced similar concordance when applied to the DKLR population compared to the original NKLR test data (DKLR: 0.68; NKLR: 0.68-0.69). Calibration was poorer for the DKLR population at one and five years post primary surgery but similar to the NKLR at two years. The NKLR machine learning algorithm demonstrated similar performance when applied to patients from the DKLR, suggesting that it is valid for application outside of the initial patient population. This represents the first machine learning model for predicting revision ACL reconstruction that has been externally validated. Clinicians can use this in-clinic calculator to estimate revision risk at a patient-specific level when discussing outcome expectations pre-operatively. While encouraging, it should be noted that the performance of the model on patients undergoing ACL reconstruction outside of Scandinavia remains unknown.
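Concordance for a time-to-event model such as this one can be illustrated with a minimal sketch of a Harrell-style concordance index. This is an assumption about the general form of the statistic, not the study's own implementation; the function name and the four-patient example are hypothetical, and pairs with tied follow-up times are simply skipped.

```python
def concordance_index(times, events, risks):
    """Share of comparable patient pairs in which the patient revised earlier
    was given the higher predicted risk. Tied risks count as half."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier event of a pair
        for j in range(n):
            if times[j] > times[i]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    return concordant / comparable


# Hypothetical data: years to revision or last follow-up, revision flag,
# and predicted revision probability.
years = [2, 4, 6, 8]
revised = [1, 1, 0, 1]
predicted = [0.9, 0.6, 0.7, 0.1]
print(concordance_index(years, revised, predicted))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which puts the reported 0.68 in the moderate range. Concordance measures only how well patients are ranked, so calibration has to be checked separately.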
Literature surrounding artificial intelligence (AI)-related applications for hip and knee arthroplasty has proliferated. However, meaningful advances that fundamentally transform the practice and delivery of joint arthroplasty are yet to be realized, despite the broad range of applications as we continue to search for meaningful and appropriate use of AI. AI literature in hip and knee arthroplasty between 2018 and 2021 regarding image-based analyses, value-based care, remote patient monitoring, and augmented reality was reviewed. Concerns surrounding meaningful use and appropriate methodological approaches of AI in joint arthroplasty research are summarized. Of the 233 AI-related orthopaedics articles published, 178 (76%) constituted original research, while the rest consisted of editorials or reviews. A total of 52% of the original AI-related research concerned hip and knee arthroplasty (n = 92), and a narrative review of these studies is described. Three studies were externally validated. Pitfalls surrounding present-day research include conflating vernacular (“AI/machine learning”), repackaging limited registry data, prematurely releasing internally validated prediction models, appraising model architecture instead of inputted data, withholding code, and evaluating studies using antiquated regression-based guidelines. While AI has been applied to a variety of hip and knee arthroplasty applications with limited clinical impact, the future remains promising if the question is meaningful, the methodology is rigorous and transparent, the data are rich, and the model is externally validated. Simple checkpoints for meaningful AI adoption include ensuring applications focus on: administrative support over clinical evaluation and management; necessity of the advanced model; and the novelty of the question being answered.