The Revision Hip Complexity Classification (RHCC) was developed through a modified Delphi process in 2022 to provide a comprehensive, reproducible framework for the multidisciplinary discussion of complex revision hip surgery. The aim of this study was to assess the validity and the intra-rater and inter-rater reliability of the RHCC. Radiographs and clinical vignettes of 20 consecutive patients who had undergone revision of Total Hip Arthroplasty (THA) at our unit during the previous 12-month period were provided to observers. Five observers, comprising 3 revision hip consultants, 1 hip fellow and 1 ST3-8 registrar, were familiarised with the RHCC. Each revision THA case was classified on two separate occasions by each observer, with a mean time between assessments of 42.6 days (24–57). Inter-observer reliability was assessed using Fleiss' Kappa statistic and percentage agreement. Intra-observer reliability was assessed using Cohen's Kappa statistic.
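For readers unfamiliar with the intra-observer statistic named in this abstract, the following is a minimal sketch of Cohen's Kappa for one observer classifying the same cases on two occasions. The category labels ("A", "B", "C") and the ratings are hypothetical and are not RHCC grades or study data; the function name `cohen_kappa` is likewise only illustrative.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two sets of categorical ratings of the same cases."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed agreement: share of cases given the same category both times.
    p_observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement, from the marginal category frequencies of each occasion.
    freq_a = Counter(ratings_a)
    freq_b = Counter(ratings_b)
    p_expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_expected == 1.0:
        return 1.0  # a single category was used throughout on both occasions
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical classifications of 10 cases on two occasions (not study data).
first_read = ["A", "B", "B", "C", "A", "B", "C", "C", "A", "B"]
second_read = ["A", "B", "C", "C", "A", "B", "C", "B", "A", "B"]
kappa = cohen_kappa(first_read, second_read)
print(f"kappa = {kappa:.3f}")
```

Fleiss' Kappa extends the same idea (agreement corrected for chance) to more than two raters and is not shown here.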
The Severity Scoring System (SSS) is a guide to interpreting clinical, functional, and radiological findings, used by qualified, specially trained physiotherapists in the advanced practice role in order to provide consistency in determining the severity of the patient's condition and need for surgical consultation. The system has been utilized for over 14 years as a part of standardized assessment and management care and was incorporated into virtual care in 2020 following the pandemic restrictions. The present study examined the validity of the modified SSS in virtual care. Patients who were referred to the Rapid Access Clinic (RAC) were contacted via phone by two experienced advanced practice practitioners (APPs) from May to July 2020, when in-person care was halted due to the pandemic. The virtual interview included taking history, completing self-reported measures for pain and functional ability and reviewing the radiological reports. A total of 63 patients were interviewed (mean age 68, SD=9), 34 (54%) females. Of 63 patients, 33 (52%) were considered candidates for total knee arthroplasty (TKA). Men and women were comparable in age, P4 and LEFS scores. The TKA candidates had a significantly higher SSS (p<0.0001) and pain scores (p=0.024). The proportions of variability in the total SSS score explained by the functional, clinical and radiological components of the tool were 55%, 48% and 4% respectively, highlighting the more important role of the patient's clinical history and disability in the total SSS. The virtual SSS is a valid tool in directing patients for surgical management when used by highly trained advanced practice physiotherapists. A large component of the SSS is based on clinical data, patient disability and the APP's skillset rather than severity of pathology found on imaging.
External validation of machine learning predictive models is achieved through evaluation of model performance on different groups of patients than were used for algorithm development. This important step is uncommonly performed, inhibiting clinical translation of newly developed models. Recently, machine learning was used to develop a tool that can quantify revision risk for a patient undergoing primary anterior cruciate ligament (ACL) reconstruction (https://swastvedt.shinyapps.io/calculator_rev/). The source of data included nearly 25,000 patients with primary ACL reconstruction recorded in the Norwegian Knee Ligament Register (NKLR). The result was a well-calibrated tool capable of predicting revision risk one, two, and five years after primary ACL reconstruction with moderate accuracy. The purpose of this study was to determine the external validity of the NKLR model by assessing algorithm performance when applied to patients from the Danish Knee Ligament Registry (DKLR). The primary outcome measure of the NKLR model was probability of revision ACL reconstruction within 1, 2, and/or 5 years. For the index study, 24 total predictor variables in the NKLR were included and the models eliminated variables which did not significantly improve prediction ability - without sacrificing accuracy. The result was a well calibrated algorithm developed using the Cox Lasso model that only required five variables (out of the original 24) for outcome prediction. For this external validation study, all DKLR patients with complete data for the five variables required for NKLR prediction were included. The five variables were: graft choice, femur fixation device, Knee Injury and Osteoarthritis Outcome Score (KOOS) Quality of Life subscale score at surgery, years from injury to surgery, and age at surgery. Predicted revision probabilities were calculated for all DKLR patients. The model performance was assessed using the same metrics as the NKLR study: concordance and calibration. 
In total, 10,922 DKLR patients were included for analysis. Average follow-up time or time-to-revision was 8.4 (±4.3) years and overall revision rate was 6.9%. Surgical technique trends (i.e., graft choice and fixation devices) and injury characteristics (i.e., concomitant meniscus and cartilage pathology) were dissimilar between registries. The model produced similar concordance when applied to the DKLR population compared to the original NKLR test data (DKLR: 0.68; NKLR: 0.68-0.69). Calibration was poorer for the DKLR population at one and five years post primary surgery but similar to the NKLR at two years. The NKLR machine learning algorithm demonstrated similar performance when applied to patients from the DKLR, suggesting that it is valid for application outside of the initial patient population. This represents the first machine learning model for predicting revision ACL reconstruction that has been externally validated. Clinicians can use this in-clinic calculator to estimate revision risk at a patient specific level when discussing outcome expectations pre-operatively. While encouraging, it should be noted that the performance of the model on patients undergoing ACL reconstruction outside of Scandinavia remains unknown.
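The concordance figures reported in this abstract can be read as the proportion of comparable patient pairs in which the patient revised earlier was also assigned the higher predicted risk. The sketch below shows one common way of computing such a value (Harrell's C-index); it is an assumption that this matches the concordance measure used in the NKLR and DKLR analyses, and the follow-up times, revision indicators and risks are hypothetical.

```python
def concordance_index(times, events, risks):
    """Harrell's C: share of comparable pairs ranked correctly by predicted risk.

    times  -- follow-up time or time-to-revision for each patient
    events -- 1 if revision was observed, 0 if the patient was censored
    risks  -- predicted revision risk (higher means earlier revision expected)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i was revised strictly
            # before patient j's last recorded follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical data for four patients (not registry data).
follow_up_years = [1.0, 2.0, 3.0, 4.0]
revised = [1, 1, 0, 1]
predicted_risk = [0.9, 0.7, 0.5, 0.1]
c_index = concordance_index(follow_up_years, revised, predicted_risk)
print(f"C-index = {c_index:.2f}")
```

A value of 0.5 corresponds to ranking no better than chance and 1.0 to perfect ranking, which is the scale on which the reported 0.68 should be interpreted.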
We test the clinical validity and financial implications of the proposed Choosing Wisely statement: “Using ultrasound as a screening test for shoulder instability is inappropriate in people under 30 years of age, unless there is clinical suspicion of a rotator cuff tear.” A retrospective chart review from a specialist shoulder surgeon's practice over a two-year period recorded 124 patients under the age of 30 referred with shoulder instability. Of these, forty-one had already had ultrasound scans performed prior to specialist review. The scan results and patient files were reviewed to determine the reported findings on the scans and whether these findings were clinically relevant to diagnosis and decision-making. Comparison was made with subsequent MRI scan results. Data obtained from the Accident Compensation Corporation (ACC) recorded the number of cases and costs incurred for ultrasound scans of the shoulder in patients under 30 years old over a 10-year period. There were no cases where the ultrasound scan was considered useful in decision-making. No patient had a full thickness rotator cuff tear. Thirty-nine of the 41 patients subsequently had MRI scans. The cost to the ACC for funding ultrasound scans in patients under 30 has increased over the last decade and exceeded one million dollars in the 2020/2021 financial year. In addition, patients pay a surcharge for this test. The proposed Choosing Wisely statement is valid. This evidence supports the view that ultrasound is an unnecessary investigation for patients with shoulder instability unless there is clinical suspicion of a rotator cuff tear. Ultrasound also incurs costs to the insurer (ACC) and the patient. We recommend x-rays and, if further imaging is indicated, High Tech Imaging with MRI and sometimes CT scans in these patients.
Technological advancements in orthopaedic surgery have mainly focused on increasing precision during the operation; however, there have been few developments in post-operative physiotherapy. We have developed a computer vision program using machine learning that can virtually measure the range of movement of a joint to track progress after surgery. This data can be used by physiotherapists to adjust patients’ exercise regimes more objectively and to help patients visualise the progress that they have made. In this study, we tested our program's reliability and validity to find a benchmark for future use on patients. We compared 150 shoulder joint angles measured using a goniometer with those calculated by our program, called ArmTracking, in a group of 10 participants (5 males and 5 females). Reliability was tested using adjusted R squared and validity was tested using 95% limits of agreement. Our clinically acceptable limit of agreement was ± 10° for ArmTracking to be used interchangeably with goniometry. ArmTracking showed excellent overall reliability of 97.1% when all shoulder movements were combined, but there were lower scores for some movements, such as shoulder extension at 75.8%. There was moderate validity when all shoulder movements were combined, with limits of agreement of 9.6° overestimation and 18.3° underestimation. Computer vision programs have great potential to be used in telerehabilitation to collect useful information as patients carry out prescribed exercises at home. However, they need to be trained well for precise joint detection to reduce the range of errors in readings.
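The 95% limits of agreement used as the validity measure in this abstract can be sketched as follows, assuming the standard Bland-Altman definition (mean difference ± 1.96 standard deviations of the differences). The angle values and the helper names are hypothetical; only the ± 10° acceptance threshold comes from the abstract.

```python
from statistics import mean, stdev


def limits_of_agreement(reference, candidate):
    """Return (bias, lower limit, upper limit) of candidate minus reference."""
    if len(reference) != len(candidate) or len(reference) < 2:
        raise ValueError("need two equal-length series with at least 2 values")
    diffs = [c - r for r, c in zip(reference, candidate)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)  # sample standard deviation of differences
    return bias, bias - spread, bias + spread


def within_acceptable_limit(lower, upper, limit_deg=10.0):
    """True when both limits of agreement lie inside +/- limit_deg."""
    return -limit_deg <= lower and upper <= limit_deg


# Hypothetical shoulder angles in degrees (not study data).
goniometer_deg = [90, 120, 45, 150, 60]
program_deg = [92, 118, 47, 149, 63]
bias, lower, upper = limits_of_agreement(goniometer_deg, program_deg)
print(f"bias {bias:.1f} deg, limits {lower:.1f} to {upper:.1f} deg")
print("interchangeable:", within_acceptable_limit(lower, upper))
```

Under this definition, limits of 9.6° overestimation and 18.3° underestimation would fail the ± 10° criterion on the underestimation side.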
Augmented reality simulators offer opportunities for practice of orthopaedic procedures outside of theatre environments. We developed an augmented reality simulator that allows trainees to practice pinning of paediatric supracondylar humeral fractures (SCHF) in a radiation-free environment at no extra risk to patients. The simulator is composed of a tangible child's elbow model, and simulated fluoroscopy on a tablet device. The treatment of these fractures is likely one of the first procedures involving X-ray guided wire insertion that trainee orthopaedic surgeons will encounter. This study aims to examine the extent of improvement simulator training provides to real-world operating theatre performance. This multi-centre study will involve four cohorts of New Zealand orthopaedic trainees in their SET1 year. Trainees with no simulator exposure in 2019 - 2021 will form the comparator cohort. Trainees in 2022 will receive additional, regular simulator training as the intervention cohort. The comparator cohort's performance in paediatric SCHF surgery will be retrospectively audited using routinely collected operative outcomes and parameters over a six-month period. The performance of the intervention cohorts will be collected in the same way over a comparable period. The data collected for both groups will be used to examine whether additional training with an augmented reality simulator shows improved real-world surgical outcomes compared to traditional surgical training. This protocol has been approved by the University of Otago Health Ethics committee, and the study is due for completion in 2024. This study is the first nation-wide transfer validity study of a surgical simulator in New Zealand. As of September 2022, all trainees in the intervention cohort have been recruited along with eight retrospective trainees via email. We present this protocol to maintain transparency of the prespecified research plans and ensure robust scientific methods. 
This protocol may also assist other researchers conducting similar studies within small populations.
In recent years, there has been an increase in the use of self-administered questionnaires to accurately assess intervention outcomes in hand surgery and to determine the quality of healthcare. This study aims to evaluate whether the Manchester Modified Disabilities of the Arm, Shoulder and Hand (M2DASH) questionnaire is a valid, reliable, responsive, and unbiased outcome measure for carpal tunnel syndrome (CTS) compared to the Disability of Arm, Shoulder, and Hand (DASH) questionnaire, Boston questionnaire (BQ), and Nerve Conduction Studies (NCS). Method. 48 patients with CTS confirmed by NCS completed the M2DASH, original DASH, and the BQ at least twice at different time intervals. The scores obtained from M2DASH were compared and correlated with the DASH, BQ, and NCS to assess validity, reliability, responsiveness, and bias of the questionnaires. Results.
Mako robotic assisted knee arthroplasty requires a planning CT scan within 8 weeks of surgery according to the supplier's protocol. This is often impractical, therefore we evaluated whether CT scans remain valid for an extended period. Patients undergoing Mako partial (PKA) and total (TKA) knee arthroplasty were identified from our hospital database. The hospital PACS system was used to define the time interval between the initial planning CT scan and surgery, and whether further imaging was required prior to surgery.
Challenges in surgical training have led to the exploration of technologies such as augmented reality (AR), which present novel approaches to teaching orthopaedic procedures to medical students. The aim of this double-blinded randomised-controlled trial was to compare the validity and training effect of AR to traditional teaching on medical students’ understanding of total knee arthroplasty (TKA). Twenty medical students from 7 UK universities were randomised equally to either intervention or control groups. The control received a consultant-led teaching session and the intervention received training via Microsoft HoloLens, where surgeons were able to project virtual information over physical objects. Participants completed written knowledge and practical exams which were assessed by 2 orthopaedic consultants. Training superiority was established via 4 quantitative outcome measures: OSATS scores, a checklist of TKA-specific steps, procedural time, and written exam scores. Qualitative feedback was evaluated using a 5-point Likert scale.
There are concerns that patient-reported outcome measures (PROMs) currently used for adults requiring, undergoing or after undergoing lower limb reconstruction (LLR) are not adequately capturing the range of experiences important to these patients. The ‘Patient-Reported Outcome Measure for Lower Limb Reconstruction’ (PROLLIT) study developed a conceptual framework of outcomes identified as important and relevant by adult LLR patients. This review explored whether existing PROMs address these outcomes, and exhibit content validity in this population. A range of key PROMs was selected (n=32). Systematic and hand-searches were employed to find studies assessing content validity of these PROMs in the adult LLR population, along with PROM content and development information. A systematic review of content validity of the measures was carried out following ‘COnsensus-based Standards for the selection of health Measurement Instruments’ (COSMIN) guidance, alongside conceptual mapping of the content of the PROMs against the PROLLIT conceptual framework.
The tendency towards using inertial sensors for remote monitoring of patients at home is increasing. One of the most important characteristics of the sensors is sampling rate. A higher sampling rate results in higher resolution of the sampled signal and a lower amount of noise. However, a higher sampling frequency comes at a cost. The main aim of our study was to determine the validity of measurements performed by low sampling frequency (12.5 Hz) accelerometers (SENS) in patients with knee osteoarthritis compared to a standard sensor-based motion capture system (Xsens). We also determined the test-retest reliability of SENS accelerometers. Participants were patients with unilateral knee osteoarthritis. Gait analysis was performed simultaneously using Xsens and SENS sensors during two repetitions of over-ground walking at a self-selected speed. Gait data from Xsens were used as an input for AnyBody musculoskeletal modeling software to measure the accelerations at the exact location of two defined virtual sensors in the model (VirtualSENS). After preprocessing, the signals from SENS and VirtualSENS were compared in different coordinate axes in time and frequency domains. ICCs for SENS data from the first and second trials were calculated to assess the repeatability of the measurements. We included 32 patients (18 females) with median age 70.1 [48.1–85.4]. Mean height and weight of the patients were 173.2 ± 9.6 cm and 84.2 ± 14.7 kg respectively. The correlation between accelerations in the time domain measured by SENS and VirtualSENS in different axes was r = 0.94 in the y-axis (anteroposterior), r = 0.91 in the x-axis (vertical), r = 0.83 in the z-axis (mediolateral), and r = 0.89 for the magnitude vector. In the frequency domain, the value and the power of the fundamental frequencies (F0) of SENS and VirtualSENS signals demonstrated strong correlation (r = 0.98 and r = 0.99 respectively).
The test-retest evaluation showed excellent repeatability for acceleration measurement by SENS sensors: ICC ranged from 0.89 to 0.94 across the coordinate axes. Low sampling frequency accelerometers can provide valid and reliable measurements, especially for home monitoring of patients, in which handling big data, sensor cost and battery lifetime are among the important issues.
The clinical uptake of minimally invasive interventions for the intervertebral disc, such as nucleus augmentation, is currently hampered by the lack of robust pre-clinical testing methods that can take into account the large variation in the mechanical behaviour of the tissues. Using computational modelling to develop new interventions could be a way to test scenarios accounting for variation. However, such models need to have been validated for relevant mechanical function, e.g. compressive, torsional or flexional stiffness, and local disc deformations. The aim of this work was to use a novel in-vitro imaging method to assess the validity of computational models of the disc that employed different degrees of sophistication in the anatomical representation of the nucleus. Bovine caudal bone-disc-bone entities (N=6) were dissected and tested in uniaxial compression in a custom-made rig. Forty glass markers were placed on the surface of each disc. The specimens were scanned both with MRI and micro-CT before and during loading. Specimen-specific computational models were built from CT images to replicate the compression test. The anatomy of the nucleus was represented in three ways: assuming a standard diameter ratio, assuming a cylindrical shape with its volume matching that measured from MRI, and deriving the shape directly from MRI. The three types of models were calibrated for force-displacement. The radial displacements of the glass markers in the models were then compared with their experimental displacements derived from CT images. For a similar accuracy in modelling overall force-displacement, the mean error on the surface displacement was 35% for standard ratio nucleus, 38% for image-based cylindrical nucleus, and 32% for MRI-based nucleus geometry.
This work shows that, as long as consistency is kept to develop and calibrate image-based computational models, the complexity of the nucleus geometry does not influence the ability of a model to predict surface displacement in the intervertebral disc.
Reimers migration percentage (MP) is a key measure to inform decision-making around the management of hip displacement in cerebral palsy (CP). The aim of this study is to assess validity and inter- and intra-rater reliability of a novel method of measuring MP using a smart phone app (HipScreen (HS) app). A total of 20 pelvis radiographs (40 hips) were used to measure MP by using the HS app. Measurements were performed by five different members of the multidisciplinary team, with varying levels of expertise in MP measurement. The same measurements were repeated two weeks later. A senior orthopaedic surgeon measured the MP on picture archiving and communication system (PACS) as the gold standard and repeated the measurements using HS app. Pearson’s correlation coefficient (r) was used to compare PACS measurements and all HS app measurements and assess validity. Intraclass correlation coefficient (ICC) was used to assess intra- and inter-rater reliability.
Patients undergoing limb reconstruction surgery often face a challenging and often lengthy process to complete their treatment journey. The majority of existing outcome measures do not adequately capture the patient reported outcomes relevant to this patient group in a single measure. Following a previous systematic review, the Stanmore Limb Reconstruction Score (SLRS) was designed with the intent to address this need for an effective instrument to measure patient reported outcomes in limb reconstruction patients. The SLRS was designed following the use of structured interviews with a group of patients who have undergone limb reconstruction surgery, limb reconstruction surgeons, specialist nurses and physiotherapists. This has undergone further adjustment for language and clarity. The score was then trialled on 10 patients who have been through the process of limb reconstruction surgery, with subsequent structured questioning to understand the perceived suitability.
Objective evaluations of resident performance can be difficult to simulate. A novel competency based surgical OSCE was developed to evaluate surgical skill. The goal of this study was to test its construct validity by comparison with previously validated Ottawa scores (O-scores) and Orthopaedic in-training evaluation scores (OITE). An OSCE designed to simulate typical general orthopaedic surgical cases was developed to evaluate resident surgical performance. Post-graduate year (PGY) 3–5 trainees have an encounter (interview and physical exam) with a standardized patient and perform a correlating surgery on a cadaver. Examiners evaluate all components of the treatment plan and provide an overall score on the OSCE and also provide an O-score on overall surgical performance. Convergent and divergent validity was assessed by comparing OSCE scores to O-scores and OITE scores. SPSS was used for statistical analysis. ANOVA was used to compare PGY averages and Pearson correlation coefficients were calculated to compare OSCE versus O-score and OITE scores. A total of 96 simulated surgical cases were evaluated over a 3 year period for 24 trainees. There was a significant difference in OSCE scores based on year of training (PGY3: 6.06/15, PGY4: 8.16/15 and PGY5: 11.14/15, p < 0.001). OSCE and O-scores demonstrated a strong positive correlation of +0.89 while OSCE and OITE scores demonstrated a moderate positive correlation of +0.68. OSCE scores demonstrated strong convergent and moderate divergent correlation. A positive trajectory based on level of training and stronger correlations with established, validated scores support the construct validity of the novel surgical OSCE.
Hip arthroscopy is a rapidly expanding technique that has a steep learning curve. Simulation may have a role in helping trainees overcome this. However, there is as yet no validated hip arthroscopy simulator. This study aimed to test the construct validity of a virtual reality hip arthroscopy simulator. Nineteen orthopaedic surgeons performed a simulated arthroscopic examination of a healthy hip joint in the supine position. Surgeons were categorized as either expert (those who had performed 250 hip arthroscopies or more) or novice (those who had performed fewer than this). Twenty-one targets were visualized within the joint: nine via the anterior portal, nine via the anterolateral and three via the posterolateral. This was followed by a task testing basic probe examination of the joint in which a series of eight targets were probed via the anterolateral portal. Each surgeon's performance was evaluated by the simulator using a set of pre-defined metrics including task duration, number of soft tissue and bone collisions, and distance travelled by instruments. No repeat attempts at the tasks were permitted. Construct validity was then evaluated by comparing novice and expert group performance metrics over the two tasks using the Mann–Whitney test, with a p value of less than 0.05 considered significant. On the visualization task, the expert group outperformed the novice group on time taken (P=0.0003), number of collisions with soft tissue (P=0.001), number of collisions with bone (P=0.002) and distance travelled by the arthroscope (P=0.02). On the probe examination, the two groups differed only in the time taken to complete the task (P=0.025). Increased experience in hip arthroscopy was reflected by significantly better performance on the VR simulator across two tasks, supporting its construct validity. This study validates a virtual reality hip arthroscopy simulator and supports its potential for developing basic arthroscopic skills.
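The novice-versus-expert comparison described in this abstract relies on the Mann–Whitney test. The sketch below computes the U statistic by pair counting and a two-sided p value from the normal approximation, without tie or continuity correction; for groups as small as those in the example an exact test would normally be preferred. The task times and function names are hypothetical and are not simulator output.

```python
import math


def mann_whitney_u(group_x, group_y):
    """U for group_x: number of (x, y) pairs with x > y; ties count 0.5."""
    u_x = 0.0
    for x in group_x:
        for y in group_y:
            if x > y:
                u_x += 1.0
            elif x == y:
                u_x += 0.5
    return u_x


def two_sided_p_normal(u, n_x, n_y):
    """Two-sided p value from the large-sample normal approximation."""
    mu = n_x * n_y / 2
    sigma = math.sqrt(n_x * n_y * (n_x + n_y + 1) / 12)
    z = (u - mu) / sigma
    return math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))


# Hypothetical task durations in seconds (not study data).
novice_times = [310, 295, 402, 350, 288]
expert_times = [180, 210, 175, 240]
u_novice = mann_whitney_u(novice_times, expert_times)
p_value = two_sided_p_normal(u_novice, len(novice_times), len(expert_times))
print(f"U = {u_novice:.1f}, approximate p = {p_value:.4f}")
```

Because the test uses only the ordering of values, it suits simulator metrics such as collision counts and durations, which need not be normally distributed.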
To validate the Modified Forgotten Joint Score (MFJS) as a new patient-reported outcome measure (PROM) in hip and knee arthroplasty (THR/TKR) against the UK's gold standard Oxford Hip and Knee Scores (OHS/OKS). The MFJS is a new assessment tool devised to provide a greater discriminatory power, particularly in well performing patients. It measures an appealing concept: the ability of a patient to forget about their artificial joint in everyday life. Postal questionnaires were sent out to 400 THR and TKR patients who were 1–2 years post-op. The data collected from the 212 returned questionnaires was analysed in relation to construct and content validity. 77 patients took part in a test-retest repeatability assessment. The MFJS proved to have an increased discriminatory power in high-performing patients in comparison to the OHS and OKS, highlighted by its more normal frequency distribution and reduced ceiling effects. 30.8% of patients (n=131) achieved excellent OHS/OKS scores of 42–48, compared to just 7.69% of patients who achieved a proportionately equivalent MFJS score of 87.5–100. The MFJS proved to have an increased test-retest repeatability based upon its intra-class correlation coefficient of 0.97 compared to the Oxford's 0.85. The MFJS provides a more sensitive tool in the assessment of well performing hip and knee arthroplasties in comparison to the OHS/OKS. The MFJS tests the concept of awareness of a prosthetic joint, rather than pain and function, and therefore should be used as an adjunct to the OKS/OHS.
To validate the Modified Forgotten Joint Score (MFJS) as a new patient-reported outcome measure (PROM) in hip and knee arthroplasty against the UK's gold standard Oxford Hip and Knee Scores (OHS/OKS). The original Forgotten Joint Score was created by Behrend et al to assess post-op hip/knee arthroplasty patients. It is a new assessment tool devised to provide a greater discriminatory power, particularly in well performing patients. It measures an appealing concept: the ability of a patient to forget about their artificial joint in everyday life. The original FJS was a 12-item questionnaire, which we have modified to 10 items to improve reliability and reduce missing data. Postal questionnaires were sent out to 400 total hip/knee replacement (THR/TKR) patients who were 1–2 years post-op, along with the OHS/OKS and a visual analogue pain score. The data collected from the 212 returned questionnaires (53% return rate) was analysed in relation to construct and content validity. A sub-cohort of 77 patients took part in a test-retest repeatability study to assess reliability of the MFJS. The MFJS proved to have an increased discriminatory power in high-performing patients in comparison to the OHS and OKS, highlighted by its more normal frequency distribution and reduced ceiling effects. 30.8% of patients (n=131) scored 42–48 (equivalent to 87.5–100 in the MFJS) in the OKS, compared to just 7.69% in the MFJS among TKR patients. The MFJS proved to have an increased test-retest repeatability based upon its intra-class correlation coefficient of 0.968 compared to the Oxford's 0.845. The MFJS provides a more sensitive tool in the assessment of well performing hip and knee arthroplasties in comparison to the OHS/OKS. The MFJS tests the concept of awareness of a prosthetic joint, rather than pain and function, and therefore should be used as an adjunct to the OKS/OHS.
We aimed to assess the reliability and validity of OpenPose, a posture estimation algorithm, for measurement of knee range of motion after total knee arthroplasty (TKA), in comparison to radiography and goniometry. In this prospective observational study, we analyzed 35 primary TKAs (24 patients) for knee osteoarthritis. We measured the knee angles in flexion and extension using OpenPose, radiography, and goniometry. We assessed the test-retest reliability of each method using intraclass correlation coefficient (1,1). We evaluated the ability to estimate other measurement values from the OpenPose value using linear regression analysis. We used intraclass correlation coefficients (2,1) and Bland–Altman analyses to evaluate the agreement and error between radiography and the other measurements.
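The test-retest statistic named in the OpenPose abstract, intraclass correlation coefficient (1,1), can be sketched from its one-way random-effects definition: ICC(1,1) = (MSB − MSW) / (MSB + (k − 1) MSW), where MSB and MSW are the between-subject and within-subject mean squares and k is the number of repeated measurements per subject. The knee angles below are hypothetical, and the function name `icc_1_1` is only illustrative.

```python
def icc_1_1(measurements):
    """ICC(1,1) for a list of subjects, each with k repeated measurements."""
    n = len(measurements)
    k = len(measurements[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in measurements):
        raise ValueError("need >= 2 subjects, each with the same k >= 2 values")
    grand_mean = sum(sum(row) for row in measurements) / (n * k)
    row_means = [sum(row) / k for row in measurements]
    # Between-subject mean square (n - 1 degrees of freedom).
    ms_between = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    # Within-subject mean square (n * (k - 1) degrees of freedom).
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(measurements, row_means) for x in row
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical knee flexion angles in degrees, test and retest (not study data).
flexion_deg = [[120, 122], [95, 94], [130, 131], [110, 108]]
icc = icc_1_1(flexion_deg)
print(f"ICC(1,1) = {icc:.3f}")
```

ICC(2,1), used in the same abstract for agreement between methods, additionally separates out a systematic rater or method effect and is not shown here.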
The purpose of this study was to assess the reliability and responsiveness to hip surgery of a four-point modified Care and Comfort Hypertonicity Questionnaire (mCCHQ) scoring tool in children with cerebral palsy (CP) in Gross Motor Function Classification System (GMFCS) levels IV and V. This was a population-based cohort study in children with CP from a national surveillance programme. Reliability was assessed from 20 caregivers who completed the mCCHQ questionnaire on two occasions three weeks apart. Test-retest reliability of the mCCHQ was calculated, and responsiveness before and after surgery for a displaced hip was evaluated in a cohort of children.