Aims. Classifying trochlear dysplasia (TD) is useful to determine the treatment options for patients suffering from patellofemoral instability (PFI). There is no consensus on which classification system is more reliable and reproducible for the purpose of guiding clinicians’ management of PFI. There are also concerns about the validity of the Dejour Classification (DJC), which is the most widely used classification for TD, having only a fair reliability score. The Oswestry-Bristol Classification (OBC) is a recently proposed system of classification of TD, and the authors report a fair-to-good interobserver agreement and good-to-excellent intraobserver agreement in the assessment of TD. The aim of this study was to compare the reliability and reproducibility of these two classifications. Methods. In all, six assessors (four consultants and two registrars) independently evaluated 100 axial MRIs of the patellofemoral joint (PFJ) for TD and classified them according to OBC and DJC. These assessments were again repeated by all raters after four weeks. The inter- and intraobserver reliability scores were calculated using Cohen’s kappa and Cronbach’s α. Results. Both classifications showed good to excellent interobserver reliability with high α scores. The OBC classification showed a substantial intraobserver agreement (mean kappa 0.628; p < 0.005) whereas the DJC showed a moderate agreement (mean kappa 0.572; p < 0.005). There was no significant difference in the kappa values when comparing the assessments by consultants with those by registrars, in either classification system. Conclusion. This large study from a non-founding institute shows both classification systems to be reliable for classifying TD based on axial MRIs of the PFJ, with the simple-to-use OBC having a higher intraobserver reliability score than that of the DJC. Cite this article: Bone Jt Open 2023;4(7):532–538
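The intraobserver agreement reported above rests on Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The following is a minimal sketch of the unweighted statistic for two readings of the same scans; the abstract does not state whether a weighted variant was used, and the grade labels and data are hypothetical.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two equal-length lists of category labels."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(ratings_a)
    # Observed proportion of cases on which the two readings agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement from the marginal frequencies of each reading.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both readings used a single identical category
    return (observed - expected) / (1.0 - expected)


# Hypothetical grades for eight scans, read twice by the same assessor.
first_read = ["normal", "mild", "mild", "moderate", "severe", "mild", "normal", "severe"]
second_read = ["normal", "mild", "moderate", "moderate", "severe", "mild", "mild", "severe"]
print(round(cohen_kappa(first_read, second_read), 3))
```

A mean kappa across assessors, as quoted in the abstract, would simply average this value over each assessor's pair of readings.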


Bone & Joint Research
Vol. 12, Issue 5 | Pages 313 - 320
8 May 2023
Saiki Y Kabata T Ojima T Kajino Y Kubo N Tsuchiya H

Aims. We aimed to assess the reliability and validity of OpenPose, a posture estimation algorithm, for measurement of knee range of motion after total knee arthroplasty (TKA), in comparison to radiography and goniometry. Methods. In this prospective observational study, we analyzed 35 primary TKAs (24 patients) for knee osteoarthritis. We measured the knee angles in flexion and extension using OpenPose, radiography, and goniometry. We assessed the test-retest reliability of each method using intraclass correlation coefficient (1,1). We evaluated the ability to estimate other measurement values from the OpenPose value using linear regression analysis. We used intraclass correlation coefficients (2,1) and Bland–Altman analyses to evaluate the agreement and error between radiography and the other measurements. Results. OpenPose had excellent test-retest reliability (intraclass correlation coefficient (1,1) = 1.000). The R² of all regression models indicated large correlations (0.747 to 0.927). In the flexion position, the intraclass correlation coefficients (2,1) of OpenPose indicated excellent agreement (0.953) with radiography. In the extension position, the intraclass correlation coefficients (2,1) indicated good agreement of OpenPose and radiography (0.815) and moderate agreement of goniometry with radiography (0.593). OpenPose had no systematic error in the flexion position, and a 2.3° fixed error in the extension position, compared to radiography. Conclusion. OpenPose is a reliable and valid tool for measuring flexion and extension positions after TKA. It has better accuracy than goniometry, especially in the extension position. Accurate measurement values can be obtained with low error, high reproducibility, and no contact, independent of the examiner’s skills. Cite this article: Bone Joint Res 2023;12(5):313–320
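The "fixed error" mentioned above is what a Bland–Altman analysis calls the bias: the mean difference between a method and the reference. As a minimal sketch, assuming paired angle measurements in degrees and the conventional 1.96 SD limits of agreement (the function name and data are hypothetical, not taken from the study):

```python
from statistics import mean, stdev


def bland_altman(method, reference):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    if len(method) != len(reference) or len(method) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [m - r for m, r in zip(method, reference)]
    bias = mean(diffs)   # fixed (systematic) error
    spread = stdev(diffs)  # sample SD of the differences
    return bias, bias - 1.96 * spread, bias + 1.96 * spread


# Hypothetical extension angles (degrees): pose estimate vs radiograph.
pose_estimate = [2.0, 5.5, 1.0, 7.0, 3.5]
radiograph = [0.0, 3.0, -1.0, 4.0, 2.0]
bias, lower, upper = bland_altman(pose_estimate, radiograph)
print(f"bias {bias:.2f}, limits of agreement {lower:.2f} to {upper:.2f}")
```

A bias whose confidence interval excludes zero would indicate a fixed error of the kind reported for the extension position.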


Bone & Joint Research
Vol. 8, Issue 8 | Pages 357 - 366
1 Aug 2019
Zhang B Sun H Zhan Y He Q Zhu Y Wang Y Luo C

Objectives. CT-based three-column classification (TCC) has been widely used in the treatment of tibial plateau fractures (TPFs). In its updated version (updated three-column concept, uTCC), a fracture morphology-based injury mechanism was proposed for effective treatment guidance. In this study, the injury mechanism of TPFs is further explained, and its inter- and intraobserver reliability is evaluated to refine the uTCC. Methods. The radiological images of 90 consecutive TPF patients were collected. A total of 47 men (52.2%) and 43 women (47.8%) with a mean age of 49.8 years (SD 12.4; 17 to 77) were enrolled in our study. Among them, 57 fractures were on the left side (63.3%) and 33 were on the right side (36.7%); no bilateral fracture existed. Four observers were chosen to independently classify or estimate these randomized cases according to the Schatzker classification, TCC, and injury mechanism. With two rounds of evaluation, the kappa values were calculated to estimate the inter- and intraobserver reliability. Results. The overall inter- and intraobserver agreements of the injury mechanism were substantial (κ_inter = 0.699, κ_intra = 0.749, respectively). The initial position and the force direction, which are two components of the injury mechanism, had substantial agreement for both inter- and intraobserver reliability. The inter- and intraobserver agreements were lower in high-energy fractures (Schatzker types IV to VI; κ_inter = 0.605, κ_intra = 0.721) compared with low-energy fractures (Schatzker types I to III; κ_inter = 0.81, κ_intra = 0.832). The inter- and intraobserver agreements were relatively higher in one-column fractures (κ_inter = 0.759, κ_intra = 0.801) compared with two-column and three-column fractures. Conclusion. The complete theory of the injury mechanism of TPFs was put forward for the first time to complete the TCC. It demonstrates substantial inter- and intraobserver agreement generally. Furthermore, the injury mechanism can be promoted clinically. Cite this article: B-B. Zhang, H. Sun, Y. Zhan, Q-F. He, Y. Zhu, Y-K. Wang, C-F. Luo. Reliability and repeatability of tibial plateau fracture assessment with an injury mechanism-based concept. Bone Joint Res 2019;8:357–366. DOI: 10.1302/2046-3758.88.BJR-2018-0331.R1


Bone & Joint Research
Vol. 5, Issue 8 | Pages 347 - 352
1 Aug 2016
Nuttall J Evaniew N Thornley P Griffin A Deheshi B O’Shea T Wunder J Ferguson P Randall RL Turcotte R Schneider P McKay P Bhandari M Ghert M

Objectives. The diagnosis of surgical site infection following endoprosthetic reconstruction for bone tumours is frequently a subjective diagnosis. Large clinical trials use blinded Central Adjudication Committees (CACs) to minimise the variability and bias associated with assessing a clinical outcome. The aim of this study was to determine the level of inter-rater and intra-rater agreement in the diagnosis of surgical site infection in the context of a clinical trial. Materials and Methods. The Prophylactic Antibiotic Regimens in Tumour Surgery (PARITY) trial CAC adjudicated 29 non-PARITY cases of lower extremity endoprosthetic reconstruction. The CAC members classified each case according to the Centers for Disease Control (CDC) criteria for surgical site infection (superficial, deep, or organ space). Combinatorial analysis was used to calculate the smallest CAC panel size required to maximise agreement. A final meeting was held to establish a consensus. Results. Full or near consensus was reached in 20 of the 29 cases. The Fleiss kappa value was calculated as 0.44 (95% confidence interval (CI) 0.35 to 0.53), or moderate agreement. The greatest statistical agreement was observed in the outcome of no infection, 0.61 (95% CI 0.49 to 0.72, substantial agreement). Panelists reached a full consensus in 12 of 29 cases and near consensus in five of 29 cases when CDC criteria were used (superficial, deep or organ space). A stable maximum Fleiss kappa of 0.46 (95% CI 0.35 to 0.50) at CAC sizes greater than three members was obtained. Conclusions. There is substantial agreement among the members of the PARITY CAC regarding the presence or absence of surgical site infection. Agreement on the level of infection, however, is more challenging. Additional clinical information routinely collected by the prospective PARITY trial may improve the discriminatory capacity of the CAC in the parent study for the diagnosis of infection. Cite this article: J. Nuttall, N. Evaniew, P. Thornley, A. Griffin, B. Deheshi, T. O’Shea, J. Wunder, P. Ferguson, R. L. Randall, R. Turcotte, P. Schneider, P. McKay, M. Bhandari, M. Ghert. The inter-rater reliability of the diagnosis of surgical site infection in the context of a clinical trial. Bone Joint Res 2016;5:347–352. DOI: 10.1302/2046-3758.58.BJR-2016-0036.R1
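Fleiss' kappa, used above for the adjudication panel, extends chance-corrected agreement to more than two raters. The sketch below assumes every case is rated by the same number of panel members and takes a table of per-case category counts; the table layout, function name, and example counts are illustrative assumptions rather than the trial's data.

```python
def fleiss_kappa(table):
    """Fleiss' kappa.

    table[i][j] is the number of raters who assigned case i to category j;
    every row must sum to the same number of raters (at least two).
    """
    n_cases = len(table)
    n_raters = sum(table[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in table):
        raise ValueError("each case needs the same number (>= 2) of ratings")
    n_categories = len(table[0])
    # Overall share of all ratings that fell in each category.
    shares = [
        sum(row[j] for row in table) / (n_cases * n_raters)
        for j in range(n_categories)
    ]
    # Per-case agreement: proportion of rater pairs that agree.
    per_case = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in table
    ]
    observed = sum(per_case) / n_cases
    expected = sum(p * p for p in shares)
    if expected == 1.0:
        return 1.0  # every rating fell in a single category
    return (observed - expected) / (1.0 - expected)


# Hypothetical panel of five raters; columns: none, superficial, deep, organ space.
counts = [
    [5, 0, 0, 0],
    [1, 3, 1, 0],
    [0, 1, 4, 0],
    [4, 1, 0, 0],
]
print(round(fleiss_kappa(counts), 3))
```

Repeating the calculation over subsets of raters is one way a panel-size analysis like the one described could be approached, though the abstract does not give the exact procedure.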


Bone & Joint Open
Vol. 3, Issue 11 | Pages 913 - 920
18 Nov 2022
Dean BJF Berridge A Berkowitz Y Little C Sheehan W Riley N Costa M Sellon E

Aims. The evidence demonstrating the superiority of early MRI has led to increased use of MRI in clinical pathways for acute wrist trauma. The aim of this study was to describe the radiological characteristics and the inter-observer reliability of a new MRI based classification system for scaphoid injuries in a consecutive series of patients. Methods. We identified 80 consecutive patients with acute scaphoid injuries at one centre who had presented within four weeks of injury. The radiographs and MRI scans were assessed by four observers, two radiologists, and two hand surgeons, using both pre-existing classifications and a new MRI based classification tool, the Oxford Scaphoid MRI Assessment Rating Tool (OxSMART). The OxSMART was used to categorize scaphoid injuries into three grades: contusion (grade 1); unicortical fracture (grade 2); and complete bicortical fracture (grade 3). Results. In total there were 13 grade 1 injuries, 11 grade 2 injuries, and 56 grade 3 injuries in the 80 consecutive patients. The inter-observer reliability of the OxSMART was substantial (Kappa = 0.711). The inter-observer reliability of detecting an obvious fracture was moderate for radiographs (Kappa = 0.436) and MRI (Kappa = 0.543). Only 52% (29 of 56) of the grade 3 injuries were detected on plain radiographs. There were two complications of delayed union, both of which occurred in patients with grade 3 injuries, who were promptly treated with cast immobilization. There were no complications in the patients with grade 1 and 2 injuries and the majority of these patients were treated with early mobilization as pain allowed. Conclusion. This MRI based classification tool, the OxSMART, is reliable and clinically useful in managing patients with acute scaphoid injuries. Cite this article: Bone Jt Open 2022;3(11):913–920


Bone & Joint Open
Vol. 4, Issue 5 | Pages 363 - 369
22 May 2023
Amen J Perkins O Cadwgan J Cooke SJ Kafchitsas K Kokkinakis M

Aims. Reimers migration percentage (MP) is a key measure to inform decision-making around the management of hip displacement in cerebral palsy (CP). The aim of this study is to assess validity and inter- and intra-rater reliability of a novel method of measuring MP using a smart phone app (HipScreen (HS) app). Methods. A total of 20 pelvis radiographs (40 hips) were used to measure MP by using the HS app. Measurements were performed by five different members of the multidisciplinary team, with varying levels of expertise in MP measurement. The same measurements were repeated two weeks later. A senior orthopaedic surgeon measured the MP on picture archiving and communication system (PACS) as the gold standard and repeated the measurements using HS app. Pearson’s correlation coefficient (r) was used to compare PACS measurements and all HS app measurements and assess validity. Intraclass correlation coefficient (ICC) was used to assess intra- and inter-rater reliability. Results. All HS app measurements (from 5 raters at week 0 and week 2 and PACS rater) showed highly significant correlation with the PACS measurements (p < 0.001). Pearson’s correlation coefficient (r) was constantly over 0.9, suggesting high validity. Correlation of all HS app measures from different raters to each other was significant with r > 0.874 and p < 0.001, which also confirms high validity. Both inter- and intra-rater reliability were excellent with ICC > 0.9. In a 95% confidence interval for repeated measurements, the deviation of each specific measurement was less than 4% MP for single measurer and 5% for different measurers. Conclusion. The HS app provides a valid method to measure hip MP in CP, with excellent inter- and intra-rater reliability across different medical and allied health specialties. This can be used in hip surveillance programmes by interdisciplinary measurers. Cite this article: Bone Jt Open 2023;4(5):363–369


Bone & Joint Open
Vol. 5, Issue 6 | Pages 524 - 531
24 Jun 2024
Woldeyesus TA Gjertsen J Dalen I Meling T Behzadi M Harboe K Djuv A

Aims. To investigate if preoperative CT improves detection of unstable trochanteric hip fractures. Methods. A single-centre prospective study was conducted. Patients aged 65 years or older with trochanteric hip fractures admitted to Stavanger University Hospital (Stavanger, Norway) were consecutively included from September 2020 to January 2022. Radiographs and CT images of the fractures were obtained, and surgeons made individual assessments of the fractures based on these. The assessment was conducted according to a systematic protocol including three classification systems (AO/Orthopaedic Trauma Association (OTA), Evans Jensen (EVJ), and Nakano) and questions addressing specific fracture patterns. An expert group provided a gold-standard assessment based on the CT images. Sensitivities and specificities of surgeons’ assessments were estimated and compared in regression models with correlations for the same patients. Intra- and inter-rater reliability were presented as Cohen’s kappa and Gwet’s agreement coefficient (AC1). Results. We included 120 fractures in 119 patients. Compared to radiographs, CT increased the sensitivity of detecting unstable trochanteric fractures from 63% to 70% (p = 0.028) and from 70% to 76% (p = 0.004) using AO/OTA and EVJ, respectively. Compared to radiographs alone, CT increased the sensitivity of detecting a large posterolateral trochanter major fragment or a comminuted trochanter major fragment from 63% to 76% (p = 0.002) and from 38% to 55% (p < 0.001), respectively. CT improved intra-rater reliability for stability assessment using EVJ (AC1 0.68 to 0.78; p = 0.049) and for detecting a large posterolateral trochanter major fragment (AC1 0.42 to 0.57; p = 0.031). Conclusion. A preoperative CT of trochanteric fractures increased detection of unstable fractures using the AO/OTA and EVJ classification systems. Compared to radiographs, CT improved intra-rater reliability when assessing fracture stability and detecting large posterolateral trochanter major fragments. Cite this article: Bone Jt Open 2024;5(6):524–531
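Sensitivity and specificity against a gold standard, as reported above, reduce to counts in a two-by-two table. A minimal sketch, assuming each surgeon's call and the expert assessment are recorded as booleans (True meaning "unstable"); the function name and data are hypothetical, and the regression modelling for repeated patients described in the abstract is not reproduced here.

```python
def sensitivity_specificity(rated_unstable, truly_unstable):
    """Return (sensitivity, specificity) of ratings against a gold standard."""
    pairs = list(zip(rated_unstable, truly_unstable))
    tp = sum(1 for rated, truth in pairs if rated and truth)
    fn = sum(1 for rated, truth in pairs if not rated and truth)
    tn = sum(1 for rated, truth in pairs if not rated and not truth)
    fp = sum(1 for rated, truth in pairs if rated and not truth)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("gold standard must contain both stable and unstable cases")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical assessments of six fractures.
surgeon_call = [True, True, False, False, True, False]
expert_call = [True, False, True, False, True, False]
sens, spec = sensitivity_specificity(surgeon_call, expert_call)
print(f"sensitivity {sens:.2f}, specificity {spec:.2f}")
```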


Bone & Joint Research
Vol. 9, Issue 5 | Pages 242 - 249
1 May 2020
Bali K Smit K Ibrahim M Poitras S Wilkin G Galmiche R Belzile E Beaulé PE

Aims. The aim of the current study was to assess the reliability of the Ottawa classification for symptomatic acetabular dysplasia. Methods. In all, 134 consecutive hips that underwent periacetabular osteotomy were categorized using a validated software (Hip2Norm) into four categories of normal, lateral/global, anterior, or posterior. A total of 74 cases were selected for reliability analysis, and these included 44 dysplastic and 30 normal hips. A group of six blinded fellowship-trained raters, provided with the classification system, looked at these radiographs at two separate timepoints to classify the hips using standard radiological measurements. Thereafter, a consensus meeting was held where a modified flow diagram was devised, before a third reading by four raters using a separate set of 74 radiographs took place. Results. Intrarater results per surgeon between Time 1 and Time 2 showed substantial to almost perfect agreement among the raters (kappa = 0.416 to 0.873). With respect to inter-rater reliability, at Time 1 and Time 2 there was substantial agreement overall between all surgeons (Time 1 kappa = 0.619; Time 2 kappa = 0.623). Posterior and anterior rating categories had moderate and fair agreement at Time 1 (posterior kappa = 0.557; anterior kappa = 0.438) and Time 2 (posterior kappa = 0.506; anterior kappa = 0.250), respectively. At Time 3, overall reliability (kappa = 0.687) and posterior and anterior reliability (posterior kappa = 0.579; anterior kappa = 0.521) improved from Time 1 and Time 2. Conclusion. The Ottawa classification system provides a reliable way to identify three categories of acetabular dysplasia that are well-aligned with surgical management. The term ‘borderline dysplasia’ should no longer be used. Cite this article: Bone Joint Res. 2020;9(5):242–249


Bone & Joint Open
Vol. 1, Issue 7 | Pages 355 - 358
7 Jul 2020
Konrads C Gonser C Ahmad SS

Aims. The Oswestry-Bristol Classification (OBC) was recently described as an MRI-based classification tool for the femoral trochlear. The authors demonstrated better inter- and intraobserver agreement compared to the Dejour classification. As the OBC could potentially provide a very useful MRI-based grading system for trochlear dysplasia, it was the aim to determine the inter- and intraobserver reliability of the classification system from the perspective of the non-founder. Methods. Two orthopaedic surgeons independently assessed 50 MRI scans for trochlear dysplasia and classified each according to the OBC. Both observers repeated the assessments after six weeks. The inter- and intraobserver agreement was determined using Cohen’s kappa statistic and S-statistic nominal and linear weights. Results. The OBC with grading into four different trochlear forms showed excellent inter- and intraobserver agreement with a mean kappa of 0.78. Conclusion. The OBC is a simple MRI-based classification system with high inter- and intraobserver reliability. It could present a useful tool for grading the severity of trochlear dysplasia in daily practice. Cite this article: Bone Joint Open 2020;1-7:355–358


Bone & Joint Research
Vol. 7, Issue 1 | Pages 36 - 45
1 Jan 2018
Kleinlugtenbelt YV Krol RG Bhandari M Goslings JC Poolman RW Scholtes VAB

Objectives. The patient-rated wrist evaluation (PRWE) and the Disabilities of the Arm, Shoulder and Hand (DASH) questionnaire are patient-reported outcome measures (PROMs) used for clinical and research purposes. Methodologically high-quality clinimetric studies that determine the measurement properties of these PROMs when used in patients with a distal radial fracture are lacking. This study aimed to validate the PRWE and DASH in Dutch patients with a displaced distal radial fracture (DRF). Methods. The intraclass correlation coefficient (ICC) was used for test-retest reliability, between PROMs completed twice with a two-week interval at six to eight months after DRF. Internal consistency was determined using Cronbach’s α for the dimensions found in the factor analysis. The measurement error was expressed by the smallest detectable change (SDC). A semi-structured interview was conducted between eight and 12 weeks after DRF to assess the content validity. Results. A total of 119 patients (mean age 58 years (SD 15)), 74% female, completed PROMs at a mean time of six months (SD 1) post-fracture. One overall meaningful dimension was found for the PRWE and the DASH. Internal consistency was excellent for both PROMs (Cronbach’s α 0.96 (PRWE) and 0.97 (DASH)). Test-retest reliability was good for the PRWE (ICC 0.87) and excellent for the DASH (ICC 0.91). The SDC was 20 for the PRWE and 14 for the DASH. No floor or ceiling effects were found. The content validity was good for both questionnaires. Conclusion. The PRWE and DASH are valid and reliable PROMs in assessing function and disability in Dutch patients with a displaced DRF. However, due to the high SDC, the PRWE and DASH are less useful for individual patients with a distal radial fracture in clinical practice. Cite this article: Y. V. Kleinlugtenbelt, R. G. Krol, M. Bhandari, J. C. Goslings, R. W. Poolman, V. A. B. Scholtes. Are the patient-rated wrist evaluation (PRWE) and the disabilities of the arm, shoulder and hand (DASH) questionnaire used in distal radial fractures truly valid and reliable? Bone Joint Res 2018;7:36–45. DOI: 10.1302/2046-3758.71.BJR-2017-0081.R1
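Two of the quantities above have compact formulas. Cronbach's α compares the summed item variances with the variance of the total score, and the smallest detectable change is commonly derived from the standard error of measurement as SDC = 1.96 × √2 × SEM with SEM = SD × √(1 − ICC). The sketch below assumes those textbook definitions; the abstract does not state which SEM formula the authors used, and the function names and scores are hypothetical.

```python
from math import sqrt
from statistics import variance


def cronbach_alpha(items):
    """items: one list of scores per questionnaire item, respondents in the same order."""
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    totals = [sum(scores) for scores in zip(*items)]  # total score per respondent
    item_variance = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance / variance(totals))


def smallest_detectable_change(score_sd, icc):
    """SDC at the individual level, assuming SEM = SD * sqrt(1 - ICC)."""
    sem = score_sd * sqrt(1 - icc)
    return 1.96 * sqrt(2) * sem


# Hypothetical three-item questionnaire answered by four patients.
answers = [
    [2, 4, 3, 5],
    [1, 4, 3, 5],
    [2, 5, 2, 4],
]
print(round(cronbach_alpha(answers), 3))
print(round(smallest_detectable_change(score_sd=20.0, icc=0.87), 1))
```

Because the SDC scales with the score SD, a questionnaire can show a high ICC and still have an SDC too large to track change in an individual patient, which is the point the conclusion makes.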


Bone & Joint Research
Vol. 2, Issue 1 | Pages 1 - 8
1 Jan 2013
Costa AJ Lustig S Scholes CJ Balestro J Fatima M Parker DA

Objectives. There remains a lack of data on the reliability of methods to estimate tibial coverage achieved during total knee replacement. In order to address this gap, the intra- and interobserver reliability of a three-dimensional (3D) digital templating method was assessed with one symmetric and one asymmetric prosthesis design. Methods. A total of 120 template procedures were performed according to specific rotational and over-hang criteria by three observers at time zero and again two weeks later. Total and sub-region coverage were calculated and the reliability of the templating and measurement method was evaluated. Results. Excellent intra- and interobserver reliability was observed for total coverage, when minimal component overhang (intraclass correlation coefficient (ICC) = 0.87) or no component overhang (ICC = 0.92) was permitted, regardless of rotational restrictions. Conclusions. Measurement of tibial coverage can be reliable using the templating method described even if the rotational axis selected still has a minor influence


Aims. The purpose of this study was to assess the reliability and responsiveness to hip surgery of a four-point modified Care and Comfort Hypertonicity Questionnaire (mCCHQ) scoring tool in children with cerebral palsy (CP) in Gross Motor Function Classification System (GMFCS) levels IV and V. Methods. This was a population-based cohort study in children with CP from a national surveillance programme. Reliability was assessed from 20 caregivers who completed the mCCHQ questionnaire on two occasions three weeks apart. Test-retest reliability of the mCCHQ was calculated, and responsiveness before and after surgery for a displaced hip was evaluated in a cohort of children. Results. Test-retest reliability for the overall mCCHQ score was good (intraclass correlation coefficient 0.78), and no dimension demonstrated poor reliability. The surgical intervention cohort comprised ten children who had preoperative and postoperative mCCHQ scores at a minimum of six months postoperatively. The mCCHQ tool demonstrated a significant improvement in overall score from preoperative assessment to six-month postoperative follow-up assessment (p < 0.001). Conclusion. The mCCHQ demonstrated responsiveness to intervention and good test-retest reliability. The mCCHQ is proposed as an outcome tool for use within a national surveillance programme for children with CP. Cite this article: Bone Jt Open 2023;4(8):580–583


Bone & Joint Research
Vol. 13, Issue 1 | Pages 19 - 27
5 Jan 2024
Baertl S Rupp M Kerschbaum M Morgenstern M Baumann F Pfeifer C Worlicek M Popp D Amanatullah DF Alt V

Aims. This study aimed to evaluate the clinical application of the PJI-TNM classification for periprosthetic joint infection (PJI) by determining intraobserver and interobserver reliability. To facilitate its use in clinical practice, an educational app was subsequently developed and evaluated. Methods. A total of ten orthopaedic surgeons classified 20 cases of PJI based on the PJI-TNM classification. Subsequently, the classification was re-evaluated using the PJI-TNM app. Classification accuracy was calculated separately for each subcategory (reinfection, tissue and implant condition, non-human cells, and morbidity of the patient). Fleiss’ kappa and Cohen’s kappa were calculated for interobserver and intraobserver reliability, respectively. Results. Overall, interobserver and intraobserver agreements were substantial across the 20 classified cases. Analyses for the variable ‘reinfection’ revealed an almost perfect interobserver and intraobserver agreement with a classification accuracy of 94.8%. The category 'tissue and implant conditions' showed moderate interobserver and substantial intraobserver reliability, while the classification accuracy was 70.8%. For 'non-human cells,' accuracy was 81.0% and interobserver agreement was moderate with an almost perfect intraobserver reliability. The classification accuracy of the variable 'morbidity of the patient' reached 73.5% with a moderate interobserver agreement, whereas the intraobserver agreement was substantial. The application of the app yielded comparable results across all subgroups. Conclusion. The PJI-TNM classification system captures the heterogeneity of PJI and can be applied with substantial inter- and intraobserver reliability. The PJI-TNM educational app aims to facilitate application in clinical practice. A major limitation was the correct assessment of the implant situation. To eliminate this, a re-evaluation according to intraoperative findings is strongly recommended. Cite this article: Bone Joint Res 2024;13(1):19–27


The Bone & Joint Journal
Vol. 106-B, Issue 1 | Pages 19 - 27
1 Jan 2024
Tang H Guo S Ma Z Wang S Zhou Y

Aims. The aim of this study was to evaluate the reliability and validity of a patient-specific algorithm which we developed for predicting changes in sagittal pelvic tilt after total hip arthroplasty (THA). Methods. This retrospective study included 143 patients who underwent 171 THAs between April 2019 and October 2020 and had full-body lateral radiographs preoperatively and at one year postoperatively. We measured the pelvic incidence (PI), the sagittal vertical axis (SVA), pelvic tilt, sacral slope (SS), lumbar lordosis (LL), and thoracic kyphosis to classify patients into types A, B1, B2, B3, and C. The change of pelvic tilt was predicted according to the normal range of SVA (0 mm to 50 mm) for types A, B1, B2, and B3, and based on the absolute value of one-third of the PI-LL mismatch for type C patients. The reliability of the classification of the patients and the prediction of the change of pelvic tilt were assessed using kappa values and intraclass correlation coefficients (ICCs), respectively. Validity was assessed using the overall mean error and mean absolute error (MAE) for the prediction of the change of pelvic tilt. Results. The kappa values were 0.927 (95% confidence interval (CI) 0.861 to 0.992) and 0.945 (95% CI 0.903 to 0.988) for the inter- and intraobserver reliabilities, respectively, and the ICCs ranged from 0.919 to 0.997. The overall mean error and MAE for the prediction of the change of pelvic tilt were -0.3° (SD 3.6°) and 2.8° (SD 2.4°), respectively. The overall absolute change of pelvic tilt was 5.0° (SD 4.1°). Pre- and postoperative values and changes in pelvic tilt, SVA, SS, and LL varied significantly among the five types of patient. Conclusion. We found that the proposed algorithm was reliable and valid for predicting the standing pelvic tilt after THA. Cite this article: Bone Joint J 2024;106-B(1):19–27
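The validity figures above distinguish the overall mean error, where over- and under-predictions cancel, from the mean absolute error, where they do not. A minimal sketch, assuming errors are defined as predicted minus observed change in pelvic tilt (the sign convention, function name, and values are assumptions for illustration):

```python
from statistics import mean


def prediction_errors(predicted, observed):
    """Return (mean error, mean absolute error) for paired values in degrees."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("need two non-empty lists of equal length")
    errors = [p - o for p, o in zip(predicted, observed)]
    return mean(errors), mean([abs(e) for e in errors])


# Hypothetical predicted vs measured change in pelvic tilt (degrees).
predicted_change = [4.0, -2.0, 6.5, 1.0]
measured_change = [5.0, 1.0, 3.5, 2.0]
bias, mae = prediction_errors(predicted_change, measured_change)
print(f"mean error {bias:.2f}, MAE {mae:.2f}")
```

A mean error near zero alongside a larger MAE, as in the reported -0.3° and 2.8°, indicates little systematic bias but non-trivial scatter in individual predictions.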


Bone & Joint Open
Vol. 3, Issue 6 | Pages 502 - 509
20 Jun 2022
James HK Griffin J Pattison GTR

Aims. To identify a core outcome set of postoperative radiographic measurements to assess technical skill in ankle fracture open reduction internal fixation (ORIF), and to validate these against Van der Vleuten’s criteria for effective assessment. Methods. An e-Delphi exercise was undertaken at a major trauma centre (n = 39) to identify relevant parameters. Feasibility was tested by two authors. Reliability and validity were tested using postoperative radiographs of ankle fracture operations performed by trainees enrolled in an educational trial (IRCTN 20431944). To determine construct validity, trainees were divided into novice (performed < ten cases at baseline) and intermediate groups (performed ≥ ten cases at baseline). To assess concurrent validity, the procedure-based assessment (PBA) was considered the gold standard. The inter-rater and intrarater reliability was tested using a randomly selected subset of 25 cases. Results. Overall, 235 ankle ORIFs were performed by 24 postgraduate year three to five trainees during ten months at nine NHS hospitals in England, UK. Overall, 42 PBAs were completed. The e-Delphi panel identified five ‘final product analysis’ parameters and defined acceptability thresholds: medial clear space (MCS); medial malleolar displacement (MMD); lateral malleolar displacement (LMD); tibiofibular clear space (TFCS) (all in mm); and talocrural angle (TCA) in degrees. Face validity, content validity, and feasibility were excellent. PBA global rating scale scores in this population showed excellent construct validity as continuous (p < 0.001) and categorical (p = 0.001) variables. Concurrent validity of all metrics was poor against PBA score. Intrarater reliability was substantial for all parameters (intraclass correlation coefficient (ICC) > 0.8), and inter-rater reliability was substantial for LMD, MMD, TCA, and moderate (ICC 0.61 to 0.80) for MCS and TFCS. Assessment was time efficient compared to PBA. Conclusion. Assessment of technical skill in ankle fracture surgery using the first postoperative radiograph satisfies the tested Van der Vleuten’s utility criteria for effective assessment. 'Final product analysis' assessment may be useful to assess skill transfer in the simulation-based research setting. Cite this article: Bone Jt Open 2022;3(6):502–509


Bone & Joint Open
Vol. 5, Issue 11 | Pages 962 - 970
4 Nov 2024
Suter C Mattila H Ibounig T Sumrein BO Launonen A Järvinen TLN Lähdeoja T Rämö L

Aims. Though most humeral shaft fractures heal nonoperatively, up to one-third may lead to nonunion with inferior outcomes. The Radiographic Union Score for HUmeral Fractures (RUSHU) was created to identify high-risk patients for nonunion. Our study evaluated the RUSHU’s prognostic performance at six and 12 weeks in discriminating nonunion within a significantly larger cohort than before. Methods. Our study included 226 nonoperatively treated humeral shaft fractures. We evaluated the interobserver reliability and intraobserver reproducibility of RUSHU scoring using intraclass correlation coefficients (ICCs). Additionally, we determined the optimal cut-off thresholds for predicting nonunion using the receiver operating characteristic (ROC) method. Results. The RUSHU demonstrated good interobserver reliability with an ICC of 0.78 (95% CI 0.72 to 0.83) at six weeks and 0.77 (95% CI 0.71 to 0.82) at 12 weeks. Intraobserver reproducibility was good or excellent for all analyses. Area under the curve in the ROC analysis was 0.83 (95% CI 0.77 to 0.88) at six weeks and 0.89 (95% CI 0.84 to 0.93) at 12 weeks, indicating excellent discrimination. The optimal cut-off values for predicting nonunion were ≤ eight points at six weeks and ≤ nine points at 12 weeks, providing the best specificity-sensitivity trade-off. Conclusion. The RUSHU proves to be a reliable and reproducible radiological scoring system that aids in identifying patients at risk of nonunion at both six and 12 weeks post-injury during non-surgical treatment of humeral shaft fractures. The statistically optimal cut-off values for predicting nonunion are ≤ eight at six weeks and ≤ nine points at 12 weeks post-injury
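An "optimal" cut-off on a ROC curve is often chosen by maximizing Youden's J (sensitivity + specificity − 1). The abstract only says the thresholds gave the best specificity-sensitivity trade-off, so using Youden's J here is an assumption, as are the function name and the example scores. The sketch treats a score at or below the cut-off as a prediction of nonunion, matching the "≤" thresholds quoted above.

```python
def best_cutoff(scores, nonunion):
    """Return (cutoff, youden_j, sensitivity, specificity) maximizing Youden's J.

    A score <= cutoff counts as a prediction of nonunion.
    """
    positives = sum(1 for y in nonunion if y)
    negatives = len(nonunion) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need both nonunion and union cases")
    best = None
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, nonunion) if s <= cutoff and y)
        tn = sum(1 for s, y in zip(scores, nonunion) if s > cutoff and not y)
        sensitivity = tp / positives
        specificity = tn / negatives
        j = sensitivity + specificity - 1
        # Strict comparison keeps the lowest cutoff when several tie.
        if best is None or j > best[1]:
            best = (cutoff, j, sensitivity, specificity)
    return best


# Hypothetical six-week scores and eventual outcomes (True = nonunion).
week6_scores = [5, 6, 7, 8, 9, 10, 11, 12]
went_to_nonunion = [True, True, True, True, False, False, False, False]
print(best_cutoff(week6_scores, went_to_nonunion))
```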


Bone & Joint Open
Vol. 4, Issue 9 | Pages 689 - 695
7 Sep 2023
Lim KBL Lee NKL Yeo BS Lim VMM Ng SWL Mishra N

Aims. To determine whether side-bending films in scoliosis are assessed for adequacy in clinical practice; and to introduce a novel method for doing so. Methods. Six surgeons and eight radiographers were invited to participate in four online surveys. The generic survey comprised erect and left and right bending radiographs of eight individuals with scoliosis, with an average age of 14.6 years. Respondents were asked to indicate whether each bending film was optimal (adequate) or suboptimal. In the first survey, they were also asked if they currently assessed the adequacy of bending films. A similar second survey was sent out two weeks later, using the same eight cases but in a different order. In the third survey, a guide for assessing bending film adequacy was attached along with the radiographs to introduce the novel T1-45B method, in which the upper endplate of T1 must tilt ≥ 45° from baseline for the study to be considered optimal. A fourth and final survey was subsequently conducted for confirmation. Results. Overall, 12 (86%) of 14 respondents did not use any criteria to assess the bending film adequacy; the remaining two each described a different invalidated method. In total, 12 (86%) of the respondents felt T1-45B was easy to learn and apply. There was fair to substantial intra-rater reliability (k = 0.25 to 0.88) which improved to fair to almost perfect (k = 0.38 to 0.88) post-introduction of the guide. Inter-rater reliability varied considerably among the rater groups but similarly increased following introduction of the guide (k. S1. = 0.19 to 0.34, k. S2. = 0.33 to 0.43 vs k. S3. = 0.49 to 0.5, k. S4. = 0.35 to 0.43). Conclusion. Many surgeons and radiographers do not assess spinal bending films for adequacy. We propose that the change in the plane of the upper endplate of T1 on side-bending can be used in this evaluation. In the T1-45B method, a change of ≥ 45° on side bending qualifies as an adequate bend effort. 
Cite this article: Bone Jt Open 2023;4(9):689–695
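
The T1-45B criterion in the abstract above reduces to a single threshold check. The following is a minimal sketch in Python, not the authors' tool: the function name and the sample angles are hypothetical, and it assumes the tilt is taken as the absolute change in the T1 upper endplate angle between the erect (baseline) film and the bending film.

```python
def t1_45b_adequate(baseline_deg: float, bending_deg: float,
                    threshold_deg: float = 45.0) -> bool:
    """Return True when the T1 upper endplate tilts at least threshold_deg
    away from its baseline angle (assumed: absolute change, either side)."""
    tilt_change = abs(bending_deg - baseline_deg)
    return tilt_change >= threshold_deg


if __name__ == "__main__":
    # Hypothetical angles in degrees, for illustration only.
    print(t1_45b_adequate(5.0, 52.0))   # change of 47 degrees
    print(t1_45b_adequate(5.0, 40.0))   # change of 35 degrees
```

Left and right bending films would each be checked separately against the same erect baseline.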


The Bone & Joint Journal
Vol. 106-B, Issue 9 | Pages 964 - 969
1 Sep 2024
Wang YC Song JJ Li TT Yang D Lv ZB Wang ZY Zhang ZM Luo Y

Aims. To propose a new method for evaluating paediatric radial neck fractures and improve the accuracy of fracture angulation measurement, particularly in younger children, and thereby facilitate planning treatment in this population. Methods. Clinical data of 117 children with radial neck fractures in our hospital from August 2014 to March 2023 were collected. A total of 50 children (26 males, 24 females, mean age 7.6 years (2 to 13)) met the inclusion criteria and were analyzed. Cases were excluded for the following reasons: Judet grade I and Judet grade IVb (> 85° angulation) classification; poor radiograph image quality; incomplete clinical information; sagittal plane angulation; severe displacement of the ulna fracture; and Monteggia fractures. For each patient, standard elbow anteroposterior (AP) view radiographs and corresponding CT images were acquired. On radiographs, Angle P (complementary to the angle between the long axis of the radial head and the line perpendicular to the physis), Angle S (complementary to the angle between the long axis of the radial head and the midline through the proximal radial shaft), and Angle U (between the long axis of the radial head and the straight line from the distal tip of the capitellum to the coronoid process) were identified as candidates approximating the true coronal plane angulation of radial neck fractures. On the coronal plane of the CT scan, the angulation of radial neck fractures (CTa) was measured and served as the reference standard for measurement. Inter- and intraobserver reliabilities were assessed by Kappa statistics and intraclass correlation coefficient (ICC). Results. Angle U showed the strongest correlation with CTa (p < 0.001). In the analysis of inter- and intraobserver reliability, Kappa values were significantly higher for Angles S and U compared with Angle P. ICC values were excellent among the three groups. Conclusion. 
Angle U on AP view was the best substitute for CTa when evaluating radial neck fractures in children. Further studies are required to validate this method. Cite this article: Bone Joint J 2024;106-B(9):964–969


Bone & Joint Research
Vol. 13, Issue 8 | Pages 392 - 400
5 Aug 2024
Barakat A Evans J Gibbons C Singh HP

Aims. The Oxford Shoulder Score (OSS) is a 12-item measure commonly used for the assessment of shoulder surgeries. This study explores whether computerized adaptive testing (CAT) provides a shortened, individually tailored questionnaire while maintaining test accuracy. Methods. A total of 16,238 preoperative OSS were available in the National Joint Registry (NJR) for England, Wales, Northern Ireland, the Isle of Man, and the States of Guernsey dataset (April 2012 to April 2022). Prior to CAT, the foundational item response theory (IRT) assumptions of unidimensionality, monotonicity, and local independence were established. CAT compared sequential item selection with stopping criteria set at standard error (SE) < 0.32 and SE < 0.45 (equivalent to reliability coefficients of 0.90 and 0.80) to full-length patient-reported outcome measure (PROM) precision. Results. Confirmatory factor analysis (CFA) for unidimensionality exhibited satisfactory fit with root mean square standardized residual (RSMSR) of 0.06 (cut-off ≤ 0.08) but not with comparative fit index (CFI) of 0.85 or Tucker-Lewis index (TLI) of 0.82 (cut-off > 0.90). Monotonicity, measured by H value, yielded 0.482, signifying good monotonic trends. Local independence was generally met, with Yen’s Q3 statistic > 0.2 for most items. The median item count for completing the CAT simulation with a SE of 0.32 was 3 (IQR 3 to 12), while for a SE of 0.45 it was 2 (IQR 2 to 6). This constituted only 25% and 16%, respectively, when compared to the 12-item full-length questionnaire. Conclusion. Calibrating IRT for the OSS has resulted in the development of an efficient and shortened CAT while maintaining accuracy and reliability. Through the reduction of redundant items and implementation of a standardized measurement scale, our study highlights a promising approach to alleviate time burden and potentially enhance compliance with these widely used outcome measures. Cite this article: Bone Joint Res 2024;13(8):392–400
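
The equivalence quoted above between the stopping criteria (SE < 0.32 and SE < 0.45) and reliability coefficients of 0.90 and 0.80 can be checked with a short sketch. It assumes the common IRT convention that the latent trait is scaled to unit variance, so that reliability is approximately 1 minus the squared standard error; the abstract itself does not state the formula, and the function name is hypothetical.

```python
def reliability_from_se(se: float) -> float:
    """Approximate reliability for a standard error of measurement on a
    latent trait scaled to unit variance (assumed): 1 - SE^2."""
    return 1.0 - se ** 2


if __name__ == "__main__":
    for se in (0.32, 0.45):
        print(f"SE = {se:.2f} -> reliability ~ {reliability_from_se(se):.2f}")
```

Under this assumption, both thresholds round to the reliability values given in the abstract.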


Bone & Joint Open
Vol. 3, Issue 11 | Pages 877 - 884
14 Nov 2022
Archer H Reine S Alshaikhsalama A Wells J Kohli A Vazquez L Hummer A DiFranco MD Ljuhar R Xi Y Chhabra A

Aims. Hip dysplasia (HD) leads to premature osteoarthritis. Timely detection and correction of HD has been shown to improve pain, functional status, and hip longevity. Several time-consuming radiological measurements are currently used to confirm HD. An artificial intelligence (AI) software named HIPPO automatically locates anatomical landmarks on anteroposterior pelvis radiographs and performs the needed measurements. The primary aim of this study was to assess the reliability of this tool as compared to multi-reader evaluation in clinically proven cases of adult HD. The secondary aims were to assess the time savings achieved and evaluate inter-reader assessment. Methods. A consecutive preoperative sample of 130 HD patients (256 hips) was used. This cohort included 82.3% females (n = 107) and 17.7% males (n = 23) with median patient age of 28.6 years (interquartile range (IQR) 22.5 to 37.2). Three trained readers’ measurements were compared to AI outputs of lateral centre-edge angle (LCEA), caput-collum-diaphyseal (CCD) angle, pelvic obliquity, Tönnis angle, Sharp’s angle, and femoral head coverage. Intraclass correlation coefficients (ICC) and Bland-Altman analyses were obtained. Results. Among 256 hips with AI outputs, all six hip AI measurements were successfully obtained. The AI-reader correlations were generally good (ICC 0.60 to 0.74) to excellent (ICC > 0.75). There was lower agreement for CCD angle measurement. Most widely used measurements for HD diagnosis (LCEA and Tönnis angle) demonstrated good to excellent inter-method reliability (ICC 0.71 to 0.86 and 0.82 to 0.90, respectively). The median reading time for the three readers and AI was 212 (IQR 197 to 230), 131 (IQR 126 to 147), 734 (IQR 690 to 786), and 41 (IQR 38 to 44) seconds, respectively. Conclusion. This study showed that AI-based software demonstrated reliable radiological assessment of patients with HD with significant interpretation-related time savings. 
Cite this article: Bone Jt Open 2022;3(11):877–884


Bone & Joint Research
Vol. 11, Issue 9 | Pages 619 - 628
7 Sep 2022
Yapp LZ Scott CEH Howie CR MacDonald DJ Simpson AHRW Clement ND

Aims. The aim of this study was to report the meaningful values of the EuroQol five-dimension three-level questionnaire (EQ-5D-3L) and EuroQol visual analogue scale (EQ-VAS) in patients undergoing primary knee arthroplasty (KA). Methods. This is a retrospective study of patients undergoing primary KA for osteoarthritis in a university teaching hospital (Royal Infirmary of Edinburgh) (1 January 2013 to 31 December 2019). Pre- and postoperative (one-year) data were prospectively collected for 3,181 patients (median age 69.9 years (interquartile range (IQR) 64.2 to 76.1); females, n = 1,745 (54.9%); median BMI 30.1 kg/m² (IQR 26.6 to 34.2)). The reliability of the EQ-5D-3L was measured using Cronbach’s alpha. Responsiveness was determined by calculating the anchor-based minimal clinically important difference (MCID), the minimal important change (MIC) (cohort and individual), the patient-acceptable symptom state (PASS) predictive of satisfaction, and the minimal detectable change at 90% confidence intervals (MDC-90). Results. The EQ-5D-3L demonstrated good internal consistency with an overall Cronbach alpha of 0.75 (preoperative) and 0.88 (postoperative), respectively. The MCID for the Index score was 0.085 (95% confidence interval (CI) 0.042 to 0.127) and EQ-VAS was 6.41 (95% CI 3.497 to 9.323). The MIC_cohort was 0.289 for the EQ-5D and 5.27 for the EQ-VAS. However, the MIC_individual for both the EQ-5D-3L Index (0.105) and EQ-VAS (-1) demonstrated poor-to-acceptable reliability. The MDC-90 was 0.023 for the EQ-5D-3L Index and 1.0 for the EQ-VAS. The PASS for the postoperative EQ-5D-3L Index and EQ-VAS scores predictive of patient satisfaction were 0.708 and 77.0, respectively. Conclusion. The meaningful values of the EQ-5D-3L Index and EQ-VAS scores can be used to measure clinically relevant changes in health-related quality of life in patients undergoing primary KA. Cite this article: Bone Joint Res 2022;11(9):619–628


Bone & Joint Open
Vol. 2, Issue 9 | Pages 765 - 772
14 Sep 2021
Silitonga J Djaja YP Dilogo IH Pontoh LAP

Aims. The aim of this study was to perform a cross-cultural adaptation of Oxford Hip Score (OHS) to Indonesian, and to evaluate its psychometric properties. Methods. We performed a cross-cultural adaptation of Oxford Hip Score into the Indonesian language (OHS-ID) and determined its internal consistency, test-retest reliability, measurement error, floor-ceiling effect, responsiveness, and construct validity by hypotheses testing of its correlation with Harris Hip Score (HHS), visual analogue scale (VAS), and Short Form-36 (SF-36). Adults (> 17 years old) with chronic hip pain (osteoarthritis or osteonecrosis) were included. Results. A total of 125 patients were included, including 50 total hip arthroplasty (THA) patients with six months follow-up. The OHS questionnaire was translated into Indonesian and showed good internal consistency (Cronbach’s alpha = 0.89) and good reliability (intraclass correlation = 0.98). The standard error of measurement value of 2.11 resulted in minimal detectable change score of 5.8. Ten out of ten (100%) a priori hypotheses were met, confirming the construct validity. A strong correlation was found with two subscales of SF-36 (pain and physical function), HHS (0.94), and VAS (-0.83). OHS-ID also showed good responsiveness for post-THA series. Floor and ceiling effect was not found. Conclusion. The Indonesian version of OHS showed similar reliability and validity to the original OHS. This questionnaire will be suitable to assess chronic hip pain in Indonesian-speaking patients. Cite this article: Bone Jt Open 2021;2(9):765–772
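
The step from a standard error of measurement (SEM) of 2.11 to a minimal detectable change (MDC) of 5.8 in the abstract above can be reproduced with the usual formula MDC = z × √2 × SEM. This is a sketch under an assumption: the abstract does not state the confidence level, and z = 1.96 (95%) is used here because it gives the reported value. The function name is hypothetical.

```python
import math


def minimal_detectable_change(sem: float, z: float = 1.96) -> float:
    """MDC for a test-retest design: z * sqrt(2) * SEM.
    z = 1.96 corresponds to 95% confidence (assumed, not stated in the abstract)."""
    return z * math.sqrt(2.0) * sem


if __name__ == "__main__":
    print(f"MDC = {minimal_detectable_change(2.11):.1f}")
```

For a 90% level (MDC-90), the same function would be called with z = 1.645.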


Bone & Joint Open
Vol. 2, Issue 8 | Pages 638 - 645
1 Aug 2021
Garner AJ Edwards TC Liddle AD Jones GG Cobb JP

Aims. Joint registries classify all further arthroplasty procedures to a knee with an existing partial arthroplasty as revision surgery, regardless of the actual procedure performed. Relatively minor procedures, including bearing exchanges, are classified in the same way as major operations requiring augments and stems. A new classification system is proposed to acknowledge and describe the detail of these procedures, which has implications for risk, recovery, and health economics. Methods. Classification categories were proposed by a surgical consensus group, then ranked by patients, according to perceived invasiveness and implications for recovery. In round one, 26 revision cases were classified by the consensus group. Results were tested for inter-rater reliability. In round two, four additional cases were added for clarity. Round three repeated the survey one month later, subject to inter- and intra-rater reliability testing. In round four, five additional expert partial knee arthroplasty surgeons were asked to classify the 30 cases according to the proposed revision partial knee classification (RPKC) system. Results. Four classes were proposed: PR1, where no bone-implant interfaces are affected; PR2, where surgery does not include conversion to total knee arthroplasty, for example, a second partial arthroplasty to a native compartment; PR3, when a standard primary total knee prosthesis is used; and PR4 when revision components are necessary. Round one resulted in 92% inter-rater agreement (Kendall’s W 0.97; p < 0.005), rising to 93% in round two (Kendall’s W 0.98; p < 0.001). Round three demonstrated 97% agreement (Kendall’s W 0.98; p < 0.001), with high intra-rater reliability (intraclass correlation coefficient (ICC) 0.99; 95% confidence interval 0.98 to 0.99). Round four resulted in 80% agreement (Kendall’s W 0.92; p < 0.001). Conclusion. The RPKC system accounts for all procedures which may be appropriate following partial knee arthroplasty.
It has been shown to be reliable, repeatable and pragmatic. The implications for patient care and health economics are discussed. Cite this article: Bone Jt Open 2021;2(8):638–645


Bone & Joint Research
Vol. 10, Issue 12 | Pages 820 - 829
15 Dec 2021
Schmidutz F Schopf C Yan SG Ahrend M Ihle C Sprecher C

Aims. The distal radius is a major site of osteoporotic bone loss resulting in a high risk of fragility fracture. This study evaluated the capability of a cortical index (CI) at the distal radius to predict the local bone mineral density (BMD). Methods. A total of 54 human cadaver forearms (ten singles, 22 pairs) (19 to 90 years) were systematically assessed by clinical radiograph (XR), dual-energy X-ray absorptiometry (DXA), CT, as well as high-resolution peripheral quantitative CT (HR-pQCT). Cortical bone thickness (CBT) of the distal radius was measured on XR and CT scans, and two cortical indices mean average (CBTavg) and gauge (CBTg) were determined. These cortical indices were compared to the BMD of the distal radius determined by DXA (areal BMD (aBMD)) and HR-pQCT (volumetric BMD (vBMD)). Pearson correlation coefficient (r) and intraclass correlation coefficient (ICC) were used to compare the results and degree of reliability. Results. The CBT could accurately be determined on XRs and highly correlated to those determined on CT scans (r = 0.87 to 0.93). The CBTavg index of the XRs significantly correlated with the BMD measured by DXA (r = 0.78) and HR-pQCT (r = 0.63), as did the CBTg index with the DXA (r = 0.55) and HR-pQCT (r = 0.64) (all p < 0.001). A high correlation of the BMD and CBT was observed between paired specimens (r = 0.79 to 0.96). The intra- and inter-rater reliability was excellent (ICC 0.79 to 0.92). Conclusion. The cortical index (CBTavg) at the distal radius shows a close correlation to the local BMD. It thus can serve as an initial screening tool to estimate the local bone quality if quantitative BMD measurements are unavailable, and enhance decision-making in acute settings on fracture management or further osteoporosis screening. Cite this article: Bone Joint Res 2021;10(12):820–829


Bone & Joint Research
Vol. 13, Issue 6 | Pages 294 - 305
17 Jun 2024
Yang P He W Yang W Jiang L Lin T Sun W Zhang Q Bai X Sun W Guo D

Aims. In this study, we aimed to visualize the spatial distribution characteristics of femoral head necrosis using a novel measurement method. Methods. We retrospectively collected CT imaging data of 108 hips with non-traumatic osteonecrosis of the femoral head from 76 consecutive patients (mean age 34.3 years (SD 8.1), 56.58% male (n = 43)) in two clinical centres. The femoral head was divided into 288 standard units (based on the orientation of units within the femoral head, designated as N[Superior], S[Inferior], E[Anterior], and W[Posterior]) using a new measurement system called the longitude and latitude division system (LLDS). A computer-aided design (CAD) measurement tool was also developed to visualize the measurement of the spatial location of necrotic lesions in CT images. Two orthopaedic surgeons independently performed measurements, and the results were used to draw 2D and 3D heat maps of spatial distribution of necrotic lesions in the femoral head, and for statistical analysis. Results. The results showed that the LLDS has high inter-rater reliability. As illustrated by the heat map, the distribution of Japanese Investigation Committee (JIC) classification type C necrotic lesions exhibited clustering characteristics, with the lesions being concentrated in the northern and eastern regions, forming a hot zone (90% probability) centred on the N4-N6E2, N3-N6E units of outer ring blocks. Statistical results showed that the distribution difference between type C2 and type C1 was most significant in the E1 and E2 units and, combined with the heat map, indicated that the spatial distribution differences at N3-N6E1 and N1-N3E2 units are crucial in understanding type C1 and C2 necrotic lesions. Conclusion. The LLDS can be used to accurately measure the spatial location of necrotic lesions and display their distribution characteristics. Cite this article: Bone Joint Res 2024;13(6):294–305


Bone & Joint Open
Vol. 2, Issue 10 | Pages 858 - 864
18 Oct 2021
Guntin J Plummer D Della Valle C DeBenedetti A Nam D

Aims. Prior studies have identified that malseating of a modular dual mobility liner can occur, with previous reported incidences between 5.8% and 16.4%. The aim of this study was to determine the incidence of malseating in dual mobility implants at our institution, assess for risk factors for liner malseating, and investigate whether liner malseating has any impact on clinical outcomes after surgery. Methods. We retrospectively reviewed the radiographs of 239 primary and revision total hip arthroplasties with a modular dual mobility liner. Two independent reviewers assessed radiographs for each patient twice for evidence of malseating, with a third observer acting as a tiebreaker. Univariate analysis was conducted to determine risk factors for malseating with Youden’s index used to identify cut-off points. Cohen’s kappa test was used to measure interobserver and intraobserver reliability. Results. In all, 12 liners (5.0%), including eight Stryker (6.8%) and four Zimmer Biomet (3.3%), had radiological evidence of malseating. Interobserver reliability was found to be 0.453 (95% confidence interval (CI) 0.26 to 0.64), suggesting weak inter-rater agreement, with strong agreement being greater than 0.8. We found component size of 50 mm or less to be associated with liner malseating on univariate analysis (p = 0.031). Patients with malseated liners appeared to have no associated clinical consequences, and none required revision surgery at a mean of 14 months (1.4 to 99.2) postoperatively. Conclusion. The incidence of liner malseating was 5.0%, which is similar to other reports. Component size of 50 mm or smaller was identified as a risk factor for malseating. Surgeons should be aware that malseating can occur and implant design changes or changes in instrumentation should be considered to lower the risk of malseating. Although further follow-up is needed, it remains to be seen if malseating is associated with any clinical consequences. 
Cite this article: Bone Jt Open 2021;2(10):858–864
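
Several abstracts in this listing, including the one above, report Cohen's kappa. As a minimal sketch of how the statistic is computed for two raters (the ratings below are hypothetical and are not data from any of these studies):

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same cases.
    kappa = (p_observed - p_expected) / (1 - p_expected)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same, non-empty set of cases")
    n = len(rater_a)
    # Proportion of cases on which the two raters agree.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when both raters use one identical category")
    return (p_observed - p_expected) / (1.0 - p_expected)


if __name__ == "__main__":
    # Hypothetical example: 1 = malseated, 0 = well seated, ten radiographs.
    reviewer_1 = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
    reviewer_2 = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
    print(cohens_kappa(reviewer_1, reviewer_2))
```

In this made-up example the raters agree on 80% of cases, yet kappa is much lower because most of that agreement is expected by chance when one category is rare, as is the case for an uncommon finding such as malseating.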


Bone & Joint Open
Vol. 4, Issue 4 | Pages 262 - 272
11 Apr 2023
Batailler C Naaim A Daxhelet J Lustig S Ollivier M Parratte S

Aims. The impact of a diaphyseal femoral deformity on knee alignment varies according to its severity and localization. The aims of this study were to determine a method of assessing the impact of diaphyseal femoral deformities on knee alignment for the varus knee, and to evaluate the reliability and the reproducibility of this method in a large cohort of osteoarthritic patients. Methods. All patients who underwent a knee arthroplasty from 2019 to 2021 were included. Exclusion criteria were genu valgus, flexion contracture (> 5°), previous femoral osteotomy or fracture, total hip arthroplasty, and femoral rotational disorder. A total of 205 patients met the inclusion criteria. The mean age was 62.2 years (SD 8.4). The mean BMI was 33.1 kg/m² (SD 5.5). The radiological measurements were performed twice by two independent reviewers, and included hip knee ankle (HKA) angle, mechanical medial distal femoral angle (mMDFA), anatomical medial distal femoral angle (aMDFA), femoral neck shaft angle (NSA), femoral bowing angle (FBow), the distance between the knee centre and the top of the FBow (DK), and the angle representing the FBow impact on the knee (C’KS angle). Results. The FBow impact on the mMDFA can be measured by the C’KS angle. The C’KS angle took the localization (length DK) and the importance (FBow angle) of the FBow into consideration. The mean FBow angle was 4.4° (SD 2.4; 0 to 12.5). The mean C’KS angle was 1.8° (SD 1.1; 0 to 5.8). Overall, 84 knees (41%) had a severe FBow (> 5°). The radiological measurements showed very good to excellent intraobserver and interobserver agreements. The C’KS increased significantly when the length DK decreased and the FBow angle increased (p < 0.001). Conclusion. The impact of the diaphyseal femoral deformity on the mechanical femoral axis is measured by the C’KS angle, a reliable and reproducible measurement. Cite this article: Bone Jt Open 2023;4(4):262–272


Bone & Joint Open
Vol. 1, Issue 9 | Pages 594 - 604
24 Sep 2020
James HK Pattison GTR Griffin J Fisher JD Griffin DR

Aims. To develop a core outcome set of measurements from postoperative radiographs that can be used to assess technical skill in performing dynamic hip screw (DHS) and hemiarthroplasty, and to validate these against Van der Vleuten’s criteria for effective assessment. Methods. A Delphi exercise was undertaken at a regional major trauma centre to identify candidate measurement items. The feasibility of taking these measurements was tested by two of the authors (HKJ, GTRP). Validity and reliability were examined using the radiographs of operations performed by orthopaedic resident participants (n = 28) of a multicentre randomized controlled educational trial (ISRCTN20431944). Trainees were divided into novice and intermediate groups, defined as having performed < ten or ≥ ten cases each for DHS and hemiarthroplasty at baseline. The procedure-based assessment (PBA) global rating score was assumed as the gold standard assessment for the purposes of concurrent validity. Intra- and inter-rater reliability testing were performed on a random subset of 25 cases. Results. In total, 327 DHS and 248 hemiarthroplasty procedures were performed by 28 postgraduate year (PGY) 3 to 5 orthopaedic trainees during the 2014 to 2015 surgical training year at nine NHS hospitals in the West Midlands, UK. Overall, 109 PBAs were completed for DHS and 80 for hemiarthroplasty. Expert consensus identified four ‘final product analysis’ (FPA) radiological parameters of technical success for DHS: tip-apex distance (TAD); lag screw position in the femoral head; flushness of the plate against the lateral femoral cortex; and eight-cortex hold of the plate screws. Three parameters were identified for hemiarthroplasty: leg length discrepancy; femoral stem alignment; and femoral offset. Face validity, content validity, and feasibility were excellent. 
For all measurements, performance was better in the intermediate compared with the novice group, and this was statistically significant for TAD (p < 0.001) and femoral stem alignment (p = 0.023). Concurrent validity was poor when measured against global PBA score. This may be explained by the fact that they are measuring different facets of competence. Intra- and inter-rater reliability were excellent for TAD, moderate for lag screw position (DHS), and moderate for leg length discrepancy (hemiarthroplasty). Use of a large multicentre dataset suggests good generalizability of the results to other settings. Assessment using FPA was time- and cost-effective compared with PBA. Conclusion. Final product analysis using post-implantation radiographs to measure technical skill in hip fracture surgery is feasible, valid, reliable, and cost-effective. It can complement traditional workplace-based assessment for measuring performance in the real-world operating room. It may have particular utility in competency-based training frameworks and for assessing skill transfer from the simulated to live operating theatre. Cite this article: Bone Joint Open 2020;1-9:594–604


Bone & Joint Open
Vol. 4, Issue 12 | Pages 964 - 969
19 Dec 2023
Berwin JT Duffy SDX Gargan MF Barnes JR

Aims. We assessed the long-term outcomes of a large cohort of patients who have undergone a periacetabular osteotomy (PAO), and sought to validate a patient satisfaction questionnaire for use in a PAO cohort. Methods. All patients who had undergone a PAO from July 1998 to February 2013 were surveyed, with several patient-reported outcome measures (PROMs) and radiological measurements of preoperative acetabular dysplasia and postoperative correction also recorded. Patients were asked to rate their level of satisfaction with their operation in achieving pain relief, restoration of activities of daily living, ability to perform recreational activity, and their overall level of satisfaction with the procedure. Results. A total of 143 PAOs were performed between 1998 and 2013. Of those, 90 postoperative surveys were returned. Only 65 patients (73 hips) had both pre- and postoperative radiographs available for measurement. The mean time to follow-up was 15 years (6.5 to 20). Most patients were female (91%), with a mean age of 26.4 years (14.9 to 48.3) at the time of their surgery. A statistically significant improvement in radiological correction was detected in all hips (p < 0.001). A total of 67 patients (92.3%) remained either very satisfied or satisfied with their PAO. The internal consistency of the patient satisfaction questionnaire, measured using Cronbach’s α, ranged from 0.89 to 0.94 indicating ‘good’ to ‘excellent’ reliability. Conclusion. Outcomes of importance to patients undergoing a PAO include several key domains: pain relief, improved activities of daily living, and improved recreational ability. Our study demonstrates high rates of long-term patient satisfaction in all domains, and found the patient satisfaction questionnaire to be a valid and reliable instrument for use in this cohort. Cite this article: Bone Jt Open 2023;4(12):964–969


Bone & Joint Open
Vol. 5, Issue 10 | Pages 818 - 824
2 Oct 2024
Moroder P Herbst E Pawelke J Lappen S Schulz E

Aims. The liner design is a key determinant of the constraint of a reverse total shoulder arthroplasty (rTSA). The aim of this study was to compare the degree of constraint of rTSA liners between different implant systems. Methods. An implant company’s independent 3D shoulder arthroplasty planning software (mediCAD 3D shoulder v. 7.0, module v. 2.1.84.173.43) was used to determine the jump height of standard and constrained liners of different sizes (radius of curvature) of all available companies. The obtained parameters were used to calculate the stability ratio (degree of constraint) and angle of coverage (degree of glenosphere coverage by liner) of the different systems. Measurements were independently performed by two raters, and intraclass correlation coefficients were calculated to perform a reliability analysis. Additionally, measurements were compared with parameters provided by the companies themselves, when available, to ensure validity of the software-derived measurements. Results. There were variations in jump height between rTSA systems at a given size, resulting in large differences in stability ratio between systems. Standard liners exhibited a stability ratio range from 126% to 214% (mean 158% (SD 23%)) and constrained liners a range from 151% to 479% (mean 245% (SD 76%)). The angle of coverage showed a range from 103° to 130° (mean 115° (SD 7°)) for standard and a range from 113° to 156° (mean 133° (SD 11°)) for constrained liners. Four arthroplasty systems kept the stability ratio of standard liners constant (within 5%) across different sizes, while one system showed slight inconsistencies (within 10%), and ten arthroplasty systems showed large inconsistencies (range 11% to 28%). The stability ratio of constrained liners was consistent across different sizes in two arthroplasty systems and inconsistent in seven systems (range 18% to 106%). Conclusion. 
Large differences in jump height and resulting degree of constraint of rTSA liners were observed between different implant systems, and in many cases even within the same implant systems. While the immediate clinical effect remains unclear, in theory the degree of constraint of the liner plays an important role for the dislocation and notching risk of a rTSA system. Cite this article: Bone Jt Open 2024;5(10):818–824
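
The abstract above derives a stability ratio and an angle of coverage from jump height and radius of curvature but does not give the formulas. The sketch below is an assumption-laden illustration, not the authors' method: it treats the liner as a frictionless spherical cap whose depth equals the jump height, so that the half-angle of coverage is arccos((R − h) / R) and the stability ratio is the tangent of that half-angle. This relation appears consistent with the reported range endpoints for standard liners (103° with 126%, 130° with 214%). Function names and the sample dimensions are hypothetical.

```python
import math


def coverage_angle_deg(radius_mm: float, jump_height_mm: float) -> float:
    """Angle of coverage (degrees) of a spherical-cap liner whose depth
    equals the jump height (assumed geometry)."""
    half_angle = math.acos((radius_mm - jump_height_mm) / radius_mm)
    return math.degrees(2.0 * half_angle)


def stability_ratio_pct(coverage_deg: float) -> float:
    """Stability ratio (%) for a frictionless ball-in-socket model (assumed):
    tangent of half the angle of coverage."""
    return 100.0 * math.tan(math.radians(coverage_deg / 2.0))


if __name__ == "__main__":
    # Hypothetical liner: 20 mm radius of curvature, 10 mm jump height.
    angle = coverage_angle_deg(20.0, 10.0)
    print(f"coverage {angle:.0f} deg, stability ratio {stability_ratio_pct(angle):.0f}%")
    # Endpoints reported for standard liners in the abstract.
    for reported_angle in (103.0, 130.0):
        print(reported_angle, round(stability_ratio_pct(reported_angle)))
```

Because the tangent grows steeply near 90°, small changes in jump height at high coverage angles produce large changes in stability ratio, which may help explain the wide range reported for constrained liners.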


Bone & Joint Open
Vol. 3, Issue 5 | Pages 423 - 431
1 May 2022
Leong JWY Singhal R Whitehouse MR Howell JR Hamer A Khanduja V Board TN

Aims. The aim of this modified Delphi process was to create a structured Revision Hip Complexity Classification (RHCC) which can be used as a tool to help direct multidisciplinary team (MDT) discussions of complex cases in local or regional revision networks. Methods. The RHCC was developed with the help of a steering group and an invitation through the British Hip Society (BHS) to members to apply, forming an expert panel of 35. We ran a mixed-method modified Delphi process (three rounds of questionnaires and one virtual meeting). Round 1 consisted of identifying the factors that govern the decision-making and complexities, with weighting given to factors considered most important by experts. Participants were asked to identify classification systems where relevant. Rounds 2 and 3 focused on grouping each factor into H1, H2, or H3, creating a hierarchy of complexity. This was followed by a virtual meeting in an attempt to achieve consensus on the factors which had not achieved consensus in preceding rounds. Results. The expert group achieved strong consensus in 32 out of 36 factors following the Delphi process. The RHCC used the existing Paprosky (acetabulum and femur), Unified Classification System, and American Society of Anesthesiologists (ASA) classification systems. Patients with ASA grade III/IV are recognized with a qualifier of an asterisk added to the final classification. The classification has good intraobserver and interobserver reliability with Kappa values of 0.88 to 0.92 and 0.77 to 0.85, respectively. Conclusion. The RHCC has been developed through a modified Delphi technique. RHCC will provide a framework to allow discussion of complex cases as part of a local or regional hip revision MDT. We believe that adoption of the RHCC will provide a comprehensive and reproducible method to describe each patient’s case with regard to surgical complexity, in addition to medical comorbidities that may influence their management. 
Cite this article: Bone Jt Open 2022;3(5):423–431


Bone & Joint Open
Vol. 3, Issue 2 | Pages 114 - 122
1 Feb 2022
Green GL Arnander M Pearse E Tennent D

Aims. Recurrent dislocation is both a cause and consequence of glenoid bone loss, and the extent of the bony defect is an indicator guiding operative intervention. Literature suggests that loss greater than 25% requires glenoid reconstruction. Measuring bone loss is controversial; studies use different methods to determine this, with no clear evidence of reproducibility. A systematic review was performed to identify existing CT-based methods of quantifying glenoid bone loss and establish their reliability and reproducibility. Methods. A Preferred Reporting Items for Systematic reviews and Meta-Analyses-compliant systematic review of conventional and grey literature was performed. Results. A total of 25 studies were initially eligible. Following screening, nine papers were included for review. Main themes identified compared 2D and 3D imaging, as well as linear- compared with area-based techniques. Heterogeneous data were acquired, and therefore no meta-analysis was performed. Conclusion. No ideal CT-based method is demonstrated in the current literature; however, evidence suggests that surface area methods are more reproducible and lead to fewer over-estimations of bone loss, provided the views used are standardized. A prospective imaging trial is required to provide a more definitive answer to this research question. Cite this article: Bone Jt Open 2022;3(2):114–122


Bone & Joint Open
Vol. 3, Issue 6 | Pages 475 - 484
13 Jun 2022
Jang SJ Vigdorchik JM Windsor EW Schwarzkopf R Mayman DJ Sculco PK

Aims. Navigation devices are designed to improve a surgeon’s accuracy in positioning the acetabular and femoral components in total hip arthroplasty (THA). The purpose of this study was to both evaluate the accuracy of an optical computer-assisted surgery (CAS) navigation system and determine whether preoperative spinopelvic mobility (categorized as hypermobile, normal, or stiff) increased the risk of acetabular component placement error. Methods. A total of 356 patients undergoing primary THA were prospectively enrolled from November 2016 to March 2018. Clinically relevant error using the CAS system was defined as a difference of > 5° between CAS and 3D radiological reconstruction measurements for acetabular component inclination and anteversion. Univariate and multiple logistic regression analyses were conducted to determine whether hypermobile (Δ sacral slope (SS) stand-sit > 30°) or stiff (ΔSS stand-sit < 10°) spinopelvic mobility contributed to increased error rates. Results. The paired absolute difference between CAS and postoperative imaging measurements was 2.3° (standard deviation (SD) 2.6°) for inclination and 3.1° (SD 4.2°) for anteversion. Using a target zone of 40° (± 10°) (inclination) and 20° (± 10°) (anteversion), postoperative standing radiographs measured 96% of acetabular components within the target zone for both inclination and anteversion. Multiple logistic regression analysis controlling for BMI and sex revealed that hypermobile spinopelvic mobility significantly increased error rates for anteversion (odds ratio (OR) 2.48, p = 0.009) and inclination (OR 2.44, p = 0.016), whereas stiff spinopelvic mobility increased error rates for anteversion (OR 1.97, p = 0.028). There were no dislocations at a minimum three-year follow-up. Conclusion. 
Despite high reliability in acetabular positioning for inclination in a large patient cohort using an optical CAS system, hypermobile and stiff spinopelvic mobility significantly increased the risk of clinically relevant errors. In patients with abnormal spinopelvic mobility, CAS systems should be adjusted for use to avoid acetabular component misalignment and subsequent risk for long-term dislocation. Cite this article: Bone Jt Open 2022;3(6):475–484
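
The thresholds stated in this abstract can be written down as a short Python sketch. It is illustrative only and is not the authors' analysis code: the function names are hypothetical, and treating a stand-to-sit change of 10° to 30° inclusive as "normal" is an assumption inferred from the two cut-offs given above.

```python
def classify_spinopelvic_mobility(delta_ss_deg: float) -> str:
    """Categorize mobility from the stand-to-sit change in sacral slope (degrees).

    Cut-offs (> 30 hypermobile, < 10 stiff) come from the abstract; the
    inclusive 10-30 "normal" band is an assumption.
    """
    if delta_ss_deg > 30:
        return "hypermobile"
    if delta_ss_deg < 10:
        return "stiff"
    return "normal"


def is_clinically_relevant_error(cas_deg: float, radiological_deg: float) -> bool:
    """True when the CAS and 3D radiological measurements differ by more than 5 degrees."""
    return abs(cas_deg - radiological_deg) > 5
```

For example, a hypothetical patient with a 35° change in sacral slope would be labelled hypermobile, and a cup measured at 42° by CAS and 48° on 3D reconstruction would count as a clinically relevant error.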


Bone & Joint Open
Vol. 5, Issue 11 | Pages 1037 - 1040
15 Nov 2024
Wu DY Lam EKF

Aims. The first metatarsal pronation deformity of hallux valgus feet is widely recognized. However, its assessment relies mostly on 3D standing CT scans. Two radiological signs, the first metatarsal round head (RH) and inferior tuberosity position (ITP), have been described, but are seldom used to aid in diagnosis. This study was undertaken to determine the reliability and validity of these two signs for a more convenient and affordable preoperative assessment and postoperative comparison. Methods. A total of 200 feet were randomly selected from the radiograph archives of a foot and ankle clinic. An anteroposterior view of both feet was taken while standing on the same x-ray platform. The intermetatarsal angle (IMA), metatarsophalangeal angle (MPA), medial sesamoid position, RH, and ITP signs were assessed for statistical analysis. Results. There were 127 feet with an IMA > 9°. Both RH and ITP severities correlated significantly with IMA severity. RH and ITP were also significantly associated with each other, and the pronation deformities of these feet are probably related to extrinsic factors. There were also feet with discrepancies between their RH and ITP severities, possibly due to intrinsic torsion of the first metatarsal. Conclusion. Both RH and ITP are reliable first metatarsal pronation signs correlating to the metatarsus primus varus deformity of hallux valgus feet. They should be used more for preoperative and postoperative assessment. Cite this article: Bone Jt Open 2024;5(11):1037–1040


Bone & Joint Research
Vol. 9, Issue 9 | Pages 623 - 632
5 Sep 2020
Jayadev C Hulley P Swales C Snelling S Collins G Taylor P Price A

Aims. The lack of disease-modifying treatments for osteoarthritis (OA) is linked to a shortage of suitable biomarkers. This study combines multi-molecule synovial fluid analysis with machine learning to produce an accurate diagnostic biomarker model for end-stage knee OA (esOA). Methods. Synovial fluid (SF) from patients with esOA, non-OA knee injury, and inflammatory knee arthritis were analyzed for 35 potential markers using immunoassays. Partial least square discriminant analysis (PLS-DA) was used to derive a biomarker model for cohort classification. The ability of the biomarker model to diagnose esOA was validated by identical wide-spectrum SF analysis of a test cohort of ten patients with esOA. Results. PLS-DA produced a streamlined biomarker model with excellent sensitivity (95%), specificity (98.4%), and reliability (97.4%). The eight-biomarker model produced a fingerprint for esOA comprising type IIA procollagen N-terminal propeptide (PIIANP), tissue inhibitor of metalloproteinase (TIMP)-1, a disintegrin and metalloproteinase with thrombospondin motifs 4 (ADAMTS-4), monocyte chemoattractant protein (MCP)-1, interferon-γ-inducible protein-10 (IP-10), and transforming growth factor (TGF)-β3. Receiver operating characteristic (ROC) analysis demonstrated excellent discriminatory accuracy: area under the curve (AUC) being 0.970 for esOA, 0.957 for knee injury, and 1 for inflammatory arthritis. All ten validation test patients were classified correctly as esOA (accuracy 100%; reliability 100%) by the biomarker model. Conclusion. SF analysis coupled with machine learning produced a partially validated biomarker model with cohort-specific fingerprints that accurately and reliably discriminated esOA from knee injury and inflammatory arthritis with almost 100% efficacy. The presented findings and approach represent a new biomarker concept and potential diagnostic tool to stage disease in therapy trials and monitor the efficacy of such interventions. 
Cite this article: Bone Joint Res 2020;9(9):623–632


Bone & Joint Open
Vol. 2, Issue 12 | Pages 1075 - 1081
17 Dec 2021
Suthar A Yukata K Azuma Y Suetomi Y Yamazaki K Seki K Sakai T Fujii H

Aims. This study aimed to investigate the relationship between changes in patellar height and clinical outcomes at a mean follow-up of 7.7 years (5 to 10) after fixed-bearing posterior-stabilized total knee arthroplasty (PS-TKA). Methods. We retrospectively evaluated knee radiographs of 165 knees, which underwent fixed-bearing PS-TKA with patella resurfacing. The incidence of patella baja and changes in patellar height over a minimum of five years of follow-up were determined using Insall-Salvati ratio (ISR) measurement. We examined whether patella baja (ISR < 0.8) at final follow-up affected clinical outcomes, knee joint range of motion (ROM), and Knee Society Score (KSS). We also assessed inter- and intrarater reliability of ISR measurements and focused on the relationship between patellar height reduction beyond measurement error and clinical outcomes. Results. The ISR gradually decreased over five years after TKA, and finally 33 patients (20.0%) had patella baja. Patella baja at the final follow-up was not related to passive knee ROM or KSS. Interestingly, when we divided into two groups - patella baja and patella normal-alta (ISR ≥ 0.8) - the patella baja group already had a lower patellar height before surgery, compared with the patella normal-alta group. The ISR measurement error in this study was 0.17. Both passive knee flexion and KSS were significantly decreased in the group with a decrease in ISR of ≥ 0.17 at final follow-up. Conclusion. Patellar height gradually decreased over five years of follow-up after TKA. The reduction in patellar height beyond measurement error following TKA was associated with lower clinical outcomes. Cite this article: Bone Jt Open 2021;2(12):1075–1081


Bone & Joint Research
Vol. 7, Issue 5 | Pages 351 - 356
1 May 2018
Yeoman TFM Clement ND Macdonald D Moran M

Objectives. The primary aim of this study was to assess the reproducibility of the recalled preoperative Oxford Hip Score (OHS) and Oxford Knee Score (OKS) one year following arthroplasty for a cohort of patients. The secondary aim was to assess the reliability of a patient’s recollection of their own preoperative OHS and OKS one year following surgery. Methods. A total of 335 patients (mean age 72.5; 22 to 92; 53.7% female) undergoing total hip arthroplasty (n = 178) and total knee arthroplasty (n = 157) were prospectively assessed. Patients undergoing hip and knee arthroplasty completed an OHS or OKS, respectively, preoperatively and were asked to recall their preoperative condition while completing the same score one year after surgery. Results. A mean difference of 0.04 points (95% confidence intervals (CI) -15.64 to 15.72, p = 0.97) between the actual and the recalled OHS was observed. The mean difference in the OKS was 1.59 points (95% CI -11.57 to 14.75, p = 0.10). There was excellent reliability for the ‘average measures’ intra-class correlation for both the OHS (r = 0.802) and the OKS (r = 0.772). However, this reliability was diminished for the individual OHS (r = 0.670) and OKS (r = 0.629) using single measures intra-class correlation. Bland–Altman plots demonstrated wide variation in the individual patient’s ability to recall their preoperative score (95% CI ± 16 for OHS, 95% CI ± 13 for OKS). Conclusion. Prospective preoperative collection of OHS and OKS remains the benchmark. Using recalled scores one year following hip and knee arthroplasty is an alternative when used to assess a cohort of patients. However, the recall of an individual patient’s preoperative score should not be relied upon due to the diminished reliability and wide CI. Cite this article: T. F. M. Yeoman, N. D. Clement, D. Macdonald, M. Moran. 
Recall of preoperative Oxford Hip and Knee Scores one year after arthroplasty is an alternative and reliable technique when used for a cohort of patients. Bone Joint Res 2018;7:351–356. DOI: 10.1302/2046-3758.75.BJR-2017-0259.R1
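
The Bland–Altman figures quoted in this abstract can be read as a mean difference (bias) with limits of agreement around it. The sketch below shows the conventional calculation, bias ± 1.96 sample standard deviations of the paired differences. It assumes the difference is taken as actual minus recalled, and the function name is hypothetical; the abstract does not state how the authors computed their intervals.

```python
from statistics import mean, stdev


def bland_altman_limits(actual, recalled):
    """Return (bias, lower limit, upper limit) for paired scores.

    Bias is the mean of (actual - recalled); the limits are
    bias -/+ 1.96 * sample standard deviation of the differences.
    """
    if len(actual) != len(recalled) or len(actual) < 2:
        raise ValueError("need two equal-length sequences with at least 2 pairs")
    diffs = [a - r for a, r in zip(actual, recalled)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread
```

A bias near zero with wide limits is the pattern the abstract describes: recall is acceptable on average across a cohort but unreliable for any single patient.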


Bone & Joint Open
Vol. 3, Issue 4 | Pages 307 - 313
7 Apr 2022
Singh V Bieganowski T Huang S Karia R Davidovitch RI Schwarzkopf R

Aims. The Forgotten Joint Score-12 (FJS-12) is a validated patient-reported outcome measure (PROM) tool designed to assess artificial prosthesis awareness during daily activities following total hip arthroplasty (THA). The patient-acceptable symptom state (PASS) is the minimum cut-off value that corresponds to a patient’s satisfactory state-of-health. Despite the validity and reliability of the FJS-12 having been previously demonstrated, the PASS has yet to be clearly defined. This study aims to define the PASS of the FJS-12 following primary THA. Methods. We retrospectively reviewed all patients who underwent primary elective THA from 2019 to 2020, and answered both the FJS-12 and the Hip Disability and Osteoarthritis Outcome Score, Joint Replacement (HOOS, JR) questionnaires one-year postoperatively. HOOS, JR score was used as the anchor to estimate the PASS of FJS-12. Two statistical methods were employed: the receiver operating characteristic (ROC) curve point, which maximized the Youden index; and 75th percentile of the cumulative percentage curve of patients who had the HOOS, JR score difference larger than the cut-off value. Results. This study included 780 patients. The mean one-year FJS-12 score was 65.42 (SD 28.59). The mean one-year HOOS, JR score was 82.70 (SD 16.57). A high positive correlation between FJS-12 and HOOS, JR was found (r = 0.74; p<0.001), making the HOOS, JR a valid external anchor. The threshold score of the FJS-12 that maximized the sensitivity and specificity for detecting a PASS was 66.68 (area under the curve = 0.8). The cut-off score value computed with the 75th percentile approach was 92.20. Conclusion. The PASS threshold for the FJS-12 at one year following primary THA was 66.68 and 92.20 using the ROC curve and 75th percentile approaches, respectively. These values can be used to achieve consensus about meaningful postoperative improvement to maximize the utility of the FJS-12 to evaluate and counsel patients undergoing THA. 
Cite this article: Bone Jt Open 2022;3(4):307–313
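
The ROC-based approach described in this abstract picks the cut-off that maximizes the Youden index, J = sensitivity + specificity − 1. A minimal Python sketch of that selection follows. It assumes each patient has already been labelled as satisfied or not from the anchor questionnaire, and that a score at or above the threshold predicts satisfaction; the function name is hypothetical and this is not the authors' code.

```python
def youden_threshold(scores, satisfied):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1.

    scores: outcome scores (higher is better).
    satisfied: matching 0/1 labels from an external anchor.
    A patient is predicted satisfied when score >= threshold.
    Ties are resolved in favour of the lowest threshold.
    """
    positives = sum(1 for y in satisfied if y)
    negatives = len(satisfied) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both satisfied and unsatisfied patients are required")
    best = None
    for t in sorted(set(scores)):
        true_pos = sum(1 for s, y in zip(scores, satisfied) if s >= t and y)
        true_neg = sum(1 for s, y in zip(scores, satisfied) if s < t and not y)
        j = true_pos / positives + true_neg / negatives - 1
        if best is None or j > best[1]:
            best = (t, j)
    return best
```

With real data, a library routine such as an ROC function from a statistics package would usually be preferred; the loop here only makes the definition explicit.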


Bone & Joint Research
Vol. 10, Issue 11 | Pages 714 - 722
1 Nov 2021
Qi W Feng X Zhang T Wu H Fang C Leung F

Aims. To fully verify the reliability and reproducibility of an experimental method in generating standardized micromotion for the rat femur fracture model. Methods. A modularized experimental device has been developed that allows rat models to be used instead of large animal models, with the aim of reducing systematic errors and time and money constraints on grouping. The bench test was used to determine the difference between the measured and set values of the micromotion produced by this device under different simulated loading weights. The displacement of the fixator under different loading conditions was measured by compression tests, which was used to simulate the unexpected micromotion caused by the rat’s ambulation. In vivo preliminary experiments with a small sample size were used to test the feasibility and effectiveness of the whole experimental scheme and surgical scheme. Results. The bench test showed that a weight loading < 500 g did not affect the operation of experimental device. The compression test demonstrated that the stiffness of the device was sufficient to keep the uncontrollable motion between fracture ends, resulting from the rat’s daily activities, within 1% strain. In vivo results on 15 rats prove that the device works reliably, without overburdening the experimental animals, and provides standardized micromotion reproductively at the fracture site according to the set parameters. Conclusion. Our device was able to investigate the effect of micromotion parameters on fracture healing by generating standardized micromotion to small animal models. Cite this article: Bone Joint Res 2021;10(11):714–722


Bone & Joint Open
Vol. 2, Issue 9 | Pages 705 - 709
1 Sep 2021
Wright J Timms A Fugazzotto S Goodier D Calder P

Aims. Patients undergoing limb reconstruction surgery often face a challenging and lengthy process to complete their treatment journey. The majority of existing outcome measures do not adequately capture the patient-reported outcomes relevant to this patient group in a single measure. Following a previous systematic review, the Stanmore Limb Reconstruction Score (SLRS) was designed with the intent to address this need for an effective instrument to measure patient-reported outcomes in limb reconstruction patients. We aim to assess the face validity of this score in a pilot study. Methods. The SLRS was designed following structured interviews with several groups including patients who have undergone limb reconstruction surgery, limb reconstruction surgeons, specialist nurses, and physiotherapists. This has subsequently undergone further adjustment for language and clarity. The score was then trialled on ten patients who had undergone limb reconstruction surgery, with subsequent structured questioning to understand the perceived suitability of the score. Results. Ten patients completed the score and the subsequent structured interview. Considering the tool as a whole, 100% of respondents felt the score to be comprehensible, relevant, and comprehensive regarding the areas that were important to a patient undergoing limb reconstruction surgery. For individual questions, on a five-point Likert scale, importance/relevance was reported as a mean of 4.78 (4.3 to 5.0), with ability to understand rated as 4.92 (4.7 to 5.0) suggesting high levels of relevance and comprehension. Flesch-Kincaid reading grade level was calculated as 5.2 (10 to 11 years old). Conclusion. The current SLRS has been shown to have acceptable scores from a patient sample regarding relevance, comprehensibility, and comprehensiveness. 
This suggests face validity; however, further testing is required and is ongoing in a larger cohort of patients to determine the reliability, responsiveness, precision, and criterion validity of the score in this patient group. Cite this article: Bone Jt Open 2021;2(9):705–709


Bone & Joint Research
Vol. 6, Issue 9 | Pages 530 - 534
1 Sep 2017
Krakow L Klockow A Roehner E Brodt S Eijer H Bossert J Matziolis G

Objectives. The determination of the volumetric polyethylene wear on explanted material requires complicated equipment, which is not available in many research institutions. Our aim in this study was to present and validate a method that only requires a set of polyetheretherketone balls and a laboratory balance to determine wear. Methods. The insert to be measured was placed on a balance, and a ball of the appropriate diameter was inserted. The cavity remaining between the ball and insert caused by wear was filled with contrast medium and the weight of the contrast medium was recorded. The volume was calculated from the known density of the liquid. The precision, inter- and intraobserver reliability, were determined by four investigators on four days using nine inserts with specified wear (0.094 ml to 1.626 ml), and the intra-class correlation coefficient was calculated. The feasibility of using this method in routine clinical practice and the time required for measurement were tested on 84 explanted inserts by one investigator. Results. In order to get the mean for all investigators and determinations, the deviation between the measured and specified wear was -0.08 ml (SD 0.12; -0.21 to 0.11). The interobserver reliability was 0.989 ml (95% confidence interval (CI) 0.964 to 0.997) and the intraobserver reliability was 0.941 for observer 1 (95% CI 0.846 to 0.985), 0.983 for observer 2 (95% CI 0.956 to 0.995), 0.939 for observer 3 (95% CI 0.855 to 0.984), and 0.934 for observer 4 (95% CI 0.790 to 0.984). The mean time required to examine the samples was two minutes (SD 2; 1 to 5). Conclusion. The method presented here was shown to be sufficiently precise for many settings and is a cost-effective and quick method of determining the volumetric wear of explanted acetabular components. However, the measurement of wear for scientific purposes will probably continue to involve more accurate and dedicated laboratory equipment. 
Cite this article: Bone Joint Res 2017;6:530–534
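
The volume step in this abstract is a single division: the recorded mass of contrast medium divided by its known density. A minimal Python sketch follows; the function name is hypothetical, and the density must be supplied by the user because the abstract does not report a value.

```python
def wear_volume_ml(contrast_mass_g: float, density_g_per_ml: float) -> float:
    """Volume of the wear cavity in ml, from the mass of contrast medium that filled it.

    volume = mass / density. The density is a property of the specific
    contrast medium used and is not given in the abstract.
    """
    if density_g_per_ml <= 0:
        raise ValueError("density must be positive")
    if contrast_mass_g < 0:
        raise ValueError("mass cannot be negative")
    return contrast_mass_g / density_g_per_ml
```

As a purely illustrative calculation, 2.7 g of a liquid with an assumed density of 1.35 g/ml would correspond to a cavity of 2.0 ml.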


Bone & Joint Research
Vol. 5, Issue 4 | Pages 153 - 161
1 Apr 2016
Kleinlugtenbelt YV Nienhuis RW Bhandari M Goslings JC Poolman RW Scholtes VAB

Objectives. Patient-reported outcome measures (PROMs) are often used to evaluate the outcome of treatment in patients with distal radial fractures. Which PROM to select is often based on assessment of measurement properties, such as validity and reliability. Measurement properties are assessed in clinimetric studies, and results are often reviewed without considering the methodological quality of these studies. Our aim was to systematically review the methodological quality of clinimetric studies that evaluated measurement properties of PROMs used in patients with distal radial fractures, and to make recommendations for the selection of PROMs based on the level of evidence of each individual measurement property. Methods. A systematic literature search was performed in PubMed, EMbase, CINAHL and PsycINFO databases to identify relevant clinimetric studies. Two reviewers independently assessed the methodological quality of the studies on measurement properties, using the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) checklist. Level of evidence (strong / moderate / limited / lacking) for each measurement property per PROM was determined by combining the methodological quality and the results of the different clinimetric studies. Results. In all, 19 out of 1508 identified unique studies were included, in which 12 PROMs were rated. The Patient-rated wrist evaluation (PRWE) and the Disabilities of Arm, Shoulder and Hand questionnaire (DASH) were evaluated on most measurement properties. The evidence for the PRWE is moderate that its reliability, validity (content and hypothesis testing), and responsiveness are good. The evidence is limited that its internal consistency and cross-cultural validity are good, and its measurement error is acceptable. There is no evidence for its structural and criterion validity. The evidence for the DASH is moderate that its responsiveness is good. 
The evidence is limited that its reliability and the validity on hypothesis testing are good. There is no evidence for the other measurement properties. Conclusion. According to this systematic review, there is, at best, moderate evidence that the responsiveness of the PRWE and DASH are good, as are the reliability and validity of the PRWE. We recommend these PROMs in clinical studies in patients with distal radial fractures; however, more clinimetric studies of higher methodological quality are needed to adequately determine the other measurement properties. Cite this article: Dr Y. V. Kleinlugtenbelt. Are validated outcome measures used in distal radial fractures truly valid?: A critical assessment using the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) checklist. Bone Joint Res 2016;5:153–161. DOI: 10.1302/2046-3758.54.2000462


Bone & Joint Research
Vol. 7, Issue 7 | Pages 468 - 475
1 Jul 2018
He Q Sun H Shu L Zhu Y Xie X Zhan Y Luo C

Objectives. Researchers continue to seek easier ways to evaluate the quality of bone and screen for osteoporosis and osteopenia. Until recently, radiographic images of various parts of the body, except the distal femur, have been reappraised in the light of dual-energy X-ray absorptiometry (DXA) findings. The incidence of osteoporotic fractures around the knee joint in the elderly continues to increase. The aim of this study was to propose two new radiographic parameters of the distal femur for the assessment of bone quality. Methods. Anteroposterior radiographs of the knee and bone mineral density (BMD) and T-scores from DXA scans of 361 healthy patients were prospectively analyzed. The mean cortical bone thickness (CBTavg) and the distal femoral cortex index (DFCI) were the two parameters that were proposed and measured. Intra- and interobserver reliabilities were assessed. Correlations between the BMD and T-score and these parameters were investigated and their value in the diagnosis of osteoporosis and osteopenia was evaluated. Results. The DFCI, as a ratio, had higher reliability than the CBTavg. Both showed significant correlation with BMD and T-score. When compared with DFCI, CBTavg showed better correlation and was better for predicting osteoporosis and osteopenia. Conclusion. The CBTavg and DFCI are simple and reliable screening tools for the prediction of osteoporosis and osteopenia. The CBTavg is more accurate but the DFCI is easier to use in clinical practice. Cite this article: Q-F. He, H. Sun, L-Y. Shu, Y. Zhu, X-T. Xie, Y. Zhan, C-F. Luo. Radiographic predictors for bone mineral loss: Cortical thickness and index of the distal femur. Bone Joint Res 2018;7:468–475. DOI: 10.1302/2046-3758.77.BJR-2017-0332.R1


Bone & Joint Research
Vol. 5, Issue 4 | Pages 116 - 121
1 Apr 2016
Leow JM Clement ND Tawonsawatruk T Simpson CJ Simpson AHRW

Objectives. The radiographic union score for tibial (RUST) fractures was developed by Whelan et al to assess the healing of tibial fractures following intramedullary nailing. In the current study, the repeatability and reliability of the RUST score was evaluated in an independent centre (a) using the original description, (b) after further interpretation of the description of the score, and (c) with the immediate post-operative radiograph available for comparison. Methods. A total of 15 radiographs of tibial shaft fractures treated by intramedullary nailing (IM) were scored by three observers using the RUST system. Following discussion on how the criteria of the RUST system should be implemented, 45 sets (i.e. AP and lateral) of radiographs of IM nailed tibial fractures were scored by five observers. Finally, these 45 sets of radiographs were rescored with the baseline post-operative radiograph available for comparison. Results. The initial intraclass correlation (ICC) on the first 15 sets of radiographs was 0.67 (95% CI 0.63 to 0.71). However, the original description was being interpreted in different ways. After agreeing on the interpretation, the ICC on the second cohort improved to 0.75. The ICC improved even further to 0.79, when the baseline post-operative radiographs were available for comparison. Conclusion. This study demonstrates that the RUST scoring system is a reliable and repeatable outcome measure for assessing tibial fracture healing. Further improvement in the reliability of the scoring system can be obtained if the radiographs are compared with the baseline post-operative radiographs. Cite this article: Mr J.M. Leow. The radiographic union scale in tibial (RUST) fractures: Reliability of the outcome measure at an independent centre. Bone Joint Res 2016;5:116–121. DOI: 10.1302/2046-3758.54.2000628


Bone & Joint Open
Vol. 1, Issue 7 | Pages 364 - 369
10 Jul 2020
Aarvold A Lohre R Chhina H Mulpuri K Cooper A

Aims. Though the pathogenesis of Legg-Calve-Perthes disease (LCPD) is unknown, repetitive microtrauma resulting in deformity has been postulated. The purpose of this study is to trial a novel upright MRI scanner, to determine whether any deformation occurs in femoral heads affected by LCPD with weightbearing. Methods. Children affected by LCPD were recruited for analysis. Children received both standing weightbearing and supine scans in the MROpen upright MRI scanner, for coronal T1 GFE sequences, both hips in field of view. Parameters of femoral head height, width, and lateral extrusion of affected and unaffected hips were assessed by two independent raters, repeated at a one month interval. Inter- and intraclass correlation coefficients were determined. Standing and supine measurements were compared for each femoral head. Results. Following rigorous protocol development in healthy age-matched volunteers, successful scanning was performed in 11 LCPD-affected hips in nine children, with seven unaffected hips therefore available for comparison. Five hips were in early stage (1 and 2) and six were in late stage (3 and 4). The mean age was 5.3 years. All hips in early-stage LCPD demonstrated dynamic deformity on weightbearing. Femoral head height decreased (mean 1.2 mm, 12.4% decrease), width increased (mean 2.5 mm, 7.2% increase), and lateral extrusion increased (median 2.5 mm, 23% increase) on standing weightbearing MRI compared to supine scans. Negligible deformation was observed in contra-lateral unaffected hips, with less deformation observed in late-stage hips. Inter- and intraclass reliability for all measured parameters was good to excellent. Conclusion. This pilot study has described an effective novel research investigation for children with LCPD. Femoral heads in early-stage LCPD demonstrated dynamic deformity on weightbearing not previously seen, while unaffected hips did not. 
Expansion of this protocol will allow further translational study into the effects of loading hips with LCPD. Cite this article: Bone Joint Open 2020;1-7:364–369


Bone & Joint Research
Vol. 8, Issue 10 | Pages 502 - 508
1 Oct 2019
Mao W Ni H Li L He Y Chen X Tang H Dong Y

Objectives. Different criteria for assessing the reduction quality of trochanteric fractures have been reported. The Baumgaertner reduction quality criteria (BRQC) are relatively common and the Chang reduction quality criteria (CRQC) are relatively new. The objectives of the current study were to compare the reliability of the BRQC and CRQC in predicting mechanical complications and to investigate the clinical implications of the CRQC. Methods. A total of 168 patients were assessed in a retrospective observational study. Clinical information including age, sex, fracture side, American Society of Anesthesiologists (ASA) classification, tip-apex distance (TAD), fracture classification, reduction quality, blade position, BRQC, CRQC, bone quality, and the occurrence of mechanical complications were used in the statistical analysis. Results. A total of 127 patients were included in the full analysis, and mechanical complications were observed in 26 patients. The TAD, blade position, BRQC and CRQC were significantly associated with mechanical complications in the univariate analysis. Only the TAD (p = 0.025) and the CRQC (p < 0.001) showed significant results in the multivariate analysis. In the comparison of the receiver operating characteristic curves, the CRQC also performed better than the BRQC. Conclusion. The CRQC are reliable in predicting mechanical complications and are more reliable than the BRQC. Future studies could use the CRQC to assess fracture reduction quality. Intraoperatively, the surgeon should refer to the CRQC to achieve good reduction in trochanteric fractures. Cite this article: Bone Joint Res 2019;8:502–508


Bone & Joint Research
Vol. 8, Issue 3 | Pages 146 - 155
1 Mar 2019
Langton DJ Natu S Harrington CF Bowsher JG Nargol AVF

Objectives. We investigated the reliability of the cobalt-chromium (CoCr) synovial joint fluid ratio (JFR) in identifying the presence of a severe aseptic lymphocyte-dominated vasculitis-associated lesion (ALVAL) response and/or suboptimal taper performance (SOTP) following metal-on-metal (MoM) hip arthroplasty. We then examined the possibility that the CoCr JFR may influence the serum partitioning of Co and Cr. Methods. For part A, we included all revision surgeries carried out at our unit with the relevant data, including volumetric wear analysis, joint fluid (JF) Co and Cr concentrations, and ALVAL grade (n = 315). Receiver operating characteristic curves were constructed to assess the reliability of the CoCr JFR in identifying severe ALVAL and/or SOTP. For part B, we included only patients with unilateral prostheses who had given matched serum and whole blood samples for Co and Cr analysis (n = 155). Multiple regression was used to examine the influence of JF concentrations on the serum partitioning of Co and Cr in the blood. Results. A CoCr JFR > 1 showed a specificity of 83% (77% to 88%) and sensitivity of 63% (55% to 70%) for the detection of severe ALVAL and/or SOTP. In patients with CoCr JFRs > 1, the median blood Cr to serum Cr ratio was 0.99, compared with 0.71 in patients with CoCr JFRs < 1 (p < 0.001). Regression analysis demonstrated that the blood Cr to serum Cr value was positively associated with the JF Co concentration (p = 0.011) and inversely related to the JF Cr concentration (p < 0.001). Conclusion. Elevations in CoCr JFRs are associated with adverse biological (severe ALVAL) or tribocorrosive processes (SOTP). Comparison of serum Cr with blood Cr concentrations may be a useful additional clinical tool to help to identify these conditions. Cite this article: D. J. Langton, S. Natu, C. F. Harrington, J. G. Bowsher, A. V. F. Nargol. 
Is the synovial fluid cobalt-to-chromium ratio related to the serum partitioning of metal debris following metal-on-metal hip arthroplasty? Bone Joint Res 2019;8:146–155. DOI: 10.1302/2046-3758.83.BJR-2018-0049.R1


Bone & Joint Open
Vol. 5, Issue 5 | Pages 394 - 400
15 May 2024
Nishi M Atsumi T Yoshikawa Y Okano I Nakanishi R Watanabe M Usui Y Kudo Y

Aims

The localization of necrotic areas has been reported to impact the prognosis and treatment strategy for osteonecrosis of the femoral head (ONFH). Anteroposterior localization of the necrotic area after a femoral neck fracture (FNF) has not been properly investigated. We hypothesize that the change of the weight loading direction on the femoral head due to residual posterior tilt caused by malunited FNF may affect the location of ONFH. We investigate the relationship between the posterior tilt angle (PTA) and anteroposterior localization of osteonecrosis using lateral hip radiographs.

Methods

Patients aged younger than 55 years diagnosed with ONFH after FNF were retrospectively reviewed. Overall, 65 hips (38 males and 27 females; mean age 32.6 years (SD 12.2)) met the inclusion criteria. Patients with stage 1 or 4 ONFH, as per the Association Research Circulation Osseous classification, were excluded. The ratios of anterior and posterior viable areas and necrotic areas of the femoral head to the articular surface were calculated by setting the femoral head centre as the reference point. The PTA was measured using Palm’s method. The association between the PTA and viable or necrotic areas of the femoral head was assessed using Spearman’s rank correlation analysis (median PTA 6.0° (interquartile range 3 to 11.5)).
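
For readers unfamiliar with Spearman's rank correlation, the following is a minimal Python sketch of the general technique: both variables are converted to ranks (ties receive the average rank) and the Pearson correlation of the ranks is taken. The function names are hypothetical, and this is not the authors' analysis code.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Because only ranks are used, the coefficient captures any monotonic relationship (for example, between a tilt angle and an area ratio) and is not dominated by a few extreme measurements.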


Bone & Joint Research
Vol. 12, Issue 3 | Pages 155 - 164
1 Mar 2023
McCarty CP Nazif MA Sangiorgio SN Ebramzadeh E Park S

Aims

Taper corrosion has been widely reported to be problematic for modular total hip arthroplasty implants. A simple and systematic method to evaluate taper damage with sufficient resolution is needed. We introduce a semiquantitative grading system for modular femoral tapers to characterize taper corrosion damage.

Methods

After examining a unique collection of retrieved cobalt-chromium (CoCr) taper sleeves (n = 465) using the widely used Goldberg system, we developed an expanded six-point visual grading system intended to characterize the severity, visible material loss, and absence of direct component contact due to corrosion. Female taper sleeve damage was evaluated by three blinded observers using the Goldberg scoring system and the expanded system. A subset (n = 85) was then re-evaluated following destructive cleaning, using both scoring systems. Material loss for this subset was quantified using metrology and correlated with both scoring systems.


Bone & Joint Open
Vol. 5, Issue 10 | Pages 904 - 910
18 Oct 2024
Bergman EM Mulligan EP Patel RM Wells J

Aims

The Single Assessment Numerical Evaluation (SANE) score is a pragmatic alternative to longer patient-reported outcome measures (PROMs). The purpose of this study was to investigate the concurrent validity of the SANE and hip-specific PROMs in a generalized population of patients with hip pain at a single timepoint upon initial visit with an orthopaedic surgeon who is a hip preservation specialist. We hypothesized that SANE would have a strong correlation with the 12-question International Hip Outcome Tool (iHOT-12), the Hip Outcome Score (HOS), and the Hip disability and Osteoarthritis Outcome Score (HOOS), providing evidence for concurrent validity of the SANE and hip-specific outcome measures in patients with hip pain.

Methods

This study was a cross-sectional retrospective database analysis at a single timepoint. Data were collected from 2,782 patients at initial evaluation with a hip preservation specialist using the iHOT-12, HOS, HOOS, and SANE. Outcome scores were retrospectively analyzed using Pearson correlation coefficients.