The Bone & Joint Journal
Vol. 105-B, Issue 10 | Pages 1123 - 1130
1 Oct 2023
Donnan M Anderson N Hoq M Donnan L

Aims. The aim of this study was to investigate the agreement in interpretation of the quality of the paediatric hip ultrasound examination, the reliability of geometric and morphological assessment, and the relationship between these measurements. Methods. Four investigators evaluated 60 hip ultrasounds and assessed their quality based on the standard plane of Graf et al. They measured geometric parameters, described the morphology of the hip, and assigned the Graf grade of dysplasia. They analyzed one self-selected image and one randomly selected image from the ultrasound series, and repeated the process four weeks later. The intra- and interobserver agreement, and correlations between various parameters were analyzed. Results. In the assessment of quality, there was moderate to substantial intraobserver agreement for each element investigated, but interobserver agreement was poor. Morphological features showed weak to moderate agreement across all parameters but improved to significant when responses were reduced. The geometric measurements showed nearly perfect agreement, and the relationship between them and the morphological features showed a dose response across all parameters with moderate to substantial correlations. There were strong correlations between geometric measurements. The Graf classification showed a fair to moderate interobserver agreement, and moderate to substantial intraobserver agreement. Conclusion. This investigation into the reliability of the interpretation of hip ultrasound scans identified the difficulties in defining what is a high-quality ultrasound. We confirmed that geometric measurements are reliably interpreted and may be useful as a further measurement of quality. Morphological features are generally poorly interpreted, but a simpler binary classification considerably improves agreement.
As there is a clear dose response relationship between geometric and morphological measurements, the importance of morphology in the diagnosis of hip dysplasia should be questioned. Cite this article: Bone Joint J 2023;105-B(10):1123–1130


Bone & Joint Research
Vol. 4, Issue 12 | Pages 190 - 194
1 Dec 2015
Kleinlugtenbelt YV Hoekstra M Ham SJ Kloen P Haverlag R Simons MP Bhandari M Goslings JC Poolman RW Scholtes VAB

Objectives. Current studies on the additional benefit of using computed tomography (CT) in order to evaluate the surgeons’ agreement on treatment plans for fracture are inconsistent. This inconsistency can be explained by a methodological phenomenon called ‘spectrum bias’, defined as the bias inherent when investigators choose a population lacking therapeutic uncertainty for evaluation. The aim of the study is to determine the influence of spectrum bias on the intra-observer agreement of treatment plans for fractures of the distal radius. Methods. Four surgeons evaluated 51 patients with displaced fractures of the distal radius at four time points: T1 and T2: conventional radiographs; T3 and T4: radiographs and additional CT scan (radiograph and CT). Choice of treatment plan (operative or non-operative) and therapeutic certainty (five-point scale: very uncertain to very certain) were rated. To determine the influence of spectrum bias, the intra-observer agreement was analysed, using Kappa statistics, for each degree of therapeutic certainty. Results. In cases with high therapeutic certainty, intra-observer agreement based on radiograph was almost perfect (0.86 to 0.90), but decreased to moderate based on a radiograph and CT (0.47 to 0.60). In cases with high therapeutic uncertainty, intra-observer agreement was slight at best (-0.12 to 0.19), but increased to moderate based on the radiograph and CT (0.56 to 0.57). Conclusion. Spectrum bias influenced the outcome of this agreement study on treatment plans. An additional CT scan improves the intra-observer agreement on treatment plans for a fracture of the distal radius only when there is therapeutic uncertainty. Reporting and analysing intra-observer agreement based on the surgeon’s level of certainty is an appropriate method to minimise spectrum bias. Cite this article: Bone Joint Res 2015;4:190–194


The Journal of Bone & Joint Surgery British Volume
Vol. 70-B, Issue 2 | Pages 299 - 301
1 Mar 1988
Dias J Taylor M Thompson J Brenkel I Gregg P

Inter-observer agreement and reproducibility of opinion were assessed for the radiographic diagnosis of union of scaphoid fractures on films taken 12 weeks after injury. Weighted kappa statistics were used to compare the opinions of eight senior observers reviewing 20 sets of good quality radiographs on two occasions separated by two months. There was poor agreement on whether trabeculae crossed the fracture line, whether there was sclerosis at or near the fracture and on whether the proximal part of the scaphoid was avascular. As a consequence, agreement on union also was poor; it appears that radiographs taken 12 weeks after a scaphoid fracture do not provide reliable and reproducible evidence of healing.


Bone & Joint 360
Vol. 10, Issue 6 | Pages 8 - 10
1 Dec 2021
Spacey K Wimhurst J Hasan R Sharma D


The Journal of Bone & Joint Surgery British Volume
Vol. 30-B, Issue 1 | Pages 4 - 6
1 Feb 1948


The Bone & Joint Journal
Vol. 105-B, Issue 9 | Pages 1007 - 1012
1 Sep 2023
Hoeritzauer I Paterson M Jamjoom AAB Srikandarajah N Soleiman H Poon MTC Copley PC Graves C MacKay S Duong C Leung AHC Eames N Statham PFX Darwish S Sell PJ Thorpe P Shekhar H Roy H Woodfield J

Aims. Patients with cauda equina syndrome (CES) require emergency imaging and surgical decompression. The severity and type of symptoms may influence the timing of imaging and surgery, and help predict the patient’s prognosis. Categories of CES attempt to group patients for management and prognostication purposes. We aimed in this study to assess the inter-rater reliability of dividing patients with CES into categories to assess whether they can be reliably applied in clinical practice and in research. Methods. A literature review was undertaken to identify published descriptions of categories of CES. A total of 100 real anonymized clinical vignettes of patients diagnosed with CES from the Understanding Cauda Equina Syndrome (UCES) study were reviewed by consultant spinal surgeons, neurosurgical registrars, and medical students. All were provided with published category definitions and asked to decide whether each patient had ‘suspected CES’; ‘early CES’; ‘incomplete CES’; or ‘CES with urinary retention’. Inter-rater agreement was assessed for all categories, for all raters, and for each group of raters using Fleiss’s kappa. Results. Each of the 100 participants was rated by four medical students, five neurosurgical registrars, and four consultant spinal surgeons. No groups achieved reasonable inter-rater agreement for any of the categories. CES with retention versus all other categories had the highest inter-rater agreement (kappa 0.34 (95% confidence interval 0.27 to 0.31); minimal agreement). There was no improvement in inter-rater agreement with clinical experience. Across all categories, registrars agreed with each other most often (kappa 0.41), followed by medical students (kappa 0.39). Consultant spinal surgeons had the lowest inter-rater agreement (kappa 0.17). Conclusion. Inter-rater agreement for categorizing CES is low among clinicians who regularly manage these patients.
CES categories should be used with caution in clinical practice and research studies, as groups may be heterogenous and not comparable. Cite this article: Bone Joint J 2023;105-B(9):1007–1012
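
For readers unfamiliar with the statistic named in the abstract above, the following is a minimal sketch of Fleiss' kappa for a fixed number of raters per subject. It is not the authors' code: the function name `fleiss_kappa` and the count-table input layout are assumptions made for illustration, and the small tables in the usage note are hypothetical, not study data.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table of category counts.

    counts[i][j] is the number of raters who assigned subject i to
    category j. Every subject must be rated by the same number of raters.
    (Hypothetical helper for illustration; not taken from the study.)
    """
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    if n_subjects == 0 or n_raters < 2:
        raise ValueError("need at least one subject and two raters")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every subject needs the same number of ratings")

    # Observed agreement: mean over subjects of the proportion of
    # agreeing rater pairs for that subject.
    pairs = n_raters * (n_raters - 1)
    p_bar = sum(
        (sum(c * c for c in row) - n_raters) / pairs for row in counts
    ) / n_subjects

    # Chance agreement from the overall share of ratings per category.
    n_categories = len(counts[0])
    total = n_subjects * n_raters
    shares = [sum(row[j] for row in counts) / total for j in range(n_categories)]
    p_e = sum(p * p for p in shares)

    if p_e == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (p_bar - p_e) / (1.0 - p_e)
```

With three raters in perfect agreement on two subjects, `fleiss_kappa([[3, 0], [0, 3]])` evaluates to 1.0; with two raters who agree on two subjects and split on two others, `fleiss_kappa([[2, 0], [0, 2], [1, 1], [1, 1]])` evaluates to 0.0, that is, agreement no better than chance.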


Bone & Joint Research
Vol. 13, Issue 1 | Pages 19 - 27
5 Jan 2024
Baertl S Rupp M Kerschbaum M Morgenstern M Baumann F Pfeifer C Worlicek M Popp D Amanatullah DF Alt V

Aims. This study aimed to evaluate the clinical application of the PJI-TNM classification for periprosthetic joint infection (PJI) by determining intraobserver and interobserver reliability. To facilitate its use in clinical practice, an educational app was subsequently developed and evaluated. Methods. A total of ten orthopaedic surgeons classified 20 cases of PJI based on the PJI-TNM classification. Subsequently, the classification was re-evaluated using the PJI-TNM app. Classification accuracy was calculated separately for each subcategory (reinfection, tissue and implant condition, non-human cells, and morbidity of the patient). Fleiss’ kappa and Cohen’s kappa were calculated for interobserver and intraobserver reliability, respectively. Results. Overall, interobserver and intraobserver agreements were substantial across the 20 classified cases. Analyses for the variable ‘reinfection’ revealed an almost perfect interobserver and intraobserver agreement with a classification accuracy of 94.8%. The category 'tissue and implant conditions' showed moderate interobserver and substantial intraobserver reliability, while the classification accuracy was 70.8%. For 'non-human cells,' accuracy was 81.0% and interobserver agreement was moderate with an almost perfect intraobserver reliability. The classification accuracy of the variable 'morbidity of the patient' reached 73.5% with a moderate interobserver agreement, whereas the intraobserver agreement was substantial. The application of the app yielded comparable results across all subgroups. Conclusion. The PJI-TNM classification system captures the heterogeneity of PJI and can be applied with substantial inter- and intraobserver reliability. The PJI-TNM educational app aims to facilitate application in clinical practice. A major limitation was the correct assessment of the implant situation. To eliminate this, a re-evaluation according to intraoperative findings is strongly recommended. 
Cite this article: Bone Joint Res 2024;13(1):19–27


Aims. Classifying trochlear dysplasia (TD) is useful to determine the treatment options for patients suffering from patellofemoral instability (PFI). There is no consensus on which classification system is more reliable and reproducible for the purpose of guiding clinicians’ management of PFI. There are also concerns about the validity of the Dejour Classification (DJC), which is the most widely used classification for TD, having only a fair reliability score. The Oswestry-Bristol Classification (OBC) is a recently proposed system of classification of TD, and the authors report a fair-to-good interobserver agreement and good-to-excellent intraobserver agreement in the assessment of TD. The aim of this study was to compare the reliability and reproducibility of these two classifications. Methods. In all, six assessors (four consultants and two registrars) independently evaluated 100 axial MRIs of the patellofemoral joint (PFJ) for TD and classified them according to OBC and DJC. These assessments were again repeated by all raters after four weeks. The inter- and intraobserver reliability scores were calculated using Cohen’s kappa and Cronbach’s α. Results. Both classifications showed good to excellent interobserver reliability with high α scores. The OBC classification showed a substantial intraobserver agreement (mean kappa 0.628; p < 0.005) whereas the DJC showed a moderate agreement (mean kappa 0.572; p < 0.005). There was no significant difference in the kappa values when comparing the assessments by consultants with those by registrars, in either classification system. Conclusion. This large study from a non-founding institute shows both classification systems to be reliable for classifying TD based on axial MRIs of the PFJ, with the simple-to-use OBC having a higher intraobserver reliability score than that of the DJC. Cite this article: Bone Jt Open 2023;4(7):532–538


Bone & Joint Research
Vol. 12, Issue 5 | Pages 313 - 320
8 May 2023
Saiki Y Kabata T Ojima T Kajino Y Kubo N Tsuchiya H

Aims. We aimed to assess the reliability and validity of OpenPose, a posture estimation algorithm, for measurement of knee range of motion after total knee arthroplasty (TKA), in comparison to radiography and goniometry. Methods. In this prospective observational study, we analyzed 35 primary TKAs (24 patients) for knee osteoarthritis. We measured the knee angles in flexion and extension using OpenPose, radiography, and goniometry. We assessed the test-retest reliability of each method using intraclass correlation coefficient (1,1). We evaluated the ability to estimate other measurement values from the OpenPose value using linear regression analysis. We used intraclass correlation coefficients (2,1) and Bland–Altman analyses to evaluate the agreement and error between radiography and the other measurements. Results. OpenPose had excellent test-retest reliability (intraclass correlation coefficient (1,1) = 1.000). The R² of all regression models indicated large correlations (0.747 to 0.927). In the flexion position, the intraclass correlation coefficients (2,1) of OpenPose indicated excellent agreement (0.953) with radiography. In the extension position, the intraclass correlation coefficients (2,1) indicated good agreement of OpenPose and radiography (0.815) and moderate agreement of goniometry with radiography (0.593). OpenPose had no systematic error in the flexion position, and a 2.3° fixed error in the extension position, compared to radiography. Conclusion. OpenPose is a reliable and valid tool for measuring flexion and extension positions after TKA. It has better accuracy than goniometry, especially in the extension position. Accurate measurement values can be obtained with low error, high reproducibility, and no contact, independent of the examiner’s skills. Cite this article: Bone Joint Res 2023;12(5):313–320
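
The Bland–Altman analysis mentioned in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's analysis: the function name `bland_altman` is hypothetical, the limits of agreement use the conventional 1.96 standard deviations, and the angles in the usage note are invented.

```python
import statistics


def bland_altman(method_a, method_b, z=1.96):
    """Return (bias, lower limit, upper limit) for paired measurements.

    bias is the mean of the paired differences (a - b); the limits of
    agreement are bias -/+ z times the sample standard deviation of the
    differences. (Hypothetical helper for illustration.)
    """
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need two equally long series with at least 2 pairs")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample SD (n - 1 denominator)
    return bias, bias - z * sd, bias + z * sd
```

For example, with invented angles in degrees, `bland_altman([10, 12, 14], [9, 12, 12])` gives a bias of 1 with limits of agreement of roughly -0.96 and 2.96. A non-zero bias of this kind corresponds to what the abstract calls a fixed error between two methods.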


The Bone & Joint Journal
Vol. 105-B, Issue 12 | Pages 1259 - 1264
1 Dec 2023
Hurley ET Hughes AJ Savage-Elliott I Dejour D Campbell KA Mulcahey MK Wittstein JR Jazrawi LM

Aims. The aim of this study was to establish consensus statements on the diagnosis, nonoperative management, and indications, if any, for medial patellofemoral complex (MPFC) repair in patients with patellar instability, using the modified Delphi approach. Methods. A total of 60 surgeons from 11 countries were invited to develop consensus statements based on their expertise in this area. They were assigned to one of seven working groups defined by subtopics of interest within patellar instability. Consensus was defined as achieving between 80% and 89% agreement, strong consensus was defined as between 90% and 99% agreement, and 100% agreement was considered to be unanimous. Results. Of 27 questions and statements on patellar instability, three achieved unanimous consensus, 14 achieved strong consensus, five achieved consensus, and five did not achieve consensus. Conclusion. The statements that reached unanimous consensus were that an assessment of physeal status is critical for paediatric patients with patellar instability. There was also unanimous consensus on early mobilization and resistance training following nonoperative management once there is no apprehension. The statements that did not achieve consensus were on the importance of immobilization of the knee, the use of orthobiologics in nonoperative management, the indications for MPFC repair, and whether a vastus medialis oblique advancement should be performed. Cite this article: Bone Joint J 2023;105-B(12):1259–1264
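
The consensus thresholds defined in the abstract above can be written as a small classifier. This is a sketch only: the function name `consensus_level` is hypothetical, and treating the band edges as "at least 80%" and "at least 90%" is an assumption, since the abstract does not say how a value such as 89.5% would be handled.

```python
def consensus_level(agreement_pct):
    """Map a percentage agreement to the consensus label used in the abstract.

    Assumption: bands are closed on the lower edge (>= 80, >= 90); the
    abstract only gives whole-number ranges (80-89, 90-99, 100).
    """
    if not 0 <= agreement_pct <= 100:
        raise ValueError("agreement must be a percentage between 0 and 100")
    if agreement_pct == 100:
        return "unanimous consensus"
    if agreement_pct >= 90:
        return "strong consensus"
    if agreement_pct >= 80:
        return "consensus"
    return "no consensus"


# Reported tally for the 27 statements in this abstract.
reported = {
    "unanimous consensus": 3,
    "strong consensus": 14,
    "consensus": 5,
    "no consensus": 5,
}
total_statements = sum(reported.values())
```

The reported counts sum to the 27 questions and statements stated in the Results, so the four categories account for every item.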


The Bone & Joint Journal
Vol. 105-B, Issue 12 | Pages 1265 - 1270
1 Dec 2023
Hurley ET Sherman SL Chahla J Gursoy S Alaia MJ Tanaka MJ Pace JL Jazrawi LM

Aims. The aim of this study was to establish consensus statements on medial patellofemoral ligament (MPFL) reconstruction, anteromedialization tibial tubercle osteotomy, trochleoplasty, and rehabilitation and return to sporting activity in patients with patellar instability, using the modified Delphi process. Methods. This was the second part of a study dealing with these aspects of management in these patients. As in part I, a total of 60 surgeons from 11 countries contributed to the development of consensus statements based on their expertise in this area. They were assigned to one of seven working groups defined by subtopics of interest. Consensus was defined as achieving between 80% and 89% agreement, strong consensus was defined as between 90% and 99% agreement, and 100% agreement was considered unanimous. Results. Of 41 questions and statements on patellar instability, none achieved unanimous consensus, 19 achieved strong consensus, 15 achieved consensus, and seven did not achieve consensus. Conclusion. Most statements reached some degree of consensus, without any achieving unanimous consensus. There was no consensus on the use of anchors in MPFL reconstruction, and the order of fixation of the graft (patella first versus femur first). There was also no consensus on the indications for trochleoplasty or its effect on the viability of the cartilage after elevation of the osteochondral flap. There was also no consensus on postoperative immobilization or weightbearing, or whether paediatric patients should avoid an early return to sport. Cite this article: Bone Joint J 2023;105-B(12):1265–1270


The Bone & Joint Journal
Vol. 103-B, Issue 12 | Pages 1802 - 1808
1 Dec 2021
Bruce J Knight R Parsons N Betteridge R Verdon A Brown J Campolier M Achten J Costa ML

Aims. Deep surgical site infection (SSI) is common after lower limb fracture. We compared the diagnosis of deep SSI using alternative methods of data collection and examined the agreement of clinical photography and in-person clinical assessment by the Centers for Disease Control and Prevention (CDC) criteria after lower limb fracture surgery. Methods. Data from two large, UK-based multicentre randomized controlled major trauma trials investigating SSI and wound healing after surgical repair of open lower limb fractures that could not be primarily closed (UK WOLLF), and surgical incisions for fractures that were primarily closed (UK WHiST), were examined. Trial interventions were standard wound care management and negative pressure wound therapy after initial surgical debridement. Wound outcomes were collected from 30 days to six weeks. We compared the level of agreement between wound photography and clinical assessment of CDC-defined SSI. We also assessed the level of agreement between blinded independent assessors of the photographs. Results. Rates of CDC-defined deep SSI were 7.6% (35/460) after open fracture and 6.3% (95/1519) after closed incisional repair. Photographs were obtained for 77% and 73% of WOLLF and WHiST cohorts respectively (all participants n = 1,478). Agreement between photographic-SSI and CDC-SSI was fair for open fracture wounds (83%; k = 0.27 (95% confidence interval (CI) 0.14 to 0.42)) and for closed incisional wounds (88%; k = 0.29 (95% CI 0.20 to 0.37)) although the rate of photographically detected deep SSIs was twice as high as CDC-SSI (12% vs 6%). Agreement between different assessors for photographic-SSI (WOLLF 88%, k = 0.63 (95% CI 0.52 to 0.72); WHiST 89%; k = 0.61 (95% CI 0.54 to 0.69)); and wound healing was good (WOLLF 90%; k = 0.80 (95% CI 0.73 to 0.86); WHiST 87%; k = 0.57 (95% CI 0.50 to 0.64)). Conclusion.
Although wound photography was feasible within the research context and inter-rater assessor agreement substantial, digital photographs used in isolation overestimated deep SSI rates, when compared to CDC criteria. Wound photography should not replace clinical assessment in pragmatic trials but may be useful for screening purposes where surgical infection outcomes are paramount. Cite this article: Bone Joint J 2021;103-B(12):1802–1808


The Bone & Joint Journal
Vol. 103-B, Issue 8 | Pages 1339 - 1344
1 Aug 2021
Jain S Mohrir G Townsend O Lamb JN Palan J Aderinto J Pandit H

Aims. The aim of this study was to assess the reliability and validity of the Unified Classification System (UCS) for postoperative periprosthetic femoral fractures (PFFs) around cemented polished taper-slip (PTS) stems. Methods. Radiographs of 71 patients with a PFF admitted consecutively at two centres between 25 February 2012 and 19 May 2020 were collated by an independent investigator. Six observers (three hip consultants and three trainees) were familiarized with the UCS. Each PFF was classified on two separate occasions, with a mean time between assessments of 22.7 days (16 to 29). Interobserver reliability for more than two observers was assessed using percentage agreement and Fleiss’ kappa statistic. Intraobserver reliability between two observers was calculated with Cohen kappa statistic. Validity was tested on surgically managed UCS type B PFFs where stem stability was documented in operation notes (n = 50). Validity was assessed using percentage agreement and Cohen kappa statistic between radiological assessment and intraoperative findings. Kappa statistics were interpreted using Landis and Koch criteria. All six observers were blinded to operation notes and postoperative radiographs. Results. Interobserver reliability percentage agreement was 58.5% and the overall kappa value was 0.442 (moderate agreement). Lowest kappa values were seen for type B fractures (0.095 to 0.360). The mean intraobserver reliability kappa value was 0.672 (0.447 to 0.867), indicating substantial agreement. Validity percentage agreement was 65.7% and the mean kappa value was 0.300 (0.160 to 0.440) indicating only fair agreement. Conclusion. This study demonstrates that the UCS is unsatisfactory for the classification of PFFs around PTS stems, and that it has considerably lower reliability and validity than previously described for other stem types.
Radiological PTS stem loosening in the presence of PFF is poorly defined and formal intraoperative testing of stem stability is recommended. Cite this article: Bone Joint J 2021;103-B(8):1339–1344
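
As an illustration of the statistics named in the abstract above, here is a minimal sketch of Cohen's kappa for two sets of ratings, together with the Landis and Koch interpretation bands. The function names `cohen_kappa` and `landis_koch` are hypothetical, the handling of exact band edges (for example 0.20 counted as "slight") is an assumption, and the fracture labels in the usage note are invented rather than study data.

```python
from collections import Counter


def cohen_kappa(ratings_1, ratings_2):
    """Unweighted Cohen's kappa for two equally long lists of labels."""
    n = len(ratings_1)
    if n == 0 or n != len(ratings_2):
        raise ValueError("need two non-empty lists of equal length")
    # Observed proportion of items given the same label both times.
    p_o = sum(a == b for a, b in zip(ratings_1, ratings_2)) / n
    # Agreement expected by chance from each list's label frequencies.
    c1, c2 = Counter(ratings_1), Counter(ratings_2)
    p_e = sum(c1[label] * c2[label] for label in c1) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when only one label is used")
    return (p_o - p_e) / (1.0 - p_e)


def landis_koch(kappa):
    """Landis and Koch (1977) descriptive label for a kappa value."""
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"
```

With invented labels, `cohen_kappa(["B1", "B2", "B2", "B1"], ["B1", "B2", "B1", "B1"])` gives 0.5: the two readings agree on three of four cases (0.75) against a chance agreement of 0.5. Applying `landis_koch` to the values reported in the abstract reproduces its wording: 0.442 maps to "moderate", 0.672 to "substantial", and 0.300 to "fair".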


The Bone & Joint Journal
Vol. 106-B, Issue 9 | Pages 898 - 906
1 Sep 2024
Kayani B Wazir MUK Mancino F Plastow R Haddad FS

Aims. The primary objective of this study was to develop a validated classification system for assessing iatrogenic bone trauma and soft-tissue injury during total hip arthroplasty (THA). The secondary objective was to compare macroscopic bone trauma and soft-tissue injury in conventional THA (CO THA) versus robotic arm-assisted THA (RO THA) using this classification system. Methods. This study included 30 CO THAs versus 30 RO THAs performed by a single surgeon. Intraoperative photographs of the osseous acetabulum and periacetabular soft tissues were obtained prior to implantation of the acetabular component, which were used to develop the proposed classification system. Interobserver and intraobserver variabilities of the proposed classification system were assessed. Results. The BOne trauma and Soft-Tissue Injury classification system in total Hip arthroplasty (BOSTI Hip) grades osseous acetabular trauma and periarticular muscle damage during THA. The classification system has an interclass correlation coefficient of 0.90 (95% CI 0.86 to 0.93) for interobserver agreement and 0.89 (95% CI 0.84 to 0.93) for intraobserver agreement. RO THA was associated with improved BOSTI Hip scores (p = 0.002) and more pristine osseous surfaces in the anterior superior (p = 0.001) and posterior superior (p < 0.001) acetabular quadrants compared with CO THA. There were no differences between the groups in relation to injury to the gluteus medius (p = 0.084), obturator internus (p = 0.241), piriformis (p = 0.081), superior gemellus (p = 0.116), inferior gemellus (p = 0.132), quadratus femoris (p = 0.208), and vastus lateralis (p = 0.135), but overall combined muscle injury was reduced in RO THA compared with CO THA (p = 0.023). Discussion. The proposed BOSTI Hip classification provides a reproducible grading system for stratifying iatrogenic bone trauma and soft-tissue injury during THA.
RO THA was associated with improved BOSTI Hip scores, more pristine osseous acetabular surfaces, and reduced combined periarticular muscle injury compared with CO THA. Further research is required to understand if these intraoperative findings translate to differences in clinical outcomes between the treatment groups. Cite this article: Bone Joint J 2024;106-B(9):898–906


The Bone & Joint Journal
Vol. 101-B, Issue 10 | Pages 1292 - 1299
1 Oct 2019
Masters J Metcalfe D Parsons NR Achten J Griffin XL Costa ML

Aims. This study explores data quality in operation type and fracture classification recorded as part of a large research study and a national audit with an independent review. Patients and Methods. At 17 centres, an expert surgeon reviewed a randomly selected subset of cases from their centre with regard to fracture classification using the AO system and type of operation performed. Agreement for these variables was then compared with the data collected during conduct of the World Hip Trauma Evaluation (WHiTE) cohort study. Both types of surgery and fracture classification were collapsed to identify the level of detail of reporting that achieved meaningful agreement. In the National Hip Fracture Database (NHFD), the types of operation and fracture classification were explored to identify the proportion of “highly improbable” combinations. Results. The records were reviewed for 903 cases. Agreement for the subtypes of extracapsular fracture was poor; most centres achieved no better than “fair” agreement. When the classification was collapsed to a single option for “extracapsular” fracture, only four centres failed to have at least “moderate” agreement. There was only “moderate” agreement for the subtypes of intracapsular fracture, which improved to “substantial” when collapsed to “intracapsular”. Subtrochanteric fracture types were well reported with “substantial” agreement. There was near “perfect” agreement for internal fixation procedures. “Perfect” or “substantial” agreement was achieved when the type of arthroplasty surgery was reported at the level of “hemiarthroplasty” and “total hip replacement”. When reviewing data submitted to the NHFD, a minimum of 5.2% of cases contained “highly improbable” procedures for the stated fracture classification. Conclusion. The complexity of collecting fracture classification data at a national scale compromises the accuracy with which detailed classification systems can be reported. 
Data around type of surgery performed show similar tendencies. Data capture, reporting, and interpretation in future studies must take this into account. Cite this article: Bone Joint J 2019;101-B:1292–1299


Bone & Joint Open
Vol. 2, Issue 8 | Pages 638 - 645
1 Aug 2021
Garner AJ Edwards TC Liddle AD Jones GG Cobb JP

Aims. Joint registries classify all further arthroplasty procedures to a knee with an existing partial arthroplasty as revision surgery, regardless of the actual procedure performed. Relatively minor procedures, including bearing exchanges, are classified in the same way as major operations requiring augments and stems. A new classification system is proposed to acknowledge and describe the detail of these procedures, which has implications for risk, recovery, and health economics. Methods. Classification categories were proposed by a surgical consensus group, then ranked by patients, according to perceived invasiveness and implications for recovery. In round one, 26 revision cases were classified by the consensus group. Results were tested for inter-rater reliability. In round two, four additional cases were added for clarity. Round three repeated the survey one month later, subject to inter- and intrarater reliability testing. In round four, five additional expert partial knee arthroplasty surgeons were asked to classify the 30 cases according to the proposed revision partial knee classification (RPKC) system. Results. Four classes were proposed: PR1, where no bone-implant interfaces are affected; PR2, where surgery does not include conversion to total knee arthroplasty, for example, a second partial arthroplasty to a native compartment; PR3, when a standard primary total knee prosthesis is used; and PR4 when revision components are necessary. Round one resulted in 92% inter-rater agreement (Kendall’s W 0.97; p < 0.005), rising to 93% in round two (Kendall’s W 0.98; p < 0.001). Round three demonstrated 97% agreement (Kendall’s W 0.98; p < 0.001), with high intra-rater reliability (interclass correlation coefficient (ICC) 0.99; 95% confidence interval 0.98 to 0.99). Round four resulted in 80% agreement (Kendall’s W 0.92; p < 0.001). Conclusion. The RPKC system accounts for all procedures which may be appropriate following partial knee arthroplasty. 
It has been shown to be reliable, repeatable and pragmatic. The implications for patient care and health economics are discussed. Cite this article: Bone Jt Open 2021;2(8):638–645


The Bone & Joint Journal
Vol. 106-B, Issue 3 | Pages 227 - 231
1 Mar 2024
Todd NV Casey A Birch NC

The diagnostic sub-categorization of cauda equina syndrome (CES) is used to aid communication between doctors and other healthcare professionals. It is also used to determine the need for, and urgency of, MRI and surgery in these patients. A recent paper by Hoeritzauer et al (2023) in this journal examined the interobserver reliability of the widely accepted subcategories in 100 patients with cauda equina syndrome. They found that there is no useful interobserver agreement for the subcategories, even for experienced spinal surgeons. This observation is supported by the largest prospective study of the treatment of cauda equina syndrome in the UK by Woodfield et al (2023). If the accepted subcategories are unreliable, they cannot be used in the way that they are currently, and they should be revised or abandoned. This paper presents a reassessment of the diagnostic and prognostic subcategories of cauda equina syndrome in the light of this evidence, with a suggested cure based on a more inclusive synthesis of symptoms, signs, bladder ultrasound scan results, and pre-intervention urinary catheterization. Cite this article: Bone Joint J 2024;106-B(3):227–231


The Bone & Joint Journal
Vol. 104-B, Issue 6 | Pages 758 - 764
1 Jun 2022
Gelfer Y Davis N Blanco J Buckingham R Trees A Mavrotas J Tennant S Theologis T

Aims. The aim of this study was to gain an agreement on the management of idiopathic congenital talipes equinovarus (CTEV) up to walking age in order to provide a benchmark for practitioners and guide consistent, high-quality care for children with CTEV. Methods. The consensus process followed an established Delphi approach with a predetermined degree of agreement. The process included the following steps: establishing a steering group; steering group meetings, generating statements, and checking them against the literature; a two-round Delphi survey; and final consensus meeting. The steering group members and Delphi survey participants were all British Society of Children’s Orthopaedic Surgery (BSCOS) members. Descriptive statistics were used for analysis of the Delphi survey results. The Appraisal of Guidelines for Research & Evaluation checklist was followed for reporting of the results. Results. The BSCOS-selected steering group, the steering group meetings, the Delphi survey, and the final consensus meeting all followed the pre-agreed protocol. A total of 153/243 members voted in round 1 Delphi (63%) and 132 voted in round 2 (86%). Out of 61 statements presented to round 1 Delphi, 43 reached ‘consensus in’, no statements reached ‘consensus out’, and 18 reached ‘no consensus’. Four statements were deleted and one new statement added following suggestions from round 1. Out of 15 statements presented to round 2, 12 reached ‘consensus in’, no statements reached ‘consensus out’, and three reached ‘no consensus’ and were discussed and included following the final consensus meeting. Two statements were combined for simplicity. The final consensus document includes 57 statements allocated into six successive stages. Conclusion. We have produced a consensus document for the treatment of idiopathic CTEV up to walking age. This will provide a benchmark for standard of care in the UK and will help to reduce geographical variability in treatment and outcomes. 
Appropriate dissemination and implementation will be key to its success. Cite this article: Bone Joint J 2022;104-B(6):758–764
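The response rates and statement counts in this abstract can be checked with simple arithmetic. The sketch below is illustrative only: the variable names are ours, and it assumes the 86% round 2 figure is expressed relative to round 1 voters, which is what the numbers suggest.

```python
# Figures taken from the abstract; variable names are illustrative.
members = 243
round1_voters = 153
round2_voters = 132

round1_rate = round1_voters / members        # share of members voting in round 1
round2_rate = round2_voters / round1_voters  # assumed: share of round 1 voters returning

# Statements carried into round 2: 'no consensus' items, minus deletions, plus additions.
round2_statements = 18 - 4 + 1

# Final document: 'consensus in' from both rounds, plus the three included after
# the final meeting, minus one because two statements were combined into one.
final_statements = 43 + 12 + 3 - 1

print(f"Round 1 response rate: {round1_rate:.0%}")
print(f"Round 2 response rate: {round2_rate:.0%}")
print(f"Statements in round 2: {round2_statements}")
print(f"Final statements: {final_statements}")
```

Under these assumptions the tallies agree with the 15 round 2 statements and 57 final statements reported in the abstract.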


The Bone & Joint Journal
Vol. 102-B, Issue 4 | Pages 478 - 484
1 Apr 2020
Daniels AM Wyers CE Janzing HMJ Sassen S Loeffen D Kaarsemaker S van Rietbergen B Hannemann PFW Poeze M van den Bergh JP

Aims. Besides conventional radiographs, the use of MRI, CT, and bone scintigraphy is frequent in the diagnosis of a fracture of the scaphoid. However, which techniques give the best results remains unknown. The investigation of a new imaging technique initially requires an analysis of its precision. The primary aim of this study was to investigate the interobserver agreement of high-resolution peripheral quantitative CT (HR-pQCT) in the diagnosis of a scaphoid fracture. A secondary aim was to investigate the interobserver agreement for the presence of other fractures and for the classification of scaphoid fracture. Methods. Two radiologists and two orthopaedic trauma surgeons evaluated HR-pQCT scans of 31 patients with a clinically-suspected scaphoid fracture. The observers were asked to determine the presence of a scaphoid or other fracture and to classify the scaphoid fracture based on the Herbert classification system. Fleiss kappa statistics were used to calculate the interobserver agreement for the diagnosis of a fracture. Intraclass correlation coefficients (ICCs) were used to assess the agreement for the classification of scaphoid fracture. Results. A total of nine (29%) scaphoid fractures and 12 (39%) other fractures were diagnosed in 20 patients (65%) using HR-pQCT across the four observers. The interobserver agreement was 91% for the identification of a scaphoid fracture (95% confidence interval (CI) 0.76 to 1.00) and 80% for other fractures (95% CI 0.72 to 0.87). The mean ICC for the classification of a scaphoid fracture in the seven patients diagnosed with scaphoid fracture by all four observers was 73% (95% CI 0.42 to 0.94). Conclusion. We conclude that the diagnosis of scaphoid and other fractures is reliable when using HR-pQCT in patients with a clinically-suspected fracture. Cite this article: Bone Joint J 2020;102-B(4):478–484


The Bone & Joint Journal
Vol. 106-B, Issue 9 | Pages 1016 - 1020
9 Jul 2024
Trompeter AJ Costa ML

Aims. Weightbearing instructions after musculoskeletal injury or orthopaedic surgery are a key aspect of the rehabilitation pathway and prescription. The terminology used to describe the weightbearing status of the patient is variable; many different terms are used, and there is recognition and evidence that the lack of standardized terminology contributes to confusion in practice. Methods. A consensus exercise was conducted involving all the major stakeholders in the patient journey for those with musculoskeletal injury. The primary aim of the consensus exercise was to seek agreement on a standardized set of terminology for weightbearing instructions. Results. A pre-meeting questionnaire was conducted. The one-day consensus meeting, including patient representatives, identified three agreed terms as the only ones to be used in defining the weightbearing status of the patient: 1) non-weightbearing; 2) limited weightbearing; and 3) unrestricted weightbearing. Conclusion. This study represents the first and only exercise in standardizing rehabilitation terminology in orthopaedics, as agreed by all major stakeholders in the patient pathway and the patients themselves. The standardization of language allows for higher-quality and more accurate research to be conducted, and is one small part of the bigger picture in increasing the mobility of patients after orthopaedic injury or surgery. Cite this article: Bone Joint J 2024;106-B(9):1016–1020


Bone & Joint Research
Vol. 10, Issue 12 | Pages 759 - 766
1 Dec 2021
Nicholson JA Oliver WM MacGillivray TJ Robinson CM Simpson AHRW

Aims. The aim of this study was to establish a reliable method for producing 3D reconstruction of sonographic callus. Methods. A cohort of ten closed tibial shaft fractures managed with intramedullary nailing underwent ultrasound scanning at two, six, and 12 weeks post-surgery. Ultrasound capture was performed using infrared tracking technology to map each image to a 3D lattice. Using echo intensity, semi-automated mapping was performed to produce an anatomical 3D representation of the fracture site. Two reviewers independently performed 3D reconstructions and kappa coefficient was used to determine agreement. A further validation study was undertaken with ten reviewers to estimate the clinical application of this imaging technique using the intraclass correlation coefficient (ICC). Results. Nine of the ten patients achieved union at six months. At six weeks, seven patients had bridging callus of ≥ one cortex on the 3D reconstruction and when present all achieved union. Compared to six-week radiographs, no bridging callus was present in any patient. Of the three patients lacking sonographic bridging callus, one went on to a nonunion (77.8% sensitive and 100% specific to predict union). At 12 weeks, nine patients had bridging callus at ≥ one cortex on 3D reconstruction (100% sensitive and 100% specific to predict union). Presence of sonographic bridging callus on 3D reconstruction demonstrated excellent reviewer agreement on ICC at 0.87 (95% confidence interval 0.74 to 0.96). Conclusion. 3D fracture reconstruction can be created using multiple ultrasound images in order to evaluate the presence of bridging callus. This imaging modality has the potential to enhance the usability and accuracy of identification of early fracture healing. Cite this article: Bone Joint Res 2021;10(12):759–766


The Bone & Joint Journal
Vol. 102-B, Issue 2 | Pages 232 - 238
1 Feb 2020
Javed S Hadi S Imam MA Gerogiannis D Foden P Monga P

Aims. Accurate measurement of the glenoid version is important in performing total shoulder arthroplasty (TSA). Our aim was to evaluate the Ellipse method, which involves formally defining the vertical mid-point of the glenoid prior to measuring the glenoid version and comparing it with the ‘classic’ Friedman method. Methods. This was a retrospective study which evaluated 100 CT scans for patients who underwent a primary TSA. The glenoid version was measured using the Friedman and Ellipse methods by two senior observers. Statistical analyses were performed using the paired t-test for significance and the Bland-Altman plot for agreement. Results. The mean glenoid version was -3.11° (-23.8° to 17.9°) using the Friedman method and -1.95° (-29.8° to 24.6°) using the Ellipse method (p = 0.002). In 16 patients the difference between methods was greater than 5°, which we considered to be clinically significant. There was poor agreement between methods with relatively large 95% limits of agreement. There was excellent inter-rater agreement between the observers for the Ellipse method and similarly, the intrarater agreement was excellent with a repeatability coefficient of 0.94. Conclusion. We recommend the use of the Ellipse modification to define the mid glenoid point prior to measuring the glenoid version in patients undergoing TSA. Cite this article: Bone Joint J 2020;102-B(2):232–238
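The Bland-Altman agreement analysis mentioned in this abstract reduces to the mean difference between paired measurements (the bias) and the 95% limits of agreement, conventionally bias ± 1.96 standard deviations of the differences. The following is a minimal sketch of that calculation; the function name and the angle values are hypothetical and are not taken from the study.

```python
from statistics import mean, stdev


def bland_altman_limits(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need two equally long series with at least two pairs")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - spread, bias + spread


# Hypothetical glenoid version angles in degrees (illustrative, not study data).
friedman = [-3.0, -10.5, 2.0, -6.5, 4.0, -1.5]
ellipse = [-1.0, -12.0, 5.5, -2.0, 3.0, -4.0]

bias, lower, upper = bland_altman_limits(friedman, ellipse)
print(f"bias {bias:.2f}, 95% limits of agreement {lower:.2f} to {upper:.2f}")
```

Wide limits relative to a clinically meaningful difference (the abstract uses 5°) would indicate poor agreement between the two methods even when the mean bias is small.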


The Bone & Joint Journal
Vol. 102-B, Issue 1 | Pages 102 - 107
1 Jan 2020
Sharma N Brown A Bouras T Kuiper JH Eldridge J Barnett A

Aims. Trochlear dysplasia is a significant risk factor for patellofemoral instability. The Dejour classification is currently considered the standard for classifying trochlear dysplasia, but numerous studies have reported poor reliability on both plain radiography and MRI. The severity of trochlear dysplasia is important to establish in order to guide surgical management. We have developed an MRI-specific classification system to assess the severity of trochlear dysplasia, the Oswestry-Bristol Classification (OBC). This is a four-part classification system comprising normal, mild, moderate, and severe to represent a normal, shallow, flat, and convex trochlear, respectively. The purpose of this study was to assess the inter- and intraobserver reliability of the OBC and compare it with that of the Dejour classification. Methods. Four observers (two senior and two junior orthopaedic surgeons) independently assessed 32 CT and axial MRI scans for trochlear dysplasia and classified each according to the OBC and the Dejour classification systems. Assessments were repeated following a four-week interval. The inter- and intraobserver agreement was determined by using Fleiss’ generalization of Cohen’s kappa statistic and S-statistic nominal and linear weights. Results. The OBC showed fair-to-good interobserver agreement and good-to-excellent intraobserver agreement (mean kappa 0.68). The Dejour classification showed poor interobserver agreement and fair-to-good intraobserver agreement (mean kappa 0.52). Conclusion. The OBC can be used to assess the severity of trochlear dysplasia. It can be applied in clinical practice to simplify and standardize surgical decision-making in patients with recurrent patella instability. Cite this article: Bone Joint J 2020;102-B(1):102–107


Bone & Joint Open
Vol. 5, Issue 6 | Pages 524 - 531
24 Jun 2024
Woldeyesus TA Gjertsen J Dalen I Meling T Behzadi M Harboe K Djuv A

Aims. To investigate if preoperative CT improves detection of unstable trochanteric hip fractures. Methods. A single-centre prospective study was conducted. Patients aged 65 years or older with trochanteric hip fractures admitted to Stavanger University Hospital (Stavanger, Norway) were consecutively included from September 2020 to January 2022. Radiographs and CT images of the fractures were obtained, and surgeons made individual assessments of the fractures based on these. The assessment was conducted according to a systematic protocol including three classification systems (AO/Orthopaedic Trauma Association (OTA), Evans Jensen (EVJ), and Nakano) and questions addressing specific fracture patterns. An expert group provided a gold-standard assessment based on the CT images. Sensitivities and specificities of surgeons’ assessments were estimated and compared in regression models with correlations for the same patients. Intra- and inter-rater reliability were presented as Cohen’s kappa and Gwet’s agreement coefficient (AC1). Results. We included 120 fractures in 119 patients. Compared to radiographs, CT increased the sensitivity of detecting unstable trochanteric fractures from 63% to 70% (p = 0.028) and from 70% to 76% (p = 0.004) using AO/OTA and EVJ, respectively. Compared to radiographs alone, CT increased the sensitivity of detecting a large posterolateral trochanter major fragment or a comminuted trochanter major fragment from 63% to 76% (p = 0.002) and from 38% to 55% (p < 0.001), respectively. CT improved intra-rater reliability for stability assessment using EVJ (AC1 0.68 to 0.78; p = 0.049) and for detecting a large posterolateral trochanter major fragment (AC1 0.42 to 0.57; p = 0.031). Conclusion. A preoperative CT of trochanteric fractures increased detection of unstable fractures using the AO/OTA and EVJ classification systems. 
Compared to radiographs, CT improved intra-rater reliability when assessing fracture stability and detecting large posterolateral trochanter major fragments. Cite this article: Bone Jt Open 2024;5(6):524–531


The Bone & Joint Journal
Vol. 106-B, Issue 10 | Pages 1150 - 1157
1 Oct 2024
de Klerk HH Verweij LPE Doornberg JN Jaarsma RL Murase T Chen NC van den Bekerom MPJ

Aims. This study aimed to gather insights from elbow experts using the Delphi method to evaluate the influence of patient characteristics and fracture morphology on the choice between operative and nonoperative treatment for coronoid fractures. Methods. A three-round electronic (e-)modified Delphi survey study was performed between March and December 2023. A total of 55 elbow surgeons from Asia, Australia, Europe, and North America participated, with 48 completing all questionnaires (87%). The panellists evaluated the factors identified as important in literature for treatment decision-making, using a Likert scale ranging from "strongly influences me to recommend nonoperative treatment" (1) to "strongly influences me to recommend operative treatment" (5). Factors achieving Likert scores ≤ 2.0 or ≥ 4.0 were deemed influential for treatment recommendation. Stable consensus is defined as an agreement of ≥ 80% in the second and third rounds. Results. Of 68 factors considered important in the literature for treatment choice for coronoid fractures, 18 achieved a stable consensus to be influential. Influential factors with stable consensus that advocate for operative treatment were being a professional athlete, playing overhead sports, a history of subjective dislocation or subluxation during trauma, open fracture, crepitation with range of movement, > 2 mm opening during varus stress on radiological imaging, and having an anteromedial facet or basal coronoid fracture (O’Driscoll type 2 or 3). An anterolateral coronoid tip fracture ≤ 2 mm was the only influential factor with a stable consensus that advocates for nonoperative treatment. Most disagreement existed regarding the treatment for the terrible triad injury with an anterolateral coronoid tip fracture fragment ≤ 2 mm (O’Driscoll type 1 subtype 1). Conclusion. 
This study gives insights into areas of consensus among surveyed elbow surgeons in choosing between operative and nonoperative management of coronoid fractures. These findings should be used in conjunction with previous patient cohort studies when discussing treatment options with patients. Cite this article: Bone Joint J 2024;106-B(10):1150–1157


Bone & Joint Research
Vol. 12, Issue 3 | Pages 155 - 164
1 Mar 2023
McCarty CP Nazif MA Sangiorgio SN Ebramzadeh E Park S

Aims. Taper corrosion has been widely reported to be problematic for modular total hip arthroplasty implants. A simple and systematic method to evaluate taper damage with sufficient resolution is needed. We introduce a semiquantitative grading system for modular femoral tapers to characterize taper corrosion damage. Methods. After examining a unique collection of retrieved cobalt-chromium (CoCr) taper sleeves (n = 465) using the widely-used Goldberg system, we developed an expanded six-point visual grading system intended to characterize the severity, visible material loss, and absence of direct component contact due to corrosion. Female taper sleeve damage was evaluated by three blinded observers using the Goldberg scoring system and the expanded system. A subset (n = 85) was then re-evaluated following destructive cleaning, using both scoring systems. Material loss for this subset was quantified using metrology and correlated with both scoring systems. Results. There was substantial agreement in grading among all three observers with uncleaned (n = 465) and with the subset of cleaned (n = 85) implants. The expanded scoring criteria provided a wider distribution of scores which ultimately correlated well with corrosion material loss. Cleaning changed the average scores marginally using the Goldberg criteria (p = 0.290); however, using the VGS, approximately 40% of the scores for all observers changed, increasing the average score from 4.24 to 4.35 (p = 0.002). There was a strong correlation between measured material loss and new grading scores. Conclusion. The expanded scoring criteria provided a wider distribution of scores which ultimately correlated well with corrosion material loss. This system provides potential advantages for assessing taper damage without requiring specialized imaging devices. Cite this article: Bone Joint Res 2023;12(3):155–164


The Bone & Joint Journal
Vol. 103-B, Issue 8 | Pages 1345 - 1350
1 Aug 2021
Czubak-Wrzosek M Nitek Z Sztwiertnia P Czubak J Grzelecki D Kowalczewski J Tyrakowski M

Aims. The aim of the study was to compare two methods of calculating pelvic incidence (PI) and pelvic tilt (PT), either by using the femoral heads or acetabular domes to determine the bicoxofemoral axis, in patients with unilateral or bilateral primary hip osteoarthritis (OA). Methods. PI and PT were measured on standing lateral radiographs of the spine in two groups: 50 patients with unilateral (Group I) and 50 patients with bilateral hip OA (Group II), using the femoral heads or acetabular domes to define the bicoxofemoral axis. Agreement between the methods was determined by intraclass correlation coefficient (ICC) and the standard error of measurement (SEm). The intraobserver reproducibility and interobserver reliability of the two methods were analyzed on 31 radiographs in both groups to calculate ICC and SEm. Results. In both groups, excellent agreement between the two methods was obtained, with ICC of 0.99 and SEm 0.3° for Group I, and ICC 0.99 and SEm 0.4° for Group II. The intraobserver reproducibility was excellent for both methods in both groups, with an ICC of at least 0.97 and SEm not exceeding 0.8°. The study also revealed excellent interobserver reliability for both methods in both groups, with ICC 0.99 and SEm 0.5° or less. Conclusion. Either the femoral heads or acetabular domes can be used to define the bicoxofemoral axis on the lateral standing radiographs of the spine for measuring PI and PT in patients with idiopathic unilateral or bilateral hip OA. Cite this article: Bone Joint J 2021;103-B(8):1345–1350


The Bone & Joint Journal
Vol. 106-B, Issue 10 | Pages 1190 - 1196
1 Oct 2024
Gelfer Y McNee AE Harris JD Mavrotas J Deriu L Cashman J Wright J Kothari A

Aims. The aim of this study was to gain a consensus for best practice of the assessment and management of children with idiopathic toe walking (ITW) in order to provide a benchmark for practitioners and guide the best consistent care. Methods. An established Delphi approach with predetermined steps and degree of agreement based on a standardized protocol was used to determine consensus. The steering group members and Delphi survey participants included members from the British Society of Children’s Orthopaedic Surgery (BSCOS) and the Association of Paediatric Chartered Physiotherapists (APCP). The statements included definition, assessment, treatment indications, nonoperative and operative interventions, and outcomes. Descriptive statistics were used for analysis of the Delphi survey results. The AGREE checklist was followed for reporting the results. Results. A total of 227 participants (54% APCP and 46% BSCOS members) completed the first round, and 222 participants (98%) completed the second round. Out of 54 proposed statements included in the first round Delphi, 17 reached ‘consensus in’, no statements reached ‘consensus out’, and 37 reached ‘no consensus’. These 37 statements were then discussed, reworded, amalgamated, or deleted before the second round Delphi of 29 statements. A total of 12 statements reached ‘consensus in’, four ‘consensus out’, and 13 ‘no consensus’. In the final consensus meeting, 13 statements were voted upon. Five were accepted, resulting in a total of 31 approved statements. Conclusion. In the aspects of practice where sufficient evidence is not available, a consensus statement can provide a strong body of opinion that acts as a benchmark for excellence in clinical care. This statement can assist clinicians managing children with ITW to ensure consistent and reliable practice, and reduce geographical variability in practice and outcomes. 
It will enable those treating ITW to share the published consensus document with both carers and patient groups. Cite this article: Bone Joint J 2024;106-B(10):1190–1196


Bone & Joint Open
Vol. 2, Issue 10 | Pages 858 - 864
18 Oct 2021
Guntin J Plummer D Della Valle C DeBenedetti A Nam D

Aims. Prior studies have identified that malseating of a modular dual mobility liner can occur, with previous reported incidences between 5.8% and 16.4%. The aim of this study was to determine the incidence of malseating in dual mobility implants at our institution, assess for risk factors for liner malseating, and investigate whether liner malseating has any impact on clinical outcomes after surgery. Methods. We retrospectively reviewed the radiographs of 239 primary and revision total hip arthroplasties with a modular dual mobility liner. Two independent reviewers assessed radiographs for each patient twice for evidence of malseating, with a third observer acting as a tiebreaker. Univariate analysis was conducted to determine risk factors for malseating with Youden’s index used to identify cut-off points. Cohen’s kappa test was used to measure interobserver and intraobserver reliability. Results. In all, 12 liners (5.0%), including eight Stryker (6.8%) and four Zimmer Biomet (3.3%), had radiological evidence of malseating. Interobserver reliability was found to be 0.453 (95% confidence interval (CI) 0.26 to 0.64), suggesting weak inter-rater agreement, with strong agreement being greater than 0.8. We found component size of 50 mm or less to be associated with liner malseating on univariate analysis (p = 0.031). Patients with malseated liners appeared to have no associated clinical consequences, and none required revision surgery at a mean of 14 months (1.4 to 99.2) postoperatively. Conclusion. The incidence of liner malseating was 5.0%, which is similar to other reports. Component size of 50 mm or smaller was identified as a risk factor for malseating. Surgeons should be aware that malseating can occur and implant design changes or changes in instrumentation should be considered to lower the risk of malseating. Although further follow-up is needed, it remains to be seen if malseating is associated with any clinical consequences. 
Cite this article: Bone Jt Open 2021;2(10):858–864


Bone & Joint Open
Vol. 1, Issue 7 | Pages 355 - 358
7 Jul 2020
Konrads C Gonser C Ahmad SS

Aims. The Oswestry-Bristol Classification (OBC) was recently described as an MRI-based classification tool for the femoral trochlear. The authors demonstrated better inter- and intraobserver agreement compared to the Dejour classification. As the OBC could potentially provide a very useful MRI-based grading system for trochlear dysplasia, the aim of this study was to determine the inter- and intraobserver reliability of the classification system from the perspective of the non-founder. Methods. Two orthopaedic surgeons independently assessed 50 MRI scans for trochlear dysplasia and classified each according to the OBC. Both observers repeated the assessments after six weeks. The inter- and intraobserver agreement was determined using Cohen’s kappa statistic and S-statistic nominal and linear weights. Results. The OBC with grading into four different trochlear forms showed excellent inter- and intraobserver agreement with a mean kappa of 0.78. Conclusion. The OBC is a simple MRI-based classification system with high inter- and intraobserver reliability. It could present a useful tool for grading the severity of trochlear dysplasia in daily practice. Cite this article: Bone Joint Open 2020;1-7:355–358
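Cohen's kappa, used here for two-observer agreement, compares the observed proportion of matching ratings with the proportion expected by chance from each rater's marginal frequencies. The sketch below shows the unweighted form only (the study also reports weighted statistics); the function name and the example grades are hypothetical and are not study data.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of the two raters' marginal proportions per category.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical OBC grades from two observers (illustrative only).
observer_1 = ["normal", "mild", "mild", "moderate", "severe", "severe", "mild", "normal"]
observer_2 = ["normal", "mild", "moderate", "moderate", "severe", "severe", "mild", "mild"]

print(f"kappa = {cohens_kappa(observer_1, observer_2):.2f}")
```

For ordered grades such as the OBC, a linearly weighted kappa would give partial credit to near-misses; that variant is not shown here.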


Bone & Joint Open
Vol. 3, Issue 11 | Pages 877 - 884
14 Nov 2022
Archer H Reine S Alshaikhsalama A Wells J Kohli A Vazquez L Hummer A DiFranco MD Ljuhar R Xi Y Chhabra A

Aims. Hip dysplasia (HD) leads to premature osteoarthritis. Timely detection and correction of HD has been shown to improve pain, functional status, and hip longevity. Several time-consuming radiological measurements are currently used to confirm HD. An artificial intelligence (AI) software named HIPPO automatically locates anatomical landmarks on anteroposterior pelvis radiographs and performs the needed measurements. The primary aim of this study was to assess the reliability of this tool as compared to multi-reader evaluation in clinically proven cases of adult HD. The secondary aims were to assess the time savings achieved and evaluate inter-reader assessment. Methods. A consecutive preoperative sample of 130 HD patients (256 hips) was used. This cohort included 82.3% females (n = 107) and 17.7% males (n = 23) with median patient age of 28.6 years (interquartile range (IQR) 22.5 to 37.2). Three trained readers’ measurements were compared to AI outputs of lateral centre-edge angle (LCEA), caput-collum-diaphyseal (CCD) angle, pelvic obliquity, Tönnis angle, Sharp’s angle, and femoral head coverage. Intraclass correlation coefficients (ICC) and Bland-Altman analyses were obtained. Results. Among 256 hips with AI outputs, all six hip AI measurements were successfully obtained. The AI-reader correlations were generally good (ICC 0.60 to 0.74) to excellent (ICC > 0.75). There was lower agreement for CCD angle measurement. Most widely used measurements for HD diagnosis (LCEA and Tönnis angle) demonstrated good to excellent inter-method reliability (ICC 0.71 to 0.86 and 0.82 to 0.90, respectively). The median reading time for the three readers and AI was 212 (IQR 197 to 230), 131 (IQR 126 to 147), 734 (IQR 690 to 786), and 41 (IQR 38 to 44) seconds, respectively. Conclusion. This study showed that AI-based software demonstrated reliable radiological assessment of patients with HD with significant interpretation-related time savings. 
Cite this article: Bone Jt Open 2022;3(11):877–884


The Bone & Joint Journal
Vol. 106-B, Issue 4 | Pages 372 - 379
1 Apr 2024
Straub J Staats K Vertesich K Kowalscheck L Windhager R Böhler C

Aims. Histology is widely used for diagnosis of persistent infection during reimplantation in two-stage revision hip and knee arthroplasty, although data on its utility remain scarce. Therefore, this study aims to assess the predictive value of permanent sections at reimplantation in relation to reinfection risk, and to compare results of permanent and frozen sections. Methods. We retrospectively collected data from 226 patients (90 hips, 136 knees) with periprosthetic joint infection who underwent two-stage revision between August 2011 and September 2021, with a minimum follow-up of one year. Histology was assessed via the SLIM classification. First, we analyzed whether patients with positive permanent sections at reimplantation had higher reinfection rates than patients with negative histology. Further, we compared permanent and frozen section results, and assessed the influence of anatomical regions (knee versus hip), low- versus high-grade infections, as well as first revision versus multiple prior revisions on the histological result at reimplantation. Sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), chi-squared tests, and Kaplan-Meier estimates were calculated. Results. Overall, the reinfection rate was 18%. A total of 14 out of 82 patients (17%) with positive permanent sections at reimplantation experienced reinfection, compared to 26 of 144 patients (18%) with negative results (p = 0.996). Neither permanent sections nor fresh frozen sections were significantly associated with reinfection, with a sensitivity of 0.35, specificity of 0.63, PPV of 0.17, NPV of 0.81, and accuracy of 58%. Histology was not significantly associated with reinfection or survival time for any of the analyzed sub-groups. Permanent and frozen section results were in agreement for 91% of cases. Conclusion. Permanent and fresh frozen sections at reimplantation in two-stage revision do not serve as a reliable predictor for reinfection. 
Cite this article: Bone Joint J 2024;106-B(4):372–379
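The diagnostic accuracy figures in this abstract follow from a 2×2 table that can be rebuilt from the reported counts (14 of 82 histology-positive and 26 of 144 histology-negative patients were reinfected). The sketch below recomputes them; the function name is ours, and it assumes that reinfection is the reference outcome and positive permanent section is the test result.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return standard 2x2 diagnostic accuracy measures as a dict."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


# Counts derived from the abstract (assumed mapping: test = positive permanent
# section, outcome = reinfection).
tp, fp = 14, 82 - 14    # positive histology: reinfected / not reinfected
fn, tn = 26, 144 - 26   # negative histology: reinfected / not reinfected

metrics = diagnostic_metrics(tp, fp, fn, tn)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Under this mapping the sensitivity, specificity, PPV, and accuracy match the reported values after rounding; the NPV computes to roughly 0.82 against the reported 0.81, a small difference that may reflect rounding or truncation in the source.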


The Bone & Joint Journal
Vol. 105-B, Issue 2 | Pages 209 - 214
1 Feb 2023
Aarvold A Perry DC Mavrotas J Theologis T Katchburian M

Aims. A national screening programme has existed in the UK for the diagnosis of developmental dysplasia of the hip (DDH) since 1969. However, every aspect of screening and treatment remains controversial. Screening programmes throughout the world vary enormously, and in the UK there is significant variation in screening practice and treatment pathways. We report the results of an attempt by the British Society for Children’s Orthopaedic Surgery (BSCOS) to identify a nationwide consensus for the management of DDH in order to unify treatment and suggest an approach for screening. Methods. A Delphi consensus study was performed among the membership of BSCOS. Statements were generated by a steering group regarding aspects of the management of DDH in children aged under three months, namely screening and surveillance (15 questions), the technique of ultrasound scanning (eight questions), the initiation of treatment (19 questions), care during treatment with a splint (ten questions), and on quality, governance, and research (eight questions). A two-round Delphi process was used and a consensus document was produced at the final meeting of the steering group. Results. A total of 60 statements were graded by 128 clinicians in the first round and 132 in the second round. Consensus was reached on 30 out of 60 statements in the first round and an additional 12 in the second. This was summarized in a consensus statement and distilled into a flowchart to guide clinical practice. Conclusion. We identified agreement in an area of medicine that has a long history of controversy and varied practice. None of the areas of consensus are based on high-quality evidence. This document is thus a framework to guide clinical practice and on which high-quality clinical trials can be developed. Cite this article: Bone Joint J 2023;105-B(2):209–214


Bone & Joint Open
Vol. 4, Issue 4 | Pages 262 - 272
11 Apr 2023
Batailler C Naaim A Daxhelet J Lustig S Ollivier M Parratte S

Aims. The impact of a diaphyseal femoral deformity on knee alignment varies according to its severity and localization. The aims of this study were to determine a method of assessing the impact of diaphyseal femoral deformities on knee alignment for the varus knee, and to evaluate the reliability and the reproducibility of this method in a large cohort of osteoarthritic patients. Methods. All patients who underwent a knee arthroplasty from 2019 to 2021 were included. Exclusion criteria were genu valgus, flexion contracture (> 5°), previous femoral osteotomy or fracture, total hip arthroplasty, and femoral rotational disorder. A total of 205 patients met the inclusion criteria. The mean age was 62.2 years (SD 8.4). The mean BMI was 33.1 kg/m² (SD 5.5). The radiological measurements were performed twice by two independent reviewers, and included hip knee ankle (HKA) angle, mechanical medial distal femoral angle (mMDFA), anatomical medial distal femoral angle (aMDFA), femoral neck shaft angle (NSA), femoral bowing angle (FBow), the distance between the knee centre and the top of the FBow (DK), and the angle representing the FBow impact on the knee (C’KS angle). Results. The FBow impact on the mMDFA can be measured by the C’KS angle. The C’KS angle took the localization (length DK) and the importance (FBow angle) of the FBow into consideration. The mean FBow angle was 4.4° (SD 2.4; 0 to 12.5). The mean C’KS angle was 1.8° (SD 1.1; 0 to 5.8). Overall, 84 knees (41%) had a severe FBow (> 5°). The radiological measurements showed very good to excellent intraobserver and interobserver agreements. The C’KS increased significantly when the length DK decreased and the FBow angle increased (p < 0.001). Conclusion. The impact of the diaphyseal femoral deformity on the mechanical femoral axis is measured by the C’KS angle, a reliable and reproducible measurement. Cite this article: Bone Jt Open 2023;4(4):262–272


Bone & Joint Research
Vol. 8, Issue 8 | Pages 357 - 366
1 Aug 2019
Zhang B Sun H Zhan Y He Q Zhu Y Wang Y Luo C

Objectives. CT-based three-column classification (TCC) has been widely used in the treatment of tibial plateau fractures (TPFs). In its updated version (updated three-column concept, uTCC), a fracture morphology-based injury mechanism was proposed for effective treatment guidance. In this study, the injury mechanism of TPFs is further explained, and its inter- and intraobserver reliability is evaluated to perfect the uTCC. Methods. The radiological images of 90 consecutive TPF patients were collected. A total of 47 men (52.2%) and 43 women (47.8%) with a mean age of 49.8 years (SD 12.4; 17 to 77) were enrolled in our study. Among them, 57 fractures were on the left side (63.3%) and 33 were on the right side (36.7%); no bilateral fracture existed. Four observers were chosen to classify or estimate independently these randomized cases according to the Schatzker classification, TCC, and injury mechanism. With two rounds of evaluation, the kappa values were calculated to estimate the inter- and intrareliability. Results. The overall inter- and intraobserver agreements of the injury mechanism were substantial (κ_inter = 0.699, κ_intra = 0.749, respectively). The initial position and the force direction, which are two components of the injury mechanism, had substantial agreement for both inter-reliability and intrareliability. The inter- and intraobserver agreements were lower in high-energy fractures (Schatzker types IV to VI; κ_inter = 0.605, κ_intra = 0.721) compared with low-energy fractures (Schatzker types I to III; κ_inter = 0.81, κ_intra = 0.832). The inter- and intraobserver agreements were relatively higher in one-column fractures (κ_inter = 0.759, κ_intra = 0.801) compared with two-column and three-column fractures. Conclusion. The complete theory of injury mechanism of TPFs was first put forward to make the TCC consummate. It demonstrates substantial inter- and intraobserver agreement generally. 
Furthermore, the injury mechanism can be promoted clinically. Cite this article: B-B. Zhang, H. Sun, Y. Zhan, Q-F. He, Y. Zhu, Y-K. Wang, C-F. Luo. Reliability and repeatability of tibial plateau fracture assessment with an injury mechanism-based concept. Bone Joint Res 2019;8:357–366. DOI: 10.1302/2046-3758.88.BJR-2018-0331.R1


The Bone & Joint Journal
Vol. 106-B, Issue 11 | Pages 1249 - 1256
1 Nov 2024
Mangwani J Houchen-Wolloff L Malhotra K Booth S Smith A Teece L Mason LW

Aims. Venous thromboembolism (VTE) is a potential complication of foot and ankle surgery. There is a lack of agreement on contributing risk factors and chemical prophylaxis requirements. The primary outcome of this study was to analyze the 90-day incidence of symptomatic VTE and VTE-related mortality in patients undergoing foot and ankle surgery and Achilles tendon (TA) rupture. Secondary aims were to assess the variation in the provision of chemical prophylaxis and risk factors for VTE. Methods. This was a multicentre, prospective national collaborative audit with data collection over nine months for all patients undergoing foot and ankle surgery in an operating theatre or TA rupture treatment, within participating UK hospitals. The association between VTE and thromboprophylaxis was assessed with a univariable logistic regression model. A multivariable logistic regression model was used to identify key predictors for the risk of VTE. Results. A total of 13,569 patients were included from 68 sites. Overall, 11,363 patients were available for analysis: 44.79% were elective (n = 5,090), 42.16% were trauma excluding TA ruptures (n = 4,791), 3.50% were acute diabetic procedures (n = 398), 2.44% were TA ruptures undergoing surgery (n = 277), and 7.10% were TA ruptures treated nonoperatively (n = 807). In total, 11 chemical anticoagulants were recorded, with the most common agent being low-molecular-weight heparin (n = 6,303; 56.79%). A total of 32.71% received no chemical prophylaxis. There were 99 cases of VTE (incidence 0.87% (95% CI 0.71 to 1.06)). VTE-related mortality was 0.03% (95% CI 0.005 to 0.080). Univariable analysis showed that increased age and American Society of Anesthesiologists (ASA) grade had higher odds of VTE, as did having previous cancer, stroke, or history of VTE. On multivariable analysis, the strongest predictors for VTE were the type of foot and ankle procedure and ASA grade. Conclusion. The 90-day incidence of symptomatic VTE and mortality related to VTE is low in foot and ankle surgery and TA management. There was notable variability in the chemical prophylaxis used. The significant risk factors associated with 90-day symptomatic VTE were TA rupture and high ASA grade. Cite this article: Bone Joint J 2024;106-B(11):1249–1256
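The subgroup percentages and the VTE incidence quoted in the abstract above can be recomputed from the reported counts. The following is a minimal sketch of that arithmetic only; the variable and function names are illustrative and not taken from the study, and the confidence interval is not reproduced because the abstract does not state which interval method was used.

```python
# Counts as reported in the abstract; dictionary keys are shorthand labels.
subgroups = {
    "elective": 5090,
    "trauma excluding TA ruptures": 4791,
    "acute diabetic procedures": 398,
    "TA ruptures, surgery": 277,
    "TA ruptures, nonoperative": 807,
}
analysed = 11363   # patients available for analysis
vte_cases = 99     # symptomatic VTE cases within 90 days


def percent(count, total):
    """Return count as a percentage of total, rounded to two decimals."""
    return round(100 * count / total, 2)


# The five subgroups should account for every analysed patient.
subgroup_total = sum(subgroups.values())

# Share of each subgroup and the overall VTE incidence, in percent.
shares = {name: percent(n, analysed) for name, n in subgroups.items()}
vte_incidence = percent(vte_cases, analysed)

for name, share in shares.items():
    print(f"{name}: {share:.2f}%")
print(f"VTE incidence: {vte_incidence:.2f}%")
```

The recomputed shares agree with the percentages given in the abstract, and the subgroup counts sum to the 11,363 patients analysed.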


Bone & Joint Open
Vol. 5, Issue 3 | Pages 236 - 242
22 Mar 2024
Guryel E McEwan J Qureshi AA Robertson A Ahluwalia R

Aims. Ankle fractures are common injuries and the third most common fragility fracture. In all, 40% of ankle fractures in the frail are open and represent a complex clinical scenario, with morbidity and mortality rates similar to hip fracture patients. They have a higher risk of complications, such as wound infections, malunion, hospital-acquired infections, pressure sores, veno-thromboembolic events, and significant sarcopaenia from prolonged bed rest. Methods. A modified Delphi method was used and a group of experts with a vested interest in best practice were invited from the British Foot and Ankle Society (BOFAS), British Orthopaedic Association (BOA), Orthopaedic Trauma Society (OTS), British Association of Plastic & Reconstructive Surgeons (BAPRAS), British Geriatric Society (BGS), and the British Limb Reconstruction Society (BLRS). Results. In the first stage, there were 36 respondents to the survey, with over 70% stating their unit treats more than 20 such cases per year. There was a 50:50 split regarding if the timing of surgery should be within 36 hours, as per the hip fracture guidelines, or 72 hours, as per the open fracture guidelines. Overall, 75% would attempt primary wound closure and 25% would utilize a local flap. There was no orthopaedic agreement on fixation, and 75% would permit weightbearing immediately. In the second stage, performed at the BLRS meeting, experts discussed the survey results and agreed upon a consensus for the management of open elderly ankle fractures. Conclusion. A mutually agreed consensus from the expert panel was reached to enable the best practice for the management of patients with frailty with an open ankle fracture: 1) all units managing lower limb fragility fractures should do so through a cohorted multidisciplinary pathway. This pathway should follow the standards laid down in the "care of the older or frail orthopaedic trauma patient" British Orthopaedic Association Standards for Trauma and Orthopaedics (BOAST) guideline. These patients have low bone density, and we should recommend full falls and bone health assessment; 2) all open lower limb fragility fractures should be treated in a single stage within 24 hours of injury if possible; 3) all patients with fragility fractures of the lower limb should be considered for mobilisation on the day following surgery; 4) all patients with lower limb open fragility fractures should be considered for tissue sparing, with judicious debridement as a default; 5) all patients with open lower limb fragility fractures should be managed by a consultant plastic surgeon with primary closure wherever possible; and 6) the method of fixation must allow for immediate unrestricted weightbearing. Cite this article: Bone Jt Open 2024;5(3):236–242


The Bone & Joint Journal
Vol. 103-B, Issue 5 | Pages 971 - 975
1 May 2021
Hurley P Azzopardi C Botchu R Grainger M Gardner A

Aims. The aim of this study was to assess the reliability of using MRI scans to calculate the Spinal Instability Neoplastic Score (SINS) in patients with metastatic spinal cord compression (MSCC). Methods. A total of 100 patients were retrospectively included in the study. The SINS score was calculated from each patient’s MRI and CT scans by two consultant musculoskeletal radiologists (reviewers 1 and 2) and one consultant spinal surgeon (reviewer 3). In order to avoid potential bias in the assessment, MRI scans were reviewed first. Bland-Altman analysis was used to identify the limits of agreement between the SINS scores from the MRI and CT scans for the three reviewers. Results. The limit of agreement between the SINS score from the MRI and CT scans for the reviewers was -0.11 for reviewer 1 (95% CI 0.82 to -1.04), -0.12 for reviewer 2 (95% CI 1.24 to -1.48), and -0.37 for reviewer 3 (95% CI 2.35 to -3.09). The use of MRI tended to increase the score when compared with that using the CT scan. No patient having their score calculated from MRI scans would have been classified as stable rather than intermediate or unstable when calculated from CT scans, potentially leading to suboptimal care. Conclusion. We found that MRI scans can be used to calculate the SINS score reliably, compared with the score from CT scans. The main difference between the scores derived from MRI and CT was in defining the type of bony lesion. This could be made easier by knowing the site of the primary tumour when calculating the score, or by using inverted T1-volumetric interpolated breath-hold examination MRI to assess the bone more reliably, similar to using CT. Cite this article: Bone Joint J 2021;103-B(5):971–975


The Bone & Joint Journal
Vol. 106-B, Issue 4 | Pages 412 - 418
1 Apr 2024
Alqarni AG Nightingale J Norrish A Gladman JRF Ollivere B

Aims. Frailty greatly increases the risk of adverse outcome of trauma in older people. Frailty detection tools appear to be unsuitable for use in traumatically injured older patients. We therefore aimed to develop a method for detecting frailty in older people sustaining trauma using routinely collected clinical data. Methods. We analyzed prospectively collected registry data from 2,108 patients aged ≥ 65 years who were admitted to a single major trauma centre over five years (1 October 2015 to 31 July 2020). We divided the sample equally into two, creating derivation and validation samples. In the derivation sample, we performed univariate analyses followed by multivariate regression, starting with 27 clinical variables in the registry to predict Clinical Frailty Scale (CFS; range 1 to 9) scores. Bland-Altman analyses were performed in the validation cohort to evaluate any biases between the Nottingham Trauma Frailty Index (NTFI) and the CFS. Results. In the derivation cohort, five of the 27 variables were strongly predictive of the CFS (regression coefficient B = 6.383 (95% confidence interval 5.03 to 7.74), p < 0.001): age, Abbreviated Mental Test score, admission haemoglobin concentration (g/l), pre-admission mobility (needs assistance or not), and mechanism of injury (falls from standing height). In the validation cohort, there was strong agreement between the NTFI and the CFS (mean difference 0.02) with no apparent systematic bias. Conclusion. We have developed a clinically applicable tool using easily and routinely measured physiological and functional parameters, which clinicians and researchers can use to guide patient care and to stratify the analysis of quality improvement and research projects. Cite this article: Bone Joint J 2024;106-B(4):412–418


The Bone & Joint Journal
Vol. 103-B, Issue 4 | Pages 775 - 781
1 Apr 2021
Mellema JJ Janssen S Schouten T Haverkamp D van den Bekerom MPJ Ring D Doornberg JN

Aims. This study evaluated variation in the surgical treatment of stable (A1) and unstable (A2) trochanteric hip fractures among an international group of orthopaedic surgeons, and determined the influence of patient, fracture, and surgeon characteristics on choice of implant (intramedullary nailing (IMN) versus sliding hip screw (SHS)). Methods. A total of 128 orthopaedic surgeons in the Science of Variation Group evaluated radiographs of 30 patients with Type A1 and A2 trochanteric hip fractures and indicated their preferred treatment: IMN or SHS. The management of Type A3 (reverse obliquity) trochanteric fractures was not evaluated. Agreement between surgeons was calculated using multirater kappa. Multivariate logistic regression models were used to assess whether patient, fracture, and surgeon characteristics were independently associated with choice of implant. Results. The overall agreement between surgeons on implant choice was fair (kappa = 0.27 (95% confidence interval (CI) 0.25 to 0.28)). Factors associated with preference for IMN included USA compared to Europe or the UK (Europe odds ratio (OR) 0.56 (95% CI 0.47 to 0.67); UK OR 0.16 (95% CI 0.12 to 0.22); p < 0.001); exposure to IMN only during training compared to surgeons that were exposed to both (only IMN during training OR 2.6 (95% CI 2.0 to 3.4); p < 0.001); and A2 compared to A1 fractures (Type A2 OR 10 (95% CI 8.4 to 12); p < 0.001). Conclusion. In an international cohort of orthopaedic surgeons, there was a large variation in implant preference for patients with A1 and A2 trochanteric fractures. This is due to surgeon bias (country of practice and aspects of training). The observation that surgeons favoured the more expensive implant (IMN) in the absence of convincing evidence of its superiority suggests that surgeon de-biasing strategies may be a useful focus for optimizing patient outcomes and promoting value-based healthcare. Cite this article: Bone Joint J 2021;103-B(4):775–781


Bone & Joint Research
Vol. 9, Issue 5 | Pages 242 - 249
1 May 2020
Bali K Smit K Ibrahim M Poitras S Wilkin G Galmiche R Belzile E Beaulé PE

Aims. The aim of the current study was to assess the reliability of the Ottawa classification for symptomatic acetabular dysplasia. Methods. In all, 134 consecutive hips that underwent periacetabular osteotomy were categorized using a validated software (Hip2Norm) into four categories of normal, lateral/global, anterior, or posterior. A total of 74 cases were selected for reliability analysis, and these included 44 dysplastic and 30 normal hips. A group of six blinded fellowship-trained raters, provided with the classification system, looked at these radiographs at two separate timepoints to classify the hips using standard radiological measurements. Thereafter, a consensus meeting was held where a modified flow diagram was devised, before a third reading by four raters using a separate set of 74 radiographs took place. Results. Intrarater results per surgeon between Time 1 and Time 2 showed substantial to almost perfect agreement among the raters (kappa = 0.416 to 0.873). With respect to inter-rater reliability, at Time 1 and Time 2 there was substantial agreement overall between all surgeons (Time 1 kappa = 0.619; Time 2 kappa = 0.623). Posterior and anterior rating categories had moderate and fair agreement at Time 1 (posterior kappa = 0.557; anterior kappa = 0.438) and Time 2 (posterior kappa = 0.506; anterior kappa = 0.250), respectively. At Time 3, overall reliability (kappa = 0.687) and posterior and anterior reliability (posterior kappa = 0.579; anterior kappa = 0.521) improved from Time 1 and Time 2. Conclusion. The Ottawa classification system provides a reliable way to identify three categories of acetabular dysplasia that are well-aligned with surgical management. The term ‘borderline dysplasia’ should no longer be used. Cite this article: Bone Joint Res. 2020;9(5):242–249


The Bone & Joint Journal
Vol. 104-B, Issue 8 | Pages 963 - 971
1 Aug 2022
Sun Z Liu W Liu H Li J Hu Y Tu B Wang W Fan C

Aims. Heterotopic ossification (HO) is a common complication after elbow trauma and can cause severe upper limb disability. Although multiple prognostic factors have been reported to be associated with the development of post-traumatic HO, no model has yet been able to combine these predictors more succinctly to convey prognostic information and medical measures to patients. Therefore, this study aimed to identify prognostic factors leading to the formation of HO after surgery for elbow trauma, and to establish and validate a nomogram to predict the probability of HO formation in such particular injuries. Methods. This multicentre case-control study comprised 200 patients with post-traumatic elbow HO and 229 patients who had elbow trauma but without HO formation between July 2019 and December 2020. Features possibly associated with HO formation were obtained. The least absolute shrinkage and selection operator regression model was used to optimize feature selection. Multivariable logistic regression analysis was applied to build the new nomogram: the Shanghai post-Traumatic Elbow Heterotopic Ossification Prediction model (STEHOP). STEHOP was validated by concordance index (C-index) and calibration plot. Internal validation was conducted using bootstrapping validation. Results. Male sex, obesity, open wound, dislocations, late definitive surgical treatment, and lack of use of non-steroidal anti-inflammatory drugs were identified as adverse predictors and incorporated to construct the STEHOP model. It displayed good discrimination with a C-index of 0.80 (95% confidence interval 0.75 to 0.84). A high C-index value of 0.77 could still be reached in the internal validation. The calibration plot showed good agreement between nomogram prediction and observed outcomes. Conclusion. The newly developed STEHOP model is a valid and convenient instrument to predict HO formation after surgery for elbow trauma. It could assist clinicians in counselling patients regarding treatment expectations and therapeutic choices. Cite this article: Bone Joint J 2022;104-B(8):963–971


The Bone & Joint Journal
Vol. 100-B, Issue 2 | Pages 242 - 246
1 Feb 2018
Ghoshal A Enninghorst N Sisak K Balogh ZJ

Aims. To evaluate interobserver reliability of the Orthopaedic Trauma Association’s open fracture classification system (OTA-OFC). Patients and Methods. Patients of any age with a first presentation of an open long bone fracture were included. Standard radiographs, wound photographs, and a short clinical description were given to eight orthopaedic surgeons, who independently evaluated the injury using both the Gustilo and Anderson (GA) and OTA-OFC classifications. The responses were compared for variability using Cohen’s kappa. Results. The overall interobserver agreement was κ = 0.44 for the GA classification and κ = 0.49 for OTA-OFC, which reflects moderate agreement (0.41 to 0.60) for both classifications. The agreement in the five categories of OTA-OFC was: for skin, κ = 0.55 (moderate); for muscle, κ = 0.44 (moderate); for arterial injury, κ = 0.74 (substantial); for contamination, κ = 0.35 (fair); and for bone loss, κ = 0.41 (moderate). Conclusion. Although the OTA-OFC, with similar interobserver agreement to GA, offers a more detailed description of open fractures, further development may be needed to make it a reliable and robust tool. Cite this article: Bone Joint J 2018;100-B:242–6
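As an illustration of the statistic used in the abstract above, the sketch below computes Cohen's kappa for a single pair of raters and maps a kappa value onto the commonly used Landis and Koch descriptive bands, which match the labels quoted in the abstract (for example, 0.41 to 0.60 as moderate). The ratings in the usage lines are hypothetical and are not data from the study; how the study combined kappa across its eight observers is not stated in the abstract and is not modelled here.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same cases.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e the agreement expected by chance from each rater's marginals.
    Undefined (division by zero) if p_e equals 1.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty case list")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    return (observed - expected) / (1 - expected)


def landis_koch(kappa):
    """Descriptive label for a kappa value (Landis and Koch bands)."""
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical gradings of six fractures by two surgeons (not study data).
surgeon_1 = ["I", "II", "II", "III", "I", "II"]
surgeon_2 = ["I", "II", "III", "III", "I", "I"]
example_kappa = cohen_kappa(surgeon_1, surgeon_2)
print(round(example_kappa, 2), landis_koch(example_kappa))
```

Applied to the values reported in the abstract, `landis_koch` returns the same labels the authors give: 0.44 and 0.55 are moderate, 0.74 is substantial, and 0.35 is fair.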


The Bone & Joint Journal
Vol. 102-B, Issue 3 | Pages 365 - 370
1 Mar 2020
Min KS Fox HM Bedi A Walch G Warner JJP

Aims. Patient-specific instrumentation has been shown to increase a surgeon’s precision and accuracy in placing the glenoid component in shoulder arthroplasty. There is, however, little available information about the use of patient-specific planning (PSP) tools for this operation. It is not known how these tools alter the decision-making patterns of shoulder surgeons. The aim of this study was to investigate whether PSP, when compared with the use of plain radiographs or select static CT images, influences the understanding of glenoid pathology and surgical planning. Methods. A case-based survey presented surgeons with a patient’s history, physical examination, and, sequentially, radiographs, select static CT images, and PSP with a 3D imaging program. For each imaging modality, the surgeons were asked to identify the Walch classification of the glenoid and to propose the surgical treatment. The participating surgeons were grouped according to the annual volume of shoulder arthroplasties that they undertook, and responses were compared with the recommendations of two experts. Results. A total of 59 surgeons completed the survey. For all surgeons, the use of the PSP significantly increased agreement with the experts in glenoid classification (χ² = 8.54; p = 0.014) and surgical planning (χ² = 37.91; p < 0.001). The additional information provided by the PSP also showed a significantly higher impact on surgical decision-making for surgeons who undertake fewer than ten shoulder arthroplasties annually (p = 0.017). Conclusions. The information provided by PSP has the greatest impact on the surgical decision-making of low volume surgeons (those who perform fewer than ten shoulder arthroplasties annually), and PSP brings all surgeons into closer agreement with the recommendations of experts for glenoid classification and surgical planning. Cite this article: Bone Joint J 2020;102-B(3):365–370


The Bone & Joint Journal
Vol. 104-B, Issue 4 | Pages 486 - 494
4 Apr 2022
Liu W Sun Z Xiong H Liu J Lu J Cai B Wang W Fan C

Aims. The aim of this study was to develop and internally validate a prognostic nomogram to predict the probability of gaining a functional range of motion (ROM ≥ 120°) after open arthrolysis of the elbow in patients with post-traumatic stiffness of the elbow. Methods. We developed the Shanghai Prediction Model for Elbow Stiffness Surgical Outcome (SPESSO) based on a dataset of 551 patients who underwent open arthrolysis of the elbow in four institutions. Demographic and clinical characteristics were collected from medical records. The least absolute shrinkage and selection operator regression model was used to optimize the selection of relevant features. Multivariable logistic regression analysis was used to build the SPESSO. Its prediction performance was evaluated using the concordance index (C-index) and a calibration graph. Internal validation was conducted using bootstrapping validation. Results. BMI, the duration of stiffness, the preoperative ROM, the preoperative intensity of pain, and grade of post-traumatic osteoarthritis of the elbow were identified as predictors of outcome and incorporated to construct the nomogram. SPESSO displayed good discrimination with a C-index of 0.73 (95% confidence interval 0.64 to 0.81). A high C-index value of 0.70 could still be reached in the internal validation. The calibration graph showed good agreement between the nomogram prediction and the outcome. Conclusion. The newly developed SPESSO is a valid and convenient model which can be used to predict the outcome of open arthrolysis of the elbow. It could assist clinicians in counselling patients regarding the choice and expectations of treatment. Cite this article: Bone Joint J 2022;104-B(4):486–494


Bone & Joint Research
Vol. 5, Issue 8 | Pages 347 - 352
1 Aug 2016
Nuttall J Evaniew N Thornley P Griffin A Deheshi B O’Shea T Wunder J Ferguson P Randall RL Turcotte R Schneider P McKay P Bhandari M Ghert M

Objectives. The diagnosis of surgical site infection following endoprosthetic reconstruction for bone tumours is frequently a subjective diagnosis. Large clinical trials use blinded Central Adjudication Committees (CACs) to minimise the variability and bias associated with assessing a clinical outcome. The aim of this study was to determine the level of inter-rater and intra-rater agreement in the diagnosis of surgical site infection in the context of a clinical trial. Materials and Methods. The Prophylactic Antibiotic Regimens in Tumour Surgery (PARITY) trial CAC adjudicated 29 non-PARITY cases of lower extremity endoprosthetic reconstruction. The CAC members classified each case according to the Centers for Disease Control (CDC) criteria for surgical site infection (superficial, deep, or organ space). Combinatorial analysis was used to calculate the smallest CAC panel size required to maximise agreement. A final meeting was held to establish a consensus. Results. Full or near consensus was reached in 20 of the 29 cases. The Fleiss kappa value was calculated as 0.44 (95% confidence interval (CI) 0.35 to 0.53), or moderate agreement. The greatest statistical agreement was observed in the outcome of no infection, 0.61 (95% CI 0.49 to 0.72, substantial agreement). Panelists reached a full consensus in 12 of 29 cases and near consensus in five of 29 cases when CDC criteria were used (superficial, deep or organ space). A stable maximum Fleiss kappa of 0.46 (95% CI 0.50 to 0.35) at CAC sizes greater than three members was obtained. Conclusions. There is substantial agreement among the members of the PARITY CAC regarding the presence or absence of surgical site infection. Agreement on the level of infection, however, is more challenging. Additional clinical information routinely collected by the prospective PARITY trial may improve the discriminatory capacity of the CAC in the parent study for the diagnosis of infection. Cite this article: J. Nuttall, N. Evaniew, P. Thornley, A. Griffin, B. Deheshi, T. O’Shea, J. Wunder, P. Ferguson, R. L. Randall, R. Turcotte, P. Schneider, P. McKay, M. Bhandari, M. Ghert. The inter-rater reliability of the diagnosis of surgical site infection in the context of a clinical trial. Bone Joint Res 2016;5:347–352. DOI: 10.1302/2046-3758.58.BJR-2016-0036.R1


Aims. The aim of this study was to compare patient-reported outcome measures (PROMs) and the Single Assessment Numerical Evaluation (SANE) score in patients treated with a volar locking plate for a distal radial fracture. Methods. This study was a retrospective review of a prospective database of 155 patients who underwent internal fixation with a volar locking plate for a distal radial fracture between August 2014 and April 2017. Data which were collected included postoperative PROMs (Disabilities of the Arm, Shoulder, and Hand questionnaire (DASH) and Patient-Rated Wrist Evaluation (PRWE)), and SANE scores at one month (n = 153), two months (n = 155), three months (n = 144), six months (n = 128), and one year (n = 73) after operation. Patients with incomplete data were excluded from this study. Correlation and agreement between PROMs and SANE scores were evaluated. Subgroup analyses were carried out to identify correlations according to variables such as age, the length of follow-up, and subcategories of the PRWE score. Results. The Pearson correlation coefficient (r) between PROMs and SANE scores was -0.76 (p < 0.001) for DASH and -0.72 (p < 0.001) for PRWE, respectively. Limits of agreement between PROMs and ‘100-SANE’ scores were met for at least 93% of the data points. In subgroup analysis, there were significant negative correlations between PROMs and SANE scores for all age groups and for follow-up of more than six months. The correlation coefficient between PRWE subcategories and SANE score was -0.67 (p < 0.001) for PRWE pain score and -0.69 (p < 0.001) for PRWE function score, respectively. Conclusion. We found a significant correlation between postoperative SANE and PROMs in patients treated with a volar locking plate for a distal radial fracture. The SANE score is thus a reliable indicator of outcome for patients who undergo surgical treatment for a radial fracture. Cite this article: Bone Joint J 2020;102-B(6):744–748


The Bone & Joint Journal
Vol. 102-B, Issue 1 | Pages 33 - 41
1 Jan 2020
Norman JG Brealey S Keding A Torgerson D Rangan A

Aims. The aim of this study was to explore whether time to surgery affects functional outcome in displaced proximal humeral fractures. Methods. A total of 250 patients presenting within three weeks of sustaining a displaced proximal humeral fracture involving the surgical neck were recruited at 32 acute NHS hospitals in the United Kingdom between September 2008 and April 2011. Of the 125 participants, 109 received surgery (fracture fixation or humeral head replacement) as per randomization. Data were included for 101 and 67 participants at six-month and five-year follow-up, respectively. Oxford Shoulder Scores (OSS) collected at six, 12, and 24 months and at three, four, and five years following randomization were plotted against time to surgery. Long-term recovery was explored by plotting six-month scores against five-year scores and agreement was illustrated with a Bland-Altman plot. Results. The mean time from initial trauma to surgery was 10.5 days (1 to 33). Earlier surgical intervention did not improve OSS throughout follow-up, nor when stratified by participant age (< 65 years vs ≥ 65 years) and fracture severity (one- and two-part vs three- and four-part fractures). Participants managed later than reported international averages (three days in the United States and Germany, eight days in the United Kingdom) did not have worse outcomes. At five-year follow-up, 50 participants (76%) had the same or improved OSS compared with six months (six-month mean OSS 35.8 (SD 10.0); five-year mean OSS 40.1 (SD 9.1); r = 0.613). A Bland-Altman plot demonstrated a positive mean difference (3.3 OSS points (SD 7.92)) with wide 95% limits of agreement (-12.2 and 18.8 points). Conclusion. Timing of surgery did not affect OSS at any stage of follow-up, irrespective of age or fracture type. Most participants had maximum functional outcome at six months that was maintained at five years. These findings may help guide providers of trauma services on surgical prioritization. Cite this article: Bone Joint J 2020;102-B(1):33–41
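The 95% limits of agreement in a Bland-Altman analysis are conventionally the mean difference plus and minus 1.96 standard deviations of the differences. The sketch below applies that formula to the mean difference and SD reported in the abstract above; the function name is illustrative, and the assumption that the authors used the conventional 1.96 multiplier is inferred from the reported figures, not stated in the abstract.

```python
def limits_of_agreement(mean_diff, sd_diff, z=1.96):
    """Bland-Altman limits of agreement: mean_diff -/+ z * sd_diff.

    z = 1.96 gives the conventional 95% limits under a normality assumption.
    """
    half_width = z * sd_diff
    return mean_diff - half_width, mean_diff + half_width


# Values reported in the abstract: mean difference 3.3 OSS points, SD 7.92.
lower, upper = limits_of_agreement(3.3, 7.92)
print(f"95% limits of agreement: {lower:.1f} to {upper:.1f} OSS points")
```

With these inputs the limits round to -12.2 and 18.8 points, which agrees with the values quoted in the abstract.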


Bone & Joint Open
Vol. 2, Issue 11 | Pages 900 - 908
3 Nov 2021
Saunders P Smith N Syed F Selvaraj T Waite J Young S

Aims. Day-case arthroplasty is gaining popularity in Europe. We report outcomes from the first 12 months following implementation of a day-case pathway for unicompartmental knee arthroplasty (UKA) and total hip arthroplasty (THA) in an NHS hospital. Methods. A total of 47 total hip arthroplasty (THA) and 24 unicompartmental knee arthroplasty (UKA) patients were selected for the day-case arthroplasty pathway, based on preoperative fitness and agreement to participate. Data were likewise collected for a matched control group (n = 58) who followed the standard pathway three months prior to the implementation of the day-case pathway. We report same-day discharge (SDD) success, reasons for delayed discharge, and patient-reported outcomes. Overall length of stay (LOS) for all lower limb arthroplasty was recorded to determine the wider impact of implementing a day-case pathway. Results. Patients on the day-case pathway achieved SDD in 47% (22/47) of THAs and 67% (16/24) of UKAs. The most common reasons for failed SDD were nausea, hypotension, and pain, which were strongly associated with the use of fentanyl in the spinal anaesthetic. Complications and patient-reported outcomes were not significantly different between groups. Following the introduction of the day-case pathway, the mean LOS reduced significantly by 0.7, 0.6, and 0.5 days respectively in THA, UKA, and total knee arthroplasty cases (p < 0.001). Conclusion. Day-case pathways are feasible in an NHS set-up with only small changes required. We do not recommend fentanyl in the spinal anaesthetic for day-case patients. An important benefit seen in our unit is the so-called ‘day-case effect’, with a significant reduction in mean LOS seen across all lower limb arthroplasty. Cite this article: Bone Jt Open 2021;2(11):900–908


The Bone & Joint Journal
Vol. 103-B, Issue 9 | Pages 1479 - 1487
1 Sep 2021
Davis ET Pagkalos J Kopjar B

Aims. The aim of our study was to investigate the effect of asymmetric crosslinked polyethylene liner use on the risk of revision of cementless and hybrid total hip arthroplasties (THAs). Methods. We undertook a registry study combining the National Joint Registry dataset with polyethylene manufacturing characteristics as supplied by the manufacturers. The primary endpoint was revision for any reason. We performed further analyses on other reasons including instability, aseptic loosening, wear, and liner dissociation. The primary analytic approach was Cox proportional hazard regression. Results. A total of 213,146 THAs were included in the analysis. Overall, 2,997 revisions were recorded, 1,569 in THAs with a flat liner and 1,428 in THAs using an asymmetric liner. Flat liner THAs had a higher risk of revision for any reason than asymmetric liner THAs when implanted through a Hardinge/anterolateral approach (hazard ratio (HR) 1.169, 95% confidence interval (CI) 1.022 to 1.337) and through a posterior approach (HR 1.122, 95% CI 1.108 to 1.346). There was no increased risk of revision for aseptic loosening when asymmetric liners were used for any surgical approach. A separate analysis of the three most frequently used crosslinked polyethylene liners was in agreement with this finding. When analyzing THAs with flat liners only, THAs implanted through a Hardinge/anterolateral approach were associated with a reduced risk of revision for instability compared to posterior approach THAs (HR 0.561 (95% CI 0.446 to 0.706)). When analyzing THAs with an asymmetric liner, there was no significant difference in the risk of revision for instability between the two approaches (HR 0.838 (95% CI 0.633 to 1.110)). Conclusion. For THAs implanted through the posterior approach, the use of asymmetric liners reduces the risk of revision for instability and revision for any reason. In THAs implanted through a Hardinge/anterolateral approach, the use of an asymmetric liner was associated with a reduced risk of revision. The effect on revision for instability was less pronounced than in the posterior approach. Cite this article: Bone Joint J 2021;103-B(9):1479–1487