Orthopaedic Proceedings
Vol. 104-B, Issue SUPP_13 | Pages 36 - 36
1 Dec 2022
Benavides B Cornell D Schneider P Hildebrand K
Full Access

Heterotopic ossification (HO) is a well-known complication of traumatic elbow injuries. The reported rates of post-traumatic HO formation vary from less than 5% with simple elbow dislocations, to greater than 50% in complex fracture-dislocations. Previous studies have identified fracture-dislocations, delayed surgical intervention, and terrible triad injuries as risk factors for HO formation. There is, however, a paucity of literature regarding the accuracy of diagnosing post-traumatic elbow HO. Therefore, the purpose of our study was to determine the inter-rater reliability of HO diagnosis using standard radiographs of the elbow at 52 weeks post-injury, as well as to report on the rate of mature compared with immature HO. We hypothesized inter-rater reliability would be poor among raters for HO formation. Prospectively collected data from a large clinical trial was reviewed by three independent reviewers (one senior orthopedic resident, one senior radiology resident, and one expert upper extremity orthopedic surgeon). Each reviewer examined anonymized 52-week post-injury radiographs of the elbow and recorded: 1. the presence or absence of HO, 2. the location of HO, 3. the size of the HO (in cm, if present), and 4. the maturity of the HO formation. Maturity was defined by consensus prior to image review and defined as an area of well-defined cortical and medullary bone outside the cortical borders of the humerus, ulna, or radius. Immature lesions were defined as an area of punctate calcification with an ill-defined cloud-like density outside the cortical borders of the humerus, ulna or radius. Data were collected using a standardized online data collection form (CognizantMD, Toronto, ON, CA). Inter-rater reliability was calculated using Fleiss’ Kappa statistic and a multivariate logistic regression analysis was performed to identify risk factors for HO formation in general, as well as mature HO at 52 weeks post injury. 
Statistical analysis was performed using RStudio (version 1.4, RStudio, Boston, MA, USA). A total of 79 radiographs at the 52-week follow-up were reviewed (54% male, mean age 50, age SD 14, 52% operatively treated). Inter-rater reliability using Fleiss’ Kappa was κ = 0.571 (p = 0.0004), indicating moderate inter-rater reliability among the three reviewers. The rate of immature HO at 52 weeks was 56%. The multivariate logistic regression analysis identified male sex as a significant risk factor for HO development (OR 5.29, CI 1.55-20.59, p = 0.011), but not for HO maturity at 52 weeks. Age, time to surgery, and operative intervention were not found to be significant predictors for either HO formation or maturity of the lesion in this cohort. Our study demonstrates moderate inter-rater reliability in determining the presence of HO at 52 weeks post-elbow injury. There was a high rate (56%) of immature HO at 52-week follow-up. We also report the finding of male sex as a significant risk factor for post-traumatic HO development. Future research directions could include investigation into possible male predominance for traumatic HO formation, as well as improving inter-rater reliability through developing a standardized and validated classification system for reporting the radiographic features of HO formation around the elbow.
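For readers unfamiliar with the statistic, the following is a minimal Python sketch of Fleiss' kappa for a fixed number of raters. It is not the authors' analysis (which the abstract says was run in RStudio); the function name `fleiss_kappa` and the `example_ratings` values are hypothetical and chosen only for illustration.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of subjects, each rated by the same number of raters.

    `ratings` is a list of lists: one inner list of category labels per subject.
    """
    n_subjects = len(ratings)
    n_raters = len(ratings[0])
    categories = sorted({label for row in ratings for label in row})

    # counts[i][j] = number of raters who put subject i into category j
    counts = [[Counter(row)[c] for c in categories] for row in ratings]

    # Observed agreement for each subject, then averaged over subjects
    per_subject = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(per_subject) / n_subjects

    # Agreement expected by chance from the overall category proportions
    proportions = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(len(categories))
    ]
    p_expected = sum(p * p for p in proportions)

    # Note: undefined (division by zero) if every rating falls in one category
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical data: 1 = HO present, 0 = HO absent, three raters per radiograph
example_ratings = [[1, 1, 1], [0, 0, 0], [1, 1, 0], [0, 0, 1]]
example_kappa = fleiss_kappa(example_ratings)
```

Values near 1 indicate agreement well beyond chance; the verbal labels used in the abstract ("moderate") follow a conventional interpretation scale rather than anything computed here.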


Orthopaedic Proceedings
Vol. 105-B, Issue SUPP_4 | Pages 3 - 3
3 Mar 2023
Roy K Joshi P Ali I Shenoy P Syed A Barlow D Malek I Joshi Y
Full Access

Classifying trochlear dysplasia (TD) is useful to determine the treatment options for patients suffering from patellofemoral instability (PFI). There is no consensus on which classification system is more reliable and reproducible for guiding clinicians in the treatment of PFI. There are also concerns about the validity of the Dejour classification (DJC), which is the most widely used classification for TD, having only a fair reliability score. The Oswestry-Bristol classification (OBC) is a recently proposed system of classification of TD, and the authors report a fair-to-good interobserver agreement and good-to-excellent intra-observer agreement in the assessment of TD. The aim of this study was to compare the reliability and reproducibility of these two classifications. Six assessors (4 consultants and 2 registrars) independently evaluated 100 magnetic resonance axial images of the patella-femoral joint for TD and classified them according to OBC and DJC. These assessments were repeated by all raters after 4 weeks. The inter- and intra-observer reliability scores were calculated using Cohen's kappa and Cronbach's alpha. Both classifications showed good to excellent interobserver reliability with high alpha scores. The OBC classification showed a substantial intra-observer agreement (mean kappa 0.628) [p<0.005] whereas the DJC showed a moderate agreement (mean kappa 0.572) [p<0.005]. There was no significant difference in the kappa values when comparing the assessments by consultants to those by registrars, in either classification system. This large study from a non-founding institute shows both classification systems to be reliable for classifying TD based on magnetic resonance axial images of the patella-femoral joint, with the simple to use OBC having a higher intra-observer reliability score compared to the DJC.
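To make the intra-observer statistic concrete, here is a small Python sketch of unweighted Cohen's kappa comparing one assessor's first and second readings. It is an illustration only; the abstract does not say whether a weighted variant was used, and the function name `cohen_kappa` and the grade labels below are hypothetical.

```python
from collections import Counter


def cohen_kappa(first, second):
    """Unweighted Cohen's kappa between two equally long lists of category labels."""
    n = len(first)

    # Proportion of cases where both readings give the same category
    observed = sum(a == b for a, b in zip(first, second)) / n

    # Chance agreement from the marginal frequencies of each reading
    freq_first, freq_second = Counter(first), Counter(second)
    labels = set(first) | set(second)
    expected = sum(freq_first[c] * freq_second[c] for c in labels) / (n * n)

    return (observed - expected) / (1 - expected)


# Hypothetical grades assigned to eight images at week 0 and again at week 4
reading_1 = [1, 2, 3, 4, 1, 2, 3, 4]
reading_2 = [1, 2, 3, 4, 1, 2, 3, 3]
example_kappa = cohen_kappa(reading_1, reading_2)
```

A mean kappa such as those reported above would be obtained by computing this value per assessor and averaging across assessors.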


Orthopaedic Proceedings
Vol. 102-B, Issue SUPP_7 | Pages 83 - 83
1 Jul 2020
Bali K Smit K Beaulé P Wilkin G Poitras S Ibrahim M
Full Access

Hip dysplasia has traditionally been classified based on the lateral centre edge angle (LCEA). A recent meta-analysis demonstrated no definite consensus and a significant heterogeneity in LCEA values used in various studies to define hip dysplasia and borderline dysplasia. To overcome the shortcomings of classifying hip dysplasia based on just LCEA, a comprehensive classification for adult acetabular dysplasia (CCAD) was proposed to classify symptomatic hips into three discrete prototypical patterns of hip instability: lateral/global, anterior, or posterior. The purpose of this study was to assess the reliability of this recently published CCAD. One hundred and thirty-four consecutive hips that underwent a PAO were categorized using validated software (Hip2Norm) into four categories: normal, lateral/global, anterior, or posterior. Based on the prevalence of individual dysplasia and using KappaSize R package version 1.1, seventy-four cases were necessary for reliability analysis: 44 dysplastic and 30 normal hips were randomly selected. Six blinded fellowship-trained raters were then provided with the classification system and they looked at the x-rays (74 images) at two separate time points (minimum two weeks apart) to classify the hips using standard PACS measurements. Thereafter, a consensus meeting was held where a simplified flow diagram was devised before a third reading by four raters using a separate set of 74 radiographs took place. Intra-rater results per surgeon between Time 1 and Time 2 showed substantial to almost perfect agreement amongst the raters. With respect to inter-rater reliability, at time 1 and time 2, there was substantial agreement overall between all surgeons (kappa of 0.619 for time 1 and 0.623 for time 2). Posterior and anterior rating categories had moderate and fair agreement at time 1 and time 2, respectively. At time 3, overall reliability (kappa of 0.687) and posterior and anterior rating improved from Time 1 and Time 2.
The comprehensive classification system provides a reliable way to identify three categories of acetabular dysplasia that are well-aligned with surgical management. The term borderline dysplasia should no longer be used.


Orthopaedic Proceedings
Vol. 105-B, Issue SUPP_17 | Pages 71 - 71
24 Nov 2023
Heesterbeek P Pruijn N Boks S van Bokhoven S Dorrestijn O Schreurs W Telgt D
Full Access

Aim. Diagnosis of periprosthetic shoulder infections (PSI) is difficult as they are mostly caused by low-virulent bacteria and patients do not show typical infection signs, such as elevated blood markers, wound leakage, or red and swollen skin. Ultrasound-guided biopsies for culture may therefore be an alternative to mini-open biopsies as a less costly and less invasive method. The aim of this study was to determine the diagnostic value and reliability of ultrasound-guided biopsies for cultures alone and in combination with polymerase chain reaction (PCR) and/or synovial markers for preoperative diagnosis of PSI in patients undergoing revision shoulder surgery. Method. A prospective explorative diagnostic cohort study was performed including patients undergoing revision shoulder replacement surgery. A shoulder puncture was taken preoperatively before incision to collect synovial fluid for determination of interleukin-6 (IL-6), calprotectin, WBC, and polymorphonuclear cells. Prior to revision surgery, six ultrasound-guided synovial tissue biopsies were collected for culture and two additional for PCR analysis. Six routine care tissue biopsies were taken during revision surgery and served as reference standard. Sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV; primary outcome measure), and accuracy were calculated for ultrasound-guided biopsies, synovial markers, and combinations of these. Results. Fifty-five patients were included. In 24 patients, routine tissue cultures were positive for infection. Cultures from ultrasound-guided biopsies diagnosed an infection in 7 of these patients, yielding a sensitivity, specificity, PPV, NPV, and accuracy of 29.2%, 93.5%, 77.8%, 63.0%, and 65.6%, respectively. Ultrasound-guided biopsies in combination with synovial WBC increased the NPV to 76.7% and accuracy to 73.8%. 
When synovial WBC and calprotectin were combined with ultrasound-guided biopsies, it resulted in a better diagnostic value: sensitivity 69.2%, specificity 80.0%, PPV 69.2%, NPV 80.0%, and accuracy 75.8%. Ultrasound-guided biopsies in combination with calprotectin and ESR yielded a sensitivity of 50.0%, specificity of 93.8%, PPV of 80.0%, NPV of 78.9%, and accuracy of 79.2%. Synovial fluid was obtained in 42 patients. Sensitivities of WBC, PMN, IL-6, and calprotectin were between 25.0% and 35.7%, specificities between 89.5% and 95.0%, PPVs between 60.0% and 83.3%, NPVs between 65.4% and 69.4%, and accuracies between 64.5% and 70.6%. Conclusions. In this prospective study we showed that ultrasound-guided biopsies for cultures alone and in combination with PCR and/or synovial markers are not reliable enough to use in clinical practice for the preoperative diagnosis of low grade PSI
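The diagnostic measures reported above all derive from a 2×2 table against the reference standard. The Python sketch below shows the arithmetic. Only three numbers come from the abstract (55 patients, 24 reference-positive, 7 detected by ultrasound-guided culture); the false-positive and true-negative counts (2 and 29) are back-calculated assumptions chosen to match the reported specificity, PPV and NPV, and the function name is hypothetical.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard diagnostic accuracy measures from 2x2 table counts."""
    return {
        "sensitivity": tp / (tp + fn),   # detected among reference-positive
        "specificity": tn / (tn + fp),   # cleared among reference-negative
        "ppv": tp / (tp + fp),           # truly infected among test-positive
        "npv": tn / (tn + fn),           # truly uninfected among test-negative
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


# tp=7 and tp+fn=24 are from the abstract; fp=2 and tn=29 are ASSUMED counts
metrics = diagnostic_metrics(tp=7, fp=2, fn=17, tn=29)
percentages = {name: round(100 * value, 1) for name, value in metrics.items()}
```

With these assumed counts the sensitivity, specificity, PPV and NPV round to the reported 29.2%, 93.5%, 77.8% and 63.0%. The accuracy comes out marginally different from the reported 65.6%, which suggests the authors' underlying table may differ slightly from this reconstruction.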


Orthopaedic Proceedings
Vol. 99-B, Issue SUPP_6 | Pages 22 - 22
1 Mar 2017
Suchier Y Chollet M Lefebvre F
Full Access

Today, hip prostheses are validated with Standards for fatigue testing: the Standard ISO 7206-4 requires testing 6 components at 230 daN for 5 × 10^6 cycles without cracking. For the neck region of stemmed femoral components, the Standard ISO 7206-6 requires 6 tests at 534 daN for 10 × 10^6 cycles without cracking. But these tests don't provide any indication of the reliability level for implantation in a population. At the same time, the number of hip prosthesis implantations is growing, patients are implanted younger and younger, and they want to be able to maintain a “normal” life. As a result, the average “loading spectrum” is getting tougher and tougher, because prostheses are used differently than they were some years ago. On the other hand, there are new materials, new processes (additive manufacturing), and new methods (customized stems…) with no feedback on reliability. A method is therefore necessary to manage fatigue reliability for current and new products. The objective of this study is to establish a statistical distribution of the loading of hip prostheses in order to best define a minimum value of strength required for a good fatigue design. To define this strength, the Stress-Strength approach (well known in the automotive sector) is applied (fig 1). This approach allows a better assessment of the reliability in a population, depending on the mean strength and the scatter in fatigue. The first step is to establish the distribution of the loads for a hip prosthesis. Then, for a given risk level, the required strength can be defined, knowing the scatter of this strength. The strategy for obtaining the distribution is based on: in vivo load recordings on hip prostheses (found on Orthoload.com); analysis of the frequency of everyday activities; the activity level of different categories of the population; and the statistical distribution of key parameters, like weight, age….
All these data are collected from the literature, combined, and then processed with the software DEFFI®. The first goal is to assess the reliability reached by a “nominal” stem and compare it to the reliability described in implant registers. Another goal is to analyse the stress distribution and compare it to the standard requirements (ISO 7206-6), in order to assess the reliability of an implant that passed this standard. Lastly, this method is a way to define the minimum strength for implants dedicated to particular populations: young and active patients, patients with high body weight, etc. For any figures or tables, please contact authors directly (see Info & Metrics tab above).
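The Stress-Strength idea can be sketched in a few lines of Python. This is only an illustration under strong assumptions that the abstract does not state: load and strength are taken to be independent and normally distributed, and all numerical values and function names below are hypothetical. It does not reproduce the DEFFI software or the authors' load distribution.

```python
import math
from statistics import NormalDist


def stress_strength_reliability(mean_strength, sd_strength, mean_load, sd_load):
    """P(strength > load) for independent normal strength and load."""
    # The difference (strength - load) is normal with this mean and spread
    z = (mean_strength - mean_load) / math.hypot(sd_strength, sd_load)
    return NormalDist().cdf(z)


def required_mean_strength(target_reliability, sd_strength, mean_load, sd_load):
    """Mean strength needed to reach a target reliability, given both scatters."""
    z = NormalDist().inv_cdf(target_reliability)
    return mean_load + z * math.hypot(sd_strength, sd_load)


# Hypothetical values in daN, for illustration only
reliability = stress_strength_reliability(
    mean_strength=500.0, sd_strength=50.0, mean_load=300.0, sd_load=60.0
)
needed = required_mean_strength(
    target_reliability=0.999, sd_strength=50.0, mean_load=300.0, sd_load=60.0
)
```

The second function mirrors the step described in the text: for a given risk level and a known scatter of strength, it returns the mean strength that the design must reach. A real load spectrum built from in vivo recordings is unlikely to be normal, so a numerical integration over the empirical distribution would replace the closed-form expression.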


Orthopaedic Proceedings
Vol. 98-B, Issue SUPP_21 | Pages 31 - 31
1 Dec 2016
Younger A Penner M Glazebrook M Goplen G Daniels T Veljkovic A Lalonde K Wing K Dryden P Wong H
Full Access

Reoperations may be a better way of tracking adverse outcomes than complications. Repeat surgery causes cost to the system, and often indicates failure of the primary procedure, resulting in the patient not achieving the expected improvement in pain and function. Understanding the cause of repeat surgery at the primary site may result in design improvements to implants or improvements to fusion techniques, resulting in better outcomes in the future. The COFAS group have designed a reoperation classification system. The purpose of this study was to outline the inter- and intra-observer reliability of this classification scheme. To verify the inter- and intra-observer reliability of this new coding system, six fellowship-trained practicing foot and ankle Orthopaedic surgeons were asked to classify 62 repeat surgeries from a single surgeon's practice. The six surgeons read the operation reports in random order, and reread the reports 2 weeks later in a different order. Reliability was determined using intraclass correlation coefficients (ICC) and proportions of agreement. The agreement between pairs of readings (915 for inter-observer agreement for the first and second read: 61 readings with 15 comparisons, observer 1 with observer 2, observer 1 with observer 3, etc.) was determined by seeing how often each observer agreed. This was repeated for the 366 ratings for intra-observer readings (61 times 6). The inter-observer reliability on the first read had a mean intra-class correlation coefficient (ICC) of 0.89. The range for the 15 comparisons was 0.81 to 1.0. Amongst all 1830 paired codings between two observers, 1605 (88%) were in agreement. Across the 61 cases, 45 (74%) were given the same code by all six observers. However, the difference when present was larger with more observers not agreeing. The inter-observer reliability test on the second read had a mean ICC of 0.94, with a range of 0.90. There were 43 (72%) observations that were the same across all six observers. 
Of all pairs (915 in total) there was agreement in 804 pairs for the first reading (88%) and disagreement in 111 (12%). For the second reading there was agreement in 801 pairs (86%) and disagreement in 114 (14%). The intra-observer reliability averaged an ICC value of 0.92, with a range of 0.86 to 0.98. The observers agreed with their own previous observations 324 times out of 366 paired readings (89% agreement of pairs). The COFAS classification of reoperations for end stage ankle arthritis was reliable. This scheme potentially could be applied to other areas of Orthopaedic surgery and should replace the Clavien-Dindo modifications that do not accurately reflect Orthopaedic outcomes. As complications are hard to define and lack consistent terminology, reoperations and resource utilisation (extra clinic visits, extra days in hospital and extra hours of surgery) may be more reliable measures of the negative effects of surgery.
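The pair counts quoted in this abstract follow from simple combinatorics: six observers give 15 observer pairs per case, and 61 cases give 915 pairs per read. A short Python sketch of that bookkeeping is shown below; the function name and the two example cases are hypothetical, and only the totals checked at the end come from the abstract.

```python
from itertools import combinations


def pairwise_agreement(codes_by_case):
    """Count agreeing observer pairs and total pairs across all cases.

    `codes_by_case` holds one list per case with each observer's code.
    """
    agree = 0
    total = 0
    for codes in codes_by_case:
        for a, b in combinations(codes, 2):
            total += 1
            agree += a == b
    return agree, total


# Six observers form 15 unordered pairs
PAIRS_PER_CASE = len(list(combinations(range(6), 2)))
PAIRS_PER_READ = 61 * PAIRS_PER_CASE       # 915 pairs for each read
PAIRS_BOTH_READS = 2 * PAIRS_PER_READ      # 1830 pairs over two reads

# Hypothetical example: one unanimous case, one case with a single dissenter
example_agree, example_total = pairwise_agreement(
    [[1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 2]]
)
```

The reported agreements of 804 and 801 pairs add up to the 1605 of 1830 quoted for both reads combined, which is about 88%.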


Orthopaedic Proceedings
Vol. 103-B, Issue SUPP_3 | Pages 30 - 30
1 Mar 2021
Gerges M Eng H Chhina H Cooper A
Full Access

Bone age is a radiographical assessment used in pediatric medicine due to its relative objectivity in determining biological maturity compared to chronological age and size [1]. Currently, Greulich and Pyle (GP) is one of the most common methods used to determine bone age from hand radiographs [2–4]. In recent years, new methods have been developed to increase the efficiency of bone age analysis, such as the shorthand bone age (SBA) and automated artificial intelligence algorithms. The purpose of this study is to evaluate the accuracy and reliability of these two methods and examine if the reduction in analysis time compromises their accuracy. Two hundred thirteen males and 213 females were selected. Each participant had their bone age determined by two separate raters using the GP (M1) and SBA methods (M2). Three weeks later, the two raters repeated the analysis of the radiographs. The raters timed themselves using an online stopwatch while analyzing the radiograph on a computer screen. De-identified radiographs were securely uploaded to an automated algorithm developed by a group of radiologists in Toronto. The gold standard was determined to be the radiology report attached to each radiograph, written by experienced radiologists using GP (M1). For intra-rater variability, intraclass correlation analysis between trial 1 (T1) and trial 2 (T2) for each rater and method was performed. For inter-rater variability, intraclass correlation was performed between rater 1 (R1) and rater 2 (R2) for each method and trial. Intraclass correlation between each method and the gold standard fell within the 0.8–0.9 range, highlighting significant agreement. Most of the comparisons showed a statistically significant difference between the two new methods and the gold standard; however, it may not be clinically significant as it ranges between 0.25–0.5 years. 
A bone age is considered clinically abnormal if it falls outside 2 standard deviations of the chronological age; standard deviations are calculated and provided in the GP atlas [6–8]. For a 10-year-old female, 2 standard deviations constitute 21.6 months, which far outweighs the difference reported here between SBA, the automated algorithm and the gold standard. The median time for completion using the GP method was 21.83 seconds for rater 1 and 9.30 seconds for rater 2. In comparison, SBA required a median time of 7 seconds for rater 1 and 5 seconds for rater 2. The automated method had no time restraint as bone age was determined immediately upon radiograph upload. The correlation between the two trials in each method and rater (i.e. R1M1T1 vs R1M1T2) was excellent (κ = 0.9–1), confirming the reliability of the two new methods. Similarly, the correlation between the two raters in each method and trial (i.e. R1M1T1 vs R2M1T1) fell within the 0.9–1 range. This indicates a limited variability between raters who may use these two methods. The shorthand bone age method and an artificial intelligence automated algorithm produced values that are in agreement with the gold standard Greulich and Pyle, while reducing analysis time and maintaining a high inter-rater and intra-rater reliability.


Orthopaedic Proceedings
Vol. 98-B, Issue SUPP_4 | Pages 102 - 102
1 Jan 2016
Wada K Mikami H Oba K Yamamoto N Toki S Sairyo K
Full Access

Introduction. The aim of this study is to verify the intra-rater and inter-rater reliability of intra-operative kinematics by hand in TKA using a computer assisted image-free navigation system. Material and Methods. Total knee arthroplasty (TKA) was performed on the knees of twelve (12) patients with knee navigation by one surgeon. Patients were divided into two groups: Group A included six knees that were operated on with assistant A (senior joint surgeon); and Group B included the other six knees that were operated on with assistant B (resident). For each knee, axial rotation was evaluated three times by the operator and the assistant using a navigation system at 30°, 60°, 90°, 120° passive flexions by hand. Intra-class correlation coefficients (ICC) were calculated for each evaluation to examine intra-rater and inter-rater reliability. Results. For intra-rater reliability, ICC (1,1) was 0.965 (range: 0.951 to 0.978) for the operator, 0.949 (range: 0.898 to 0.974) for assistant A, and 0.987 (range: 0.982 to 0.992) for assistant B. For inter-rater reliability, ICC (2,1) was 0.947 (range: 0.937 to 0.975) between the operator and assistant A, and 0.949 (range: 0.887 to 0.989) between the operator and assistant B. The results demonstrated almost perfect intra-examiner and inter-examiner reliability (ICC>0.81) at each knee flexion angle. Conclusion. Intra-operative kinematic analysis by hand using a knee navigation system showed almost perfect reliability, not only intra-examiner but also inter-examiner. This result indicates that intra-operative kinematic analysis provides useful reference information.
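The two coefficients named above, ICC (1,1) and ICC (2,1), can be computed from analysis-of-variance mean squares using the standard Shrout and Fleiss definitions. The Python sketch below is illustrative only: the function name and the rotation values are hypothetical, and nothing here reproduces the study's measurements.

```python
def icc_single_measure(data):
    """Return (ICC(1,1), ICC(2,1)) for a table of n targets by k raters or trials."""
    n, k = len(data), len(data[0])
    grand = sum(sum(row) for row in data) / (n * k)
    row_means = [sum(row) / k for row in data]
    col_means = [sum(row[j] for row in data) / n for j in range(k)]

    # Sums of squares for the one-way and two-way decompositions
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)

    ms_rows = ss_rows / (n - 1)                       # between targets
    ms_within = (ss_total - ss_rows) / (n * (k - 1))  # within targets
    ms_cols = ss_cols / (k - 1)                       # between raters
    ms_error = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))

    icc_1_1 = (ms_rows - ms_within) / (ms_rows + (k - 1) * ms_within)
    icc_2_1 = (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )
    return icc_1_1, icc_2_1


# Hypothetical axial rotation readings (degrees): 4 flexion angles x 3 trials
example_rotation = [
    [2.0, 2.5, 2.0],
    [5.0, 5.5, 5.0],
    [8.5, 8.0, 8.5],
    [11.0, 11.5, 11.0],
]
example_icc_1_1, example_icc_2_1 = icc_single_measure(example_rotation)
```

ICC (1,1) treats the repeated trials as interchangeable, which suits the intra-rater comparison, while ICC (2,1) separates out a systematic rater effect, which suits the operator versus assistant comparison.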


Orthopaedic Proceedings
Vol. 94-B, Issue SUPP_XL | Pages 185 - 185
1 Sep 2012
Takao M Nishii T Sakai T Sugano N
Full Access

Introduction. Preoperative planning is an essential procedure for successful total hip arthroplasty. Many studies reported lower accuracy of two-dimensional analogue or digital templating for developmentally dysplastic hips (DDH). There have been few studies regarding the utility of three-dimensional (3D) templating for DDH. The aim of the present study is to assess the accuracy and reliability of 3D templating of cementless THA for hip dysplasia. Methods. We used 86 sets of 3D-CT data of 84 patients who underwent consecutive cementless THA using an anatomical stem and a rim-enlarged cup. There were six men and 78 women with a mean age of 58 years. The diagnosis was developmental dysplasia in 70 hips, osteonecrosis in 14 hips and primary osteoarthritis in 2 hips. There were 53 hips in Crowe group I, 11 hips in Crowe group II and 6 hips in Crowe group III. Each operator performed 3D templating prior to surgery using a planning workstation of a CT-based navigation system. Planned-versus-achieved accuracy was evaluated. The templating results were categorized as either exact size or +/− 1 size of implanted size. To assess the intra- and inter-planner reliabilities, 3D templating was performed twice, at an interval of one month, by two authors blinded to surgery. Kappa values were calculated. The accuracy and the intra- and inter-planner reliabilities were compared between the DDH group (70 hips) and the non-DDH group (16 hips). Results. There was no significant difference in accuracy of component sizes between the DDH group and the non-DDH group. The accuracy of templating for cup sizes was 76 % for the DDH group and 75 % for the non-DDH group (p=0.95). If accuracy was expanded to include all cups within one size of the implanted size, the accuracy was 97 % and 94 %, respectively (p=0.51). The accuracy of templating for stem sizes was 60 % for the DDH group and 75 % for the non-DDH group (p=0.27). The accuracy within 1 size was 99 % and 94 %, respectively (p=0.25). 
Regarding intra-planner reliability, mean kappa value for the cup size was 0.67 in the DDH group and 0.81 for the non-DDH group (p=0.18). Mean kappa value for the stem size was 0.64 in the DDH group and 0.79 for the non-DDH group (p=0.18). There were no significant differences in intra-planner reliability between the DDH and non-DDH group. Regarding inter-planner reliability for the cup size, mean kappa value was 0.33 in the DDH group and 0.37 in the non-DDH group (p=0.14). Mean kappa value for the stem size was 0.46 in the DDH group and 0.69 in the non-DDH group (p=0.07). There were no significant differences in inter-planner reliability between the DDH and non-DDH group. Conclusion. The 3D templating for cementless THA was accurate for hip dysplasia. Intra- and inter-planner reliabilities of the 3D templating were comparable with those of other primary diagnoses, while inter-planner reliability of cup sizes was fair regardless of diagnosis.


Orthopaedic Proceedings
Vol. 95-B, Issue SUPP_14 | Pages 57 - 57
1 Mar 2013
Firth G Robertson A Ramguthy Y Schepers A
Full Access

Purpose of Study. Multiple measurements have been described for the assessment of developmental dysplasia of the hip (DDH). In particular, the centre edge angle (CEA) has been described by Wiberg to assess the position of the femoral head in relation to the acetabular edge in patients over the age of five years. The purpose of this study is twofold. Firstly to assess the reliability of all measurements available in the literature and secondly to evaluate whether or not the CEA can be reliably measured below five years of age. Methods. Eighty seven patients were included for assessment. Radiographs were measured within six months of spica cast/Batchelor cast removal, depending on whether closed or open reduction was performed. A web based computer programme was used to store the radiographs electronically and with the help of an electronic template the following measurements were recorded: CEA, AI, centre head distance discrepancy ratio (CHDDR), Smith's c/b and h/b ratios. Three readers recorded measurements at two intervals, to determine intra and inter reader reliability. Results. The mean age at measurement was 2.26 years (Range 0.60–5.99). Regarding intra reader reliability, the AI and CEA were the most reliable measurements with a mean intraclass correlation coefficient (ICC) of 0.87 [CI 0.78–0.94] and 0.78 [CI 0.43–0.94] respectively. Regarding inter reader reliability, the CEA was the most reliable measurement with a mean ICC of 0.84 [CI 0.79–0.90]. Conclusion. This study confirms the reliability of the CEA, AI, CHDDR, Smith's c/b and h/b ratios in children with DDH. It also describes the reliable use of the CEA at a younger age in DDH than previously described which has prognostic implications. NO DISCLOSURES


Orthopaedic Proceedings
Vol. 98-B, Issue SUPP_21 | Pages 95 - 95
1 Dec 2016
Pathy R Dodwell E Green D Scher D Blanco J Doyle S Daluiski A Sink E
Full Access

There is currently no standardised complication grading classification routinely used for paediatric orthopaedic surgical procedures. The Clavien-Dindo classification used in general surgery was modified and validated in 2011 by Sink et al. and has been used regularly to classify complications following hip preservation surgery. The aim of this study was to adapt and validate Sink et al.'s modification of the Clavien-Dindo classification system for grading complications following surgical interventions of the upper and lower extremities and spine in paediatric orthopaedic patients. Sink et al.'s modification of the Clavien-Dindo classification system was further modified for paediatric orthopaedic procedures. The modified grading scheme was based on the treatment required to treat the complication and the long term morbidity of the complication. Grade I complications do not require deviation from standard treatment. Grade II complications deviate from the normal post-operative course and require outpatient treatment. Grade III complications require investigations, re-admission or re-operation. Grade IV complications are limb or life threatening or have a potential for permanent disability (IVa: with no long term disability and IVb: with long-term disability). Grade V complications result in death. Forty-five complication scenarios were developed. Seven paediatric orthopaedic surgeons were trained to use the modified system and they each graded the scenarios on two occasions. The scenarios were presented in a different random order each time they were graded. Fleiss' and Cohen's κ statistics were performed to test for inter-rater and intra-rater reliabilities, respectively. The overall Fleiss' κ value for inter-rater reliability was 0.772 (95% CI, 0.744–0.799). 
The weighted κ was 0.765 (95% CI, 0.703–0.826) for Grade I, 0.692 (95% CI, 0.630–0.753) for Grade II, 0.733 (95% CI, 0.671–0.795) for Grade III, 0.657 (95% CI, 0.595–0.719) for Grade IVa, 0.769 (95% CI, 0.707–0.83) for Grade IVb and 1.000 for Grade V (p value <0.001). The Cohen's κ value for intra-rater reliability was 0.918 (95% CI, 0.887–0.947). These tests show that the adapted classification system has high inter- and intra-rater reliabilities for grading complications following paediatric orthopaedic surgery. Given the high intra- and inter-rater reliability and simplicity of this system, adoption of this grading scheme as a standard of reporting complications in paediatric orthopaedic surgery could be considered. Since the evaluation of surgical outcomes should include the ability to reliably grade surgical complications, this reproducible, reliable system to assess paediatric surgical complications will be a valuable tool for improving surgical practices and patient outcomes.


Orthopaedic Proceedings
Vol. 95-B, Issue SUPP_34 | Pages 293 - 293
1 Dec 2013
Dossett HG
Full Access

The development of the High Reliability Organization focused on safety in organizations such as nuclear power plants, to avoid catastrophes in an environment where accidents might be expected due to risk factors and complexity. (Figure 1) The Agency for Healthcare Research and Quality applied High Reliability Concepts to hospitals in an effort to improve safety and quality. The Institute for Healthcare Improvement has further expanded this approach to include establishing processes to ensure highly reliable care through analysis, design or redesign, using a model for improvement, and supported by technology and the physical environment. These concepts can be applied to total knee replacement by identifying key processes, conducting regular measurement and analysis, and ensuring daily problem solving to create and maintain process reliability. The application of patient specific technology to our conventional total knee replacement procedures creates an opportunity to improve both quality and safety in total knee replacement procedures. Preoperative imaging and use of computer software allows the surgeon to develop an individual blueprint for each operative procedure. A patient specific cutting guide is fabricated for use in surgery. Intra-operative measurement of bone cuts with comparison to the planned blueprint allows correction of inaccurate bone cuts during surgery. Post operative CT scanning provides a final accurate check of limb, knee and implant alignment in 3 dimensions, with comparison to the preoperative plan. Feedback from the surgeon to the engineers involved in the planning process allows daily improvement of the guide fit, cut accuracy and accuracy of limb, knee and implant alignment for these procedures. Patient reported outcome measures such as the Oxford Knee Score or WOMAC score can be carried out preoperatively and at 6 months post op, to assess reduction of pain and functional improvements resulting from the operative procedure. 
Ongoing annual patient surveillance using the 12 questions on the Oxford Knee Score, one question about satisfaction, and one question asking if the patient has undergone further surgery on the operative knee, can help assess the durability of the patient outcomes and the longevity of the prosthesis. Use of patient specific cutting guides, coupled with preoperative software for planning a kinematically aligned TKA, has demonstrated improved RCT outcomes at the Phoenix VA. Figure 2 compares the distribution of WOMAC scores for kinematically aligned and mechanically aligned TKA. Individualizing the alignment for each patient has narrowed the distribution of the scores, with 87% of the kinematically aligned scores better than the median score for mechanically aligned patients. There have been additional recent preoperative, perioperative and postoperative processes and checklists designed to increase quality and safety of TKA. Medical team training for preoperative briefing and post operative debriefing, use of the AAOS new STEPPS training program, monitoring post operative results with the NSQIP/VASQIP program and database give us additional tools to improve safety and quality. Coupled with patient specific alignment technology, I believe we currently have an excellent opportunity to move toward High Reliability in total knee replacement


Orthopaedic Proceedings
Vol. 96-B, Issue SUPP_13 | Pages 21 - 21
1 Sep 2014
Steck H Robertson A
Full Access

Background. The gold standard of care of clubfoot is the Ponseti method of serial manipulation and casting, followed by percutaneous tendo-achilles tenotomy. In our setting, registrars work in district hospitals where they run Ponseti clubfoot clinics with little or no specialist supervision. They use the Pirani score to serially assess improvement of the deformity during casting and to determine whether the foot is ready for tenotomy. Purpose of Study. To test the inter-observer reliability of the Pirani score, and whether it can be used by non-specialist doctors running Ponseti clubfoot clinics. Methods. Ethics permission was obtained from our institution. This is a prospective study where patients under the age of one year with idiopathic clubfoot were recruited from clubfoot clinics at our institution, over a period of four months. Following a training session using the original description of the score, each foot was independently assessed using the Pirani score by two paediatric orthopaedic surgeons, two orthopaedic registrars and two medical officers. The inter-observer reliability was assessed using the Fixed-marginal Kappa statistic and Percentage agreement. The first 15 feet were used as a learning curve, and hence excluded from final analysis. Results. 73 feet in 37 patients with idiopathic clubfoot (25 boys, 12 girls) under the age of 1 year were included in the study. The Kappa statistic and percentage agreement for the six variables of the Pirani score were determined. Whilst the overall agreement was determined by the Kappa statistic to be slight to fair, the two consultants were found to have a higher inter-observer reliability than the registrars and medical officers. Conclusion. Our results conflict with previously published studies in that the inter-observer reliability of the Pirani score was poor. In addition, we feel that this score cannot be reliably used by non-specialist doctors running Ponseti clubfoot clinics. NO DISCLOSURES
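As an illustration of the two agreement measures named in the Methods, the sketch below computes a multirater kappa and a pairwise percentage agreement. It assumes that the fixed-marginal kappa is the Fleiss formulation, which is the usual reading of the term. The function names and the example scores are hypothetical and are not data from the study.

```python
from collections import Counter
from itertools import combinations


def fleiss_kappa(ratings):
    """Fleiss (fixed-marginal) kappa.

    ratings: one list per subject (here, per foot), holding one category
    label per rater. Every subject must have the same number of raters.
    Undefined (division by zero) if every rating is the same category.
    """
    n_subjects = len(ratings)
    n_raters = len(ratings[0])
    counts = [Counter(row) for row in ratings]
    categories = sorted({label for row in ratings for label in row})

    # Observed agreement: share of agreeing rater pairs, averaged over subjects.
    p_obs = sum(
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ) / n_subjects

    # Chance agreement from the pooled category proportions (fixed marginals).
    p_cat = [sum(c[cat] for c in counts) / (n_subjects * n_raters) for cat in categories]
    p_exp = sum(p ** 2 for p in p_cat)

    return (p_obs - p_exp) / (1 - p_exp)


def percentage_agreement(ratings):
    """Percentage of rater pairs, pooled over all subjects, that agree."""
    pairs = [a == b for row in ratings for a, b in combinations(row, 2)]
    return 100 * sum(pairs) / len(pairs)


# Hypothetical scores for one Pirani item (0, 0.5 or 1) from three raters.
example = [[0, 0, 0], [1, 1, 0.5], [0.5, 0.5, 0.5], [0, 1, 0.5]]
print(round(fleiss_kappa(example), 3), round(percentage_agreement(example), 1))
```

With the same number of raters for every subject, the percentage agreement equals the observed-agreement term of kappa; kappa then discounts the part of that agreement expected by chance.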


Orthopaedic Proceedings
Vol. 95-B, Issue SUPP_1 | Pages 64 - 64
1 Jan 2013
Smith T Shakokani M Cogan A Patel S Toms A Donell S
Full Access

Background. Patellar instability is a complex, multi-factorial disorder. Radiological assessment is regarded as an important part of the management of this population. The purpose of this study was to determine the intra- and inter-rater reliability of common radiological measurements used to evaluate patellar instability. Methods. One hundred and fifty x-rays from 51 individuals were reviewed by five reviewers: two orthopaedic trainees, a radiological trainee, a consultant radiologist and an orthopaedic physiotherapist. Radiological measurements assessed included patellar shape, sulcus angle, congruence angle, lateral patellofemoral angle (LPA), lateral patellar displacement (LPD), lateral displacement measurement (LDM), boss height, and patellar height ratios (Caton-Deschamps, Blackburne-Peel, Insall-Salvati). All assessors were provided with a summary document outlining the method of assessing each measurement. Bland-Altman analyses were adopted to assess intra- and inter-rater reliability. Results. The results indicated generally low measurement error on intra-rater reliability assessment, particularly for LPD (within-subject variance 0.7mm to 3.7mm), LDM (0.7mm to 3.5mm) and boss height (0.4mm to 1.6mm) for all assessors. There was greater measurement error for the calculation of sulcus angle (0.7° to 10.6°), congruence angle (0.8° to 18.4°) and LPA (0.8° to 16.5°). Whilst the inter-rater reliability between assessors indicated a low mean difference for assessments of patellar height measurements (0.0° to 0.6°), there was greater variability for LPA (0.1° to 3.6°), LPD (0.2mm to 4.6mm) and LDM (0.1mm to 4.0mm), with wide 95% limits of agreement for all measurements indicating poor precision. Conclusions. Many of the standard measurements used to assess the patellofemoral joint on plain radiographs have poor precision. 
Intra-rater reliability may be related to experience, but it seems likely that to achieve good inter-rater reliability, specific training may be required to calibrate observers. More formal training in the technique of radiological measurement for those who were inexperienced might have improved the inter-rater reliability.
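For readers unfamiliar with the Bland-Altman approach used above, a minimal sketch follows. It computes the mean difference (bias) between two sets of paired measurements and the conventional 95% limits of agreement, bias ± 1.96 standard deviations of the differences. The function name and the example values are hypothetical, and the study's own analysis may have differed in detail.

```python
from statistics import mean, stdev


def bland_altman(first, second):
    """Return (bias, lower limit, upper limit) for paired measurements.

    first, second: equal-length sequences, e.g. one rater's two readings of
    the same radiographs, or two raters' readings of the same radiographs.
    """
    diffs = [a - b for a, b in zip(first, second)]
    bias = mean(diffs)        # mean difference between the two series
    spread = stdev(diffs)     # sample standard deviation of the differences
    return bias, bias - 1.96 * spread, bias + 1.96 * spread


# Hypothetical lateral patellar displacement readings in mm from two raters.
rater_a = [10, 12, 14, 16]
rater_b = [9, 13, 13, 17]
print(bland_altman(rater_a, rater_b))
```

A small bias with wide limits, as the abstract reports, means that raters agree on average while individual readings can still differ by a clinically relevant amount.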


Orthopaedic Proceedings
Vol. 94-B, Issue SUPP_XXI | Pages 83 - 83
1 May 2012
D.R. G
Full Access

We present a hip mapping system to describe chondral lesions within the hip, and an assessment of its inter-observer reliability and ease of use. The mapping system divides the acetabular articular surface into ten zones (five inner and five outer) and the femoral head into five zones using easily identifiable features. This study was performed by six surgeons during hip arthroscopy of 60 patients. During each operation, one of the surgeons identified up to three small intra-articular features to several (one, two or three) of the other five surgeons. Each surgeon examined the hip independently without discussion and recorded the locations on a hip map. If two surgeons had observed a point, this provided one pair to assess agreement; three or four surgeons provided three or six pairs respectively. Each observation of a point by a pair of surgeons (a point-pair) provided one opportunity for assessment of agreement. One hundred and fifty-four points were mapped by two, three or four surgeons, giving 353 point-pairs for assessment. In 325 cases (92%), the pair of surgeons were in agreement, designating the point as within the same zone. On 28 (8%) occasions, there was disagreement but always across a boundary between adjacent zones. Disagreements were more common about points on the femoral head (15) than on the acetabulum (13). Disagreements on the acetabulum occurred equally at each radial boundary but only rarely between inner and outer acetabular zones. All surgeons reported that they found the system easy to use. There was no difference in the level of disagreement between more and less experienced surgeons, nor was there a learning effect with time. Inter-observer reliability of this mapping system was 92%, supporting the use of a zone based mapping system in clinical practice. This map shows a good balance between precision and reliability.
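The point-pair counting described above is the number of ways to choose two observers from those who saw a point. The short sketch below reproduces that arithmetic and the agreement percentage from the totals stated in the abstract. The variable and function names are illustrative only.

```python
from math import comb


def point_pairs(n_observers):
    """Number of distinct surgeon pairs that observed the same point."""
    return comb(n_observers, 2)


# Two, three or four observers give one, three or six pairs respectively.
print([point_pairs(n) for n in (2, 3, 4)])

# Totals reported in the abstract.
total_pairs = 353
agreeing_pairs = 325
disagreeing_pairs = total_pairs - agreeing_pairs

agreement_pct = 100 * agreeing_pairs / total_pairs
disagreement_pct = 100 * disagreeing_pairs / total_pairs
print(disagreeing_pairs, round(agreement_pct), round(disagreement_pct))
```

Note that this percentage is raw agreement; it is not corrected for chance in the way a kappa statistic is.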


Orthopaedic Proceedings
Vol. 101-B, Issue SUPP_4 | Pages 6 - 6
1 Apr 2019
Wilson C Singh V
Full Access

Introduction. The intra-operative diagnosis of Prosthetic Joint Infection (PJI) is a dilemma requiring intra-operative sampling of suspicious tissues for frozen section, deep tissue culture and histopathology to secure a diagnosis. Alfa defensin-1 testing has been introduced as a quick and reliable test for confirming or ruling out PJI. This study aims to assess its intra-operative reliability compared to the standard tests. Methods. Twenty patients who underwent revision hip and knee arthroplasty surgery were included. Patients' joint aspirate was tested intra-operatively with the Synovasure kit, which takes approximately ten minutes for a result. Our standard protocol of collecting 5 deep tissue samples for culture and one sample for histopathology was followed. Results of the Alfa defensin-1 test were then compared with final culture and histopathology results in all these patients. Results. Our results show an excellent correlation with the final deep tissue cultures and histopathology outcomes. Literature reports frozen section to have low (58–73%) sensitivity but high (96%) specificity. Conclusions. The Alfa defensin-1 test is easy, quick and efficient; results were available immediately intra-operatively. Cryosection is time consuming, with samples shipped to the reference laboratory, at times resulting in intra-operative delays. In our practice the Alfa defensin-1 test will certainly replace frozen section for intra-operative testing.


Orthopaedic Proceedings
Vol. 94-B, Issue SUPP_II | Pages 86 - 86
1 Feb 2012
McCarthy M Grevitt M Silcocks P Hobbs G
Full Access

The NDI is a simple 10-item questionnaire used to assess patients with neck pain. The original validation was performed on 52 patients with neck pain and the test-retest on 17 whiplash patients with a 2-day interval. The SF36 measures functional ability, wellbeing and the overall health of patients. It is used in health economics to assess the health utility, gain and economic impact of medical interventions. Objectives were to independently validate the NDI in patients with neck pain and to draw comparison between the NDI and SF36. 160 patients with neck pain attending the spinal clinic completed self-assessment questionnaires. A second questionnaire was completed by 34 patients after a period of 1-2 weeks. The internal consistency of the NDI and SF36 was calculated using Cronbach alpha. The test-retest reliability was assessed using the Bland and Altman method and the concurrent validity between the two questionnaires was assessed using Pearson correlation. Both questionnaires showed robust internal consistency: SF36 alpha = 0.878 (se=0.014, 95%CI=0.843 to 0.906) and NDI alpha = 0.864 (se=0.017, 95%CI=0.825 to 0.894). The NDI had significant correlation to all eight domains of the SF36 (p<0.001). The individual scores for each of the ten items had significant correlation with the total disability score (p<0.001). The test-retest reliability of the NDI was acceptable. We have shown irrefutably that the NDI has good reliability and validity and that it stands up well to the SF36.
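The internal-consistency figures above are Cronbach alpha values. As a minimal sketch, alpha can be computed from the standard formula alpha = k/(k-1) × (1 − sum of item variances / variance of total scores), where k is the number of items. The function name and the tiny two-item data set are hypothetical and only show the mechanics; they are not NDI or SF36 data.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach alpha for a questionnaire.

    scores: one list per respondent, holding one score per item.
    Uses sample variances for both the items and the total scores.
    """
    n_items = len(scores[0])
    item_columns = list(zip(*scores))                 # regroup scores by item
    item_variances = [variance(col) for col in item_columns]
    total_variance = variance([sum(row) for row in scores])
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses: three respondents, two items.
consistent = [[1, 1], [2, 2], [3, 3]]       # items move together
less_consistent = [[1, 2], [2, 1], [3, 3]]  # items partly disagree
print(cronbach_alpha(consistent), round(cronbach_alpha(less_consistent), 3))
```

Values near 1 indicate that the items vary together, which is the sense in which both questionnaires are described as internally consistent.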


Orthopaedic Proceedings
Vol. 98-B, Issue SUPP_3 | Pages 92 - 92
1 Jan 2016
Noble P Noel C
Full Access

INTRODUCTION. The timely identification of outliers (implants, surgeons or patients) using prospectively collected registry data is confounded by many factors, including the assumption that the sampled population is representative of the entire cohort of patients. In this study we utilized a computer simulation of a joint registry to address the question: How does incomplete enrollment of patients in registries affect the reliability of identification of outliers, and what percent capture of the target population is sufficient? MATERIALS AND METHODS. A synthetic registry was created consisting of 10,000 patients (100 surgeons), of whom 1000 underwent joint replacement using a new implant. A predictive model for the risk of revision was created from data published by the Swedish TKR Registry and the AOANJRR. The pairing of patients, surgeons and implants was randomized and for each assignment, the probability of revision was computed. We then chose random samples of all patients in 10% increments from 10% to 100%, simulating incomplete capture of all potential cases by the registry. For each sample we calculated the number of cases of the new implant predicted to end in revision. The assignments were repeated 2000 times using implants with revision rates of 1.5%, 2.0% and 3.0% per annum vs. 1.0% for all other implants of the same class. RESULTS. The observed failure rate of the new implant averaged 2.0%, but varied from 0.7–3.8% over the 2000 trials, with 100% enrollment. With only 10% enrollment, the spread of failure rates increased to 0.0–7.8%, corresponding to a 152% increase in the variability of the observed revision rate. When enrollment was increased from 80% to 100%, the variability of the failure rate changed by only 9% from a range of 1.63% (1.23–2.86%) to 1.50% (1.30–2.80%) (90% CI). The reliability of detection of poorly performing implants improved dramatically with enrollment. 
With 70% enrollment, an implant with a 2.0% failure rate could be detected with 95% confidence, while a 3.0% implant became apparent with only 21% enrollment. Conversely, with even 100% enrollment it was not possible to identify implants with a 1.5% annual failure rate with 95% confidence. CONCLUSIONS. If registries collect a truly representative sample of only 50–80% of the total patient population, there will be only a slight increase in the risk of overlooking an inferior outlier, including poorly-performing implants, compared to 100% patient capture. Our results suggest that enrollment of every patient receiving a given treatment is not nearly as important as randomization of the sample subjected to analysis.
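The effect of partial enrollment on the spread of observed revision rates can be illustrated with a much simpler Monte Carlo sketch than the one the authors describe. The version below assumes that every recipient of the new implant has the same independent probability of revision; it does not reproduce the predictive model built from the Swedish TKR Registry and AOANJRR data, nor the surgeon and patient factors. All names, the seed and the default parameters are assumptions for illustration.

```python
import random


def simulate_observed_rates(n_new=1000, true_rate=0.02, enrollment=1.0,
                            trials=2000, seed=0):
    """Observed revision rate of the new implant in each simulated registry.

    Simplifying assumption: each enrolled case is revised independently with
    probability true_rate, so a random sample of the cohort is modelled as
    n_new * enrollment independent draws.
    """
    rng = random.Random(seed)
    n_sampled = max(1, round(n_new * enrollment))
    rates = []
    for _ in range(trials):
        revisions = sum(rng.random() < true_rate for _ in range(n_sampled))
        rates.append(revisions / n_sampled)
    return rates


def spread(rates):
    """Width of the range of observed rates across trials."""
    return max(rates) - min(rates)


if __name__ == "__main__":
    for fraction in (0.1, 0.5, 0.8, 1.0):
        rates = simulate_observed_rates(enrollment=fraction, trials=500)
        print(f"{fraction:.0%} enrollment: "
              f"{min(rates):.1%} to {max(rates):.1%} observed")
```

Under these assumptions the range narrows quickly as enrollment rises and changes little between high enrollment fractions, which is the qualitative pattern reported in the Results; the exact figures will differ from the abstract's.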


Orthopaedic Proceedings
Vol. 105-B, Issue SUPP_3 | Pages 37 - 37
23 Feb 2023
van der Gaast N Huitema J Brouwers L Edwards M Hermans E Doornberg J Jaarsma R
Full Access

Classification systems for tibial plateau fractures suffer from poor interobserver agreement, and their value in preoperative assessment to guide surgical fixation strategies is limited. For tibial plateau fractures four major characteristics are identified: lateral split fragment, posteromedial fragment, anterior tubercle fragment, and central zone of comminution. These fracture characteristics support preoperative assessment of fractures and guide surgical decision-making as each specific component requires a respective fixation strategy. We aimed to evaluate the additional value of 3D-printed models for the identification of tibial plateau fracture characteristics in terms of the interobserver agreement on different fracture characteristics.

Preoperative images of 40 patients were randomly selected. Nine trauma surgeons, eight senior and eight junior registrars indicated the presence or absence of four fracture characteristics with and without 3D-printed models. The Fleiss kappa was used to determine interobserver agreement for fracture classification, and the Landis and Koch criteria were used for interpretation.

3D-printed models led to a categorical improvement in interobserver agreement for three of four fracture characteristics: lateral split (Kconv = 0.445 versus K3Dprint = 0.620; P < 0.001), anterior tubercle fragment (Kconv = 0.288 versus K3Dprint = 0.449; P < 0.001) and zone of comminution (Kconv = 0.535 versus K3Dprint = 0.652; P < 0.001).

The overall interobserver agreement improved for three of four fracture characteristics after the addition of 3D printed models. For two fracture characteristics, lateral split and zone of comminution, a substantial interobserver agreement was achieved.

Fracture characteristics seem to be a more reliable way to assess tibial plateau fractures than the commonly used classification systems, and one should consider including them in the preoperative assessment of these fractures.
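The "categorical improvement" reported above refers to a kappa value moving from one Landis and Koch band to the next. A minimal sketch of that banding follows, using the commonly cited cut-offs (below 0 poor, up to 0.20 slight, up to 0.40 fair, up to 0.60 moderate, up to 0.80 substantial, above that almost perfect). The function name is illustrative; the kappa values are those stated in the abstract.

```python
def landis_koch(kappa):
    """Map a kappa value to its Landis and Koch agreement category."""
    if kappa < 0:
        return "poor"
    bands = [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"), (0.80, "substantial")]
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "almost perfect"


# Kappa without and with 3D-printed models, as reported in the abstract.
reported = {
    "lateral split": (0.445, 0.620),
    "anterior tubercle fragment": (0.288, 0.449),
    "zone of comminution": (0.535, 0.652),
}
for name, (conventional, printed) in reported.items():
    print(f"{name}: {landis_koch(conventional)} -> {landis_koch(printed)}")
```

Each of the three characteristics moves up exactly one band, and two of them end in the substantial band, consistent with the text.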


Orthopaedic Proceedings
Vol. 95-B, Issue SUPP_15 | Pages 277 - 277
1 Mar 2013
Nagamine R Hirokawa S Todo M Weijia C Kondo K
Full Access

Introduction. Reliability of a gap control technique with the tensor/balancer during PS-TKA was assessed by means of fluoroscopic images after TKA. Methods. Thirty-one subjects were selected for assessment. The mean age of the subjects was 73.0 years. During PS-TKA, a parapatellar approach was used. Cruciate ligaments were excised, and distal femoral and proximal tibial cuts were made. After all osteophytes were removed, the joint gap angle and distance were measured in full extension and at 90° flexion using a tensor/balancer. Medial soft tissue releases were performed and soft tissue balancing was obtained in full extension so that the joint gap angle was 3° or less. The joint gap angle and distance between femoral and tibial cut surfaces in full extension, and between a tangent to the posterior femoral condyles and tibial cut surface at 90° flexion were measured. The external rotation angle of the anterior and posterior cuts of the femur was decided based on the joint gap angle at 90° flexion. The size of the femoral component was decided based on the joint gap distance in full extension and at 90° flexion. Then only the trial femoral component was inserted. The joint gap angle and distance between the tangent to the condyles of the trial femoral component and tibial cut surface in full extension and at 90° flexion were measured. More than one month after TKA, fluoroscopic images of the prostheses were taken during knee extension/flexion. Then, a torque of about 5 Nm was applied to the lower leg in order to assess the varus/valgus flexibility during flexion. The pattern matching method was used to measure the 3D movements of the prostheses from the fluoroscopic images. The joint gap angle was calculated in full extension and at 90° flexion. The varus/valgus flexibility at each flexion angle was also assessed. Results. During TKA, the mean joint gap angle was 0.9° varus in full extension, and was 0.3° valgus at 90° flexion. 
The mean difference of the gap distance between extension and flexion was 2.3 mm. The results from fluoroscopic images showed that the mean joint gap angle was 0.1° valgus in extension, and was 0.6° varus at 90° flexion. The mean joint gap angle in full extension and at 90° flexion was less than 1° both during TKA and after TKA. The mean varus/valgus flexibility in the implanted knees was 1.6° in full extension, and was 3.9° at 90° flexion. Discussion. The results showed that the joint gap was almost rectangular in both extension and flexion, both during TKA and after TKA. The tensor/balancer, with a load of 30 inch-pounds, was reliable during PS-TKA. Muscle function had recovered and the implanted knees might be stable. However, the results of this study clearly showed the theoretical ground for the reliability of the tensor/balancer during TKA. Conclusion. During PS-TKA by means of the gap control technique, the tensor/balancer with 30 inch-pounds can provide reliable joint gap angle and distance.