
Aims. Classifying trochlear dysplasia (TD) is useful to determine the treatment options for patients suffering from patellofemoral instability (PFI). There is no consensus on which classification system is more reliable and reproducible for the purpose of guiding clinicians’ management of PFI. There are also concerns about the validity of the Dejour Classification (DJC), which is the most widely used classification for TD, having only a fair reliability score. The Oswestry-Bristol Classification (OBC) is a recently proposed system of classification of TD, and the authors report a fair-to-good interobserver agreement and good-to-excellent intraobserver agreement in the assessment of TD. The aim of this study was to compare the reliability and reproducibility of these two classifications. Methods. In all, six assessors (four consultants and two registrars) independently evaluated 100 axial MRIs of the patellofemoral joint (PFJ) for TD and classified them according to OBC and DJC. These assessments were again repeated by all raters after four weeks. The inter- and intraobserver reliability scores were calculated using Cohen’s kappa and Cronbach’s α. Results. Both classifications showed good to excellent interobserver reliability with high α scores. The OBC classification showed a substantial intraobserver agreement (mean kappa 0.628; p < 0.005) whereas the DJC showed a moderate agreement (mean kappa 0.572; p < 0.005). There was no significant difference in the kappa values when comparing the assessments by consultants with those by registrars, in either classification system. Conclusion. This large study from a non-founding institute shows both classification systems to be reliable for classifying TD based on axial MRIs of the PFJ, with the simple-to-use OBC having a higher intraobserver reliability score than that of the DJC. Cite this article: Bone Jt Open 2023;4(7):532–538
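The intraobserver figures above are kappa values between two readings of the same images by the same rater. A minimal sketch of that calculation follows, assuming each reading is a plain list of category labels; the function names and the sample grades are illustrative and not taken from the study. The verbal labels use the Landis and Koch bands, under which 0.628 reads as substantial and 0.572 as moderate, matching the wording of the abstract.

```python
from collections import Counter


def cohen_kappa(first, second):
    """Cohen's kappa for two equal-length lists of categorical labels."""
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty readings of equal length")
    n = len(first)
    # Proportion of cases on which the two readings agree.
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Agreement expected by chance from the marginal label frequencies.
    c1, c2 = Counter(first), Counter(second)
    expected = sum(c1[k] * c2[k] for k in c1) / (n * n)
    if expected == 1.0:
        return 1.0  # both readings use one identical category throughout
    return (observed - expected) / (1 - expected)


def landis_koch(kappa):
    """Verbal label for a kappa value using the Landis and Koch bands."""
    if kappa < 0:
        return "poor"
    bands = [(0.20, "slight"), (0.40, "fair"),
             (0.60, "moderate"), (0.80, "substantial")]
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical grades for four images read twice by one rater.
week_0 = ["A", "A", "B", "B"]
week_4 = ["A", "B", "B", "B"]
print(cohen_kappa(week_0, week_4), landis_koch(0.628), landis_koch(0.572))
```

A mean kappa such as the one reported would then be the average of this value over the six raters, if the authors averaged per-rater values; the abstract does not state how the mean was formed.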


Orthopaedic Proceedings
Vol. 105-B, Issue SUPP_4 | Pages 3 - 3
3 Mar 2023
Roy K Joshi P Ali I Shenoy P Syed A Barlow D Malek I Joshi Y
Full Access

Classifying trochlear dysplasia (TD) is useful to determine the treatment options for patients suffering from patellofemoral instability (PFI). There is no consensus on which classification system is more reliable and reproducible for guiding clinicians' treatment of PFI. There are also concerns about the validity of the Dejour classification (DJC), which is the most widely used classification for TD, having only a fair reliability score. The Oswestry-Bristol classification (OBC) is a recently proposed system of classification of TD, and the authors report a fair-to-good interobserver agreement and good-to-excellent intra-observer agreement in the assessment of TD. The aim of this study was to compare the reliability and reproducibility of these two classifications. Six assessors (4 consultants and 2 registrars) independently evaluated 100 magnetic resonance axial images of the patellofemoral joint for TD and classified them according to OBC and DJC. These assessments were repeated by all raters after 4 weeks. The inter- and intra-observer reliability scores were calculated using Cohen's kappa and Cronbach's alpha. Both classifications showed good to excellent interobserver reliability with high alpha scores. The OBC showed a substantial intra-observer agreement (mean kappa 0.628; p<0.005) whereas the DJC showed a moderate agreement (mean kappa 0.572; p<0.005). There was no significant difference in the kappa values when comparing the assessments by consultants with those by registrars, in either classification system. This large study from a non-founding institute shows both classification systems to be reliable for classifying TD based on magnetic resonance axial images of the patellofemoral joint, with the simple-to-use OBC having a higher intra-observer reliability score than the DJC.
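The interobserver result is reported as Cronbach's alpha with the raters treated as items. The sketch below shows the standard formula, alpha = k/(k-1) × (1 − sum of per-rater variances / variance of per-case totals), assuming the grades have been coded as numbers; that coding, the function name and the sample data are assumptions for illustration, since the abstract does not describe how grades were encoded.

```python
from statistics import variance


def cronbach_alpha(ratings_by_rater):
    """Cronbach's alpha with raters as items.

    ratings_by_rater: one list of numeric grades per rater, all covering
    the same cases in the same order.
    """
    k = len(ratings_by_rater)
    if k < 2:
        raise ValueError("need at least two raters")
    n = len(ratings_by_rater[0])
    if n < 2 or any(len(r) != n for r in ratings_by_rater):
        raise ValueError("each rater must grade the same two or more cases")
    # Sum of each rater's own variance across cases.
    item_variance = sum(variance(r) for r in ratings_by_rater)
    # Variance of the per-case totals across raters.
    totals = [sum(case) for case in zip(*ratings_by_rater)]
    total_variance = variance(totals)
    if total_variance == 0:
        raise ValueError("case totals do not vary; alpha is undefined")
    return k / (k - 1) * (1 - item_variance / total_variance)


# Hypothetical numeric grades from two raters on four images.
print(cronbach_alpha([[1, 2, 3, 4], [2, 2, 3, 5]]))
```

Alpha rises towards 1 as the raters' grades move together across cases, which is why identical raters give exactly 1 in this formula.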


Orthopaedic Proceedings
Vol. 91-B, Issue SUPP_II | Pages 215 - 215
1 May 2009
Qureshi AA Roberts A
Full Access

Aim: To assess the interobserver reliability of the Sauvegrain skeletal age assessment. Methods and Results: Elbow radiographs requested to exclude injury were anonymised. Sixteen examinations were assessed by ten independent orthopaedic specialist registrars or consultants. The Sauvegrain method as modified by Dimeglio was used to score the radiographs. The observations made were then assessed for interobserver reliability by means of a multiple-observer kappa score, and the total scores by intra-class correlation coefficient. Kappa scores for the components of the score were 0.403 for the lateral condyle, 0.492 for the trochlea, 0.354 for the proximal radius and 0.508 for the olecranon. Adding item scores to produce a modified Sauvegrain score had an intra-class reliability of 0.858 (95% CI 0.758 to 0.935). Conclusions: Methods of identifying skeletal maturation and predicting future growth generally depend on the use of an atlas of hand radiographs. Difficulties with poor interobserver reliability associated with these methods have led to a move towards assessments that do not depend upon bone age estimations. Unfortunately, plans based on ratios of growth or average patterns produce errors when unusual types of growth disturbance are present. We conclude that use of a scoring system for maturation assessed by elbow radiographs offers a significant advantage when substituted into the straight-line method of growth prediction. The Sauvegrain method as modified by Dimeglio [1] has demonstrated an excellent level of interobserver reliability. We have used Sauvegrain scores to improve the accuracy of timing when using the Moseley straight-line method [3].
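The abstract does not say which multiple-observer kappa was used. Fleiss' kappa is one common choice for a fixed number of raters per case, and the sketch below uses it purely as an assumed stand-in; the function name and sample labels are illustrative. For this study the input would have 16 rows (radiographs) of 10 labels (observers) for each component of the score.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa. ratings: one list of labels per case, same length each."""
    n_cases = len(ratings)
    if n_cases == 0:
        raise ValueError("need at least one case")
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(row) != n_raters for row in ratings):
        raise ValueError("every case needs the same number (>= 2) of ratings")
    category_totals = Counter()
    agreement_sum = 0.0
    for row in ratings:
        counts = Counter(row)
        category_totals.update(counts)
        # Share of rater pairs that agree on this case.
        agreeing = sum(c * c for c in counts.values()) - n_raters
        agreement_sum += agreeing / (n_raters * (n_raters - 1))
    mean_agreement = agreement_sum / n_cases
    # Chance agreement from the overall share of each category.
    total = n_cases * n_raters
    chance = sum((c / total) ** 2 for c in category_totals.values())
    if chance == 1.0:
        return 1.0  # only one category was ever used
    return (mean_agreement - chance) / (1 - chance)


# Hypothetical labels: two cases, three raters, complete agreement.
print(fleiss_kappa([["a", "a", "a"], ["b", "b", "b"]]))
```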


Orthopaedic Proceedings
Vol. 94-B, Issue SUPP_I | Pages 49 - 49
1 Jan 2012
Brunse M Stochkendahl M Vach W Kongsted A Poulsen E Hartvigsen J Christensen H
Full Access

Background and purpose. The musculoskeletal system is recognized as a possible source of pain in patients with chest pain. The objectives of the present study were (1) to investigate the interobserver reliability of an overall diagnosis of musculoskeletal chest pain using a standardized examination protocol in a cohort of patients with chest pain suspected to be of non-cardiac origin, (2) to investigate the interobserver reliability of the single components of the protocol, and finally, (3) to investigate the importance of clinical experience on the level of interobserver reliability. Methods and results. Eighty patients with acute chest pain were recruited from a cardiology department. Four observers (two chiropractors and two chiropractic students) performed a physical examination and an extended manual examination of the spine and chest wall. Percentage agreement, Cohen's Kappa and ICC were calculated for observer pairs and overall. Musculoskeletal chest pain was diagnosed in 44.0 % of patients. Interobserver kappa values were substantial for the chiropractors and overall, and moderate for the students. For single items of the protocol, both pairs showed fair to substantial agreement regarding pain provocation tests and poor to fair agreement regarding spinal segmental dysfunction tests. Conclusions. Suspected musculoskeletal chest pain can be identified with substantial interobserver reliability using this standardized protocol if used by experienced and trained observers. Agreement for individual components of the protocol showed, however, considerable variation. Provided training of observers, the examination protocol can be used in selected patients and can be implemented in pre- and post-graduate clinical training


Orthopaedic Proceedings
Vol. 93-B, Issue SUPP_I | Pages 38 - 38
1 Jan 2011
Qureshi A Roberts A
Full Access

The purpose of this study was to assess the interobserver reliability of the Sauvegrain skeletal age assessment. Elbow radiographs requested to exclude injury were anonymised. Sixteen examinations were assessed by ten independent orthopaedic specialist registrars or consultants. The Sauvegrain method as modified by Dimeglio was used to score the radiographs. The observations made were then assessed for interobserver reliability by means of a multiple-observer kappa score, and the total scores by intra-class correlation coefficient. Kappa scores for the components of the score were 0.403 for the lateral condyle, 0.492 for the trochlea, 0.354 for the proximal radius and 0.508 for the olecranon. Adding item scores to produce a modified Sauvegrain score had an intra-class reliability of 0.858 (95% CI 0.758 to 0.935). Methods of identifying skeletal maturation and predicting future growth generally depend on the use of an atlas of hand radiographs. Difficulties with poor interobserver reliability associated with these methods have led to a move towards assessments that do not depend upon bone age estimations. Unfortunately, plans based on ratios of growth or average patterns produce errors when unusual types of growth disturbance are present. We conclude that use of a scoring system for maturation assessed by elbow radiographs offers a significant advantage when substituted into the straight-line method of growth prediction. The Sauvegrain method as modified by Dimeglio [1] has demonstrated an excellent level of interobserver reliability. We have used Sauvegrain scores to improve the accuracy of timing when using the Moseley straight-line method.


Orthopaedic Proceedings
Vol. 87-B, Issue SUPP_I | Pages 69 - 69
1 Mar 2005
Viehweger E Hélix M Jacquemier M Scavarda D Rohon MA Scorsone-Pagny S
Full Access

Introduction: With the evolution and complexity of treatments in cerebral palsy (CP) patients, it is essential to assess their outcome using validated tools. Technical analysis offers objective data which may be combined with more subjective functional evaluation and health-related quality of life tests. Simplified visual tests were proposed as an alternative to the complex and expensive instrumented three-dimensional gait analysis. The Edinburgh Visual Gait Score (EVGS) was proposed for routine clinical use when complete technical analysis is not available, or as part of a global patient evaluation. The purposes of our study were: 1) to apply a French translation of the EVGS to standard video recordings of a group of independently walking spastic diplegic CP patients, 2) to evaluate the intraobserver and interobserver reliability and 3) to compare the results of observers experienced and inexperienced in gait analysis. Material & methods: A series of ten standard video recordings of spastic diplegic CP patients, acquired during routine clinical gait analysis, was examined twice by eight observers, with two weeks between the assessments. The observers came from the following specialties: three paediatric orthopaedic surgeons, one resident in orthopaedic surgery, one neurosurgeon, one physiatrist and two physiotherapists. Observers were separated into two groups according to their experience with gait analysis interpretation. Kappa statistics and intraclass correlation coefficients were calculated. Results: Better intraobserver and interobserver reliability was observed for foot and knee scores, with a significant difference between stance and swing phase results. Pelvis, hip and trunk score results were significantly lower. The interobserver reliability for segment scores and the global EVGS showed better results than the intraobserver reliability. The group of observers experienced in gait analysis showed significantly higher intraobserver and interobserver reliability. Discussion & conclusion: Our reliability results for the EVGS are close to the results of Read et al. Interestingly, we showed a significant difference between the two observer groups: observers familiar with gait analysis obtained better reliability results. This shows the importance of either being accustomed to clinical gait analysis interpretation, including learning to visualise the different gait phases, or receiving video analysis training before using the visual score as a standard clinical evaluation tool. For this study we did not use the patient preparation recommendations made by the original authors to improve scoring accuracy, because we wanted to test the possibility of using historic standard videos. The poor score reliability for the pelvis and hip may be improved. Further studies of multilevel surgery outcome evaluation by observers trained in visual analysis are needed to explore clinical changes in CP patients over time.


Orthopaedic Proceedings
Vol. 88-B, Issue SUPP_II | Pages 314 - 314
1 May 2006
Elkinson I Crawford H Barnes M Boxch P Ferguson J
Full Access

The aim was to evaluate the Intraobserver and Interobserver reliability of Pelvic Incidence as a fundamental parameter of sagittal spino-pelvic balance in patients with spondylolisthesis compared to controls with Idiopathic Adolescent Scoliosis. A blinded test retest study including multi-surgeon assessment of Pelvic Incidence in patients with spondylolisthesis and Idiopathic Adolescent Scoliosis was carried out. We assessed the agreement between the pelvic incidence measurements using the Bland and Altman method and mean differences (95% confidence interval) are reported. Forty patients seen at Starship Children’s Hospital between 1992 – 2003 by two spinal surgeons were retrospectively identified. The main group had 20 patients with spondylolisthesis (Isthmic and/or Dysplastic types) and the control group consisted of 20 patients with Idiopathic Adolescent Scoliosis. Five observers with different levels of experience included the two orthopaedic surgeons, one fellow, one senior trainee and one non-trainee registrar. Prior to the initial test phase, a consensus-building session was carried out. All five observers arrived at a standardised method for measuring the Pelvic Incidence. In the test phase randomly ordered lateral lumbosacral radiographs were independently evaluated by the five observers and pelvic incidence was measured. Assessment of the Pelvic Incidence was repeated one week later in the re-test phase. The radiographs were presented in a randomly pre-assigned order. Bland and Altman plots were constructed and mean differences (95% confidence interval) reported to evaluate the agreement between the Pelvic Incidence measurements among the five independent observers. All analysis was performed on the statistical software package SAS. P-value of 0.05 was considered statistically significant. The spondylolisthesis group had 11 (55%) males and 9 (45%) females with an average age of 14 ± 4.2. 
Two patients had high-grade (Meyerding Class III, IV, V) and 16 had low-grade (Meyerding Class I, II) spondylolisthesis. Two patients were post-reduction of spondylolisthesis. In the Scoliosis group there were 2 (10%) males and 18 (90%) females with an average age of 15 ± 2.9. There was no significant difference between males and females in pelvic incidence measurement (60° ± 18.7° vs. 57° ± 14.6°, p=0.540) or age (15 ± 2.9 vs. 14 ± 3.8, p=0.181). There was no difference in pelvic incidence across the Meyerding groups (p=0.257). There was a significant difference between spondylolisthesis and scoliosis pelvic incidence measurements (65° ± 15.6° vs. 51° ± 12.8°, p=0.003). In the Spondylolisthesis Group, the interobserver reliability between five clinicians, expressed as the mean difference in pelvic incidence measurement, was 0.6° (95% CI −0.81, 1.91) and was not significantly different from zero (p=0.423). The agreement limits were from −12.8° to 13.9°. The intraobserver reliability of pelvic incidence showed the mean difference ranging from −2.1° to 1.4° (p=0.129 and 0.333 with 95% CI). One observer had marginal evidence of a significant difference of 3.3° (95% CI 0.05° to 6.55°, p=0.047). In the Scoliosis Group, the interobserver reliability was 0.3° (95% CI −0.81, 1.49) and was not significantly different from zero (p=0.726). The agreement limits were from −11.0° to 11.6°. The intraobserver reliability among four observers ranged from −1.7° to 0.5° (p=0.178 and 0.661). One observer had a significant difference in readings of 4.1° (95% CI 0.70° to 7.40°, p=0.020). Scoliosis patients had a significantly smaller pelvic incidence than spondylolisthesis patients. The interobserver reliability of the pelvic incidence measurement was excellent across both groups. The intraobserver reliability was good with only one observer in each group demonstrating a marginally significant difference. 
Pelvic incidence is therefore a reliable measurement which can be used as a predictor in progression of spondylolisthesis
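The mean differences and agreement limits quoted above come from the Bland and Altman method. A minimal sketch of the usual calculation follows: the bias is the mean of the paired differences and the limits of agreement are the bias ± 1.96 standard deviations of those differences. The 1.96 multiplier is the conventional choice and is assumed here; the abstract does not state it, and the function name and sample angles are illustrative rather than study data.

```python
from statistics import mean, stdev


def bland_altman(first, second, z=1.96):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length series of at least two values")
    # Paired differences between the two sets of measurements.
    differences = [a - b for a, b in zip(first, second)]
    bias = mean(differences)
    spread = stdev(differences)  # sample standard deviation
    return bias, bias - z * spread, bias + z * spread


# Hypothetical pelvic incidence readings in degrees, test and re-test.
test_phase = [50, 60, 70, 80]
retest_phase = [48, 61, 69, 82]
bias, lower, upper = bland_altman(test_phase, retest_phase)
print(f"bias {bias:.1f} deg, limits {lower:.1f} to {upper:.1f} deg")
```

A bias close to zero with wide limits, as in both groups above, means the observers do not differ systematically even though individual readings can differ by more than ten degrees.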


Orthopaedic Proceedings
Vol. 88-B, Issue SUPP_I | Pages 171 - 171
1 Mar 2006
Sanchez R Salcedo C Martinez M Molina J Vera F Villarreal J
Full Access

Introduction and objectives: The purpose of this research was to assess agreement and reproducibility among 5 observers who graded 51 open fractures using two open fracture classifications for long bones (Gustilo and Aybar), and to interpret the results obtained with both classifications. Material and Method: A classification protocol was established for open fractures. The fractures were graded independently using each of the systems being evaluated (Gustilo and Aybar), by viewing slides with clinical and radiological images together with a report of the data in the clinical history. The survey was conducted twice, one to eight weeks apart. Five members of the Orthopedic and Traumatologic Surgery Department (OTSD) were questioned (1 Professor, 2 Specialists and 2 Residents). The statistical methods used to analyse the results were the interobserver agreement percentage and the inter- and intraobserver kappa index. Results: The interobserver agreement percentage was 58.82% for the Gustilo classification and 39.21% for the Aybar classification. The kappa index for interobserver agreement was 0.51 for the Gustilo classification and 0.54 for the Aybar classification. The kappa index for intraobserver reproducibility was 0.69 for the Gustilo classification and 0.58 for the Aybar one. Conclusions: The interobserver agreement was considered moderate-poor for the Gustilo and Aybar classifications. The intraobserver reproducibility was considered substantial for the Gustilo classification and moderate for the Aybar one. We conclude that this agreement shows too much variability to accept just one classification as the only valid method for making therapeutic decisions or for comparing results. Therefore, it is necessary to create a more detailed and careful classification which is quick to use, reliable and reproducible, and which contains more objective criteria.


Orthopaedic Proceedings
Vol. 84-B, Issue SUPP_I | Pages 46 - 46
1 Mar 2002
Lautman S Faizon G Roger R Rosset P
Full Access

Purpose: Classifications of fractures of the thoracolumbar spine are theoretically designed to help make therapeutic decisions. Three classifications (J. Laulan, F. Denis, F. Magerl) were compared to assess reproducibility for use by a surgery team. Material and methods: The classifications were described during a SOFCOT symposium in 1995. Four observers examined 60 files, reading them twice at a 1 month interval. The files included plain radiographs (AP and lateral view) and a CT scan series and were read in random order. Intra- and interobserver concordance were measured with the kappa method. Results: Intra- and interobserver reproducibility was good for the classification proposed by F. Denis (kappa = 0.6229 and 0.0795) for classification groups but was weak for subgroups (kappa = 0.028 and 0.571). Reproducibility was moderate for the classification proposed by J. Laulan (interobserver kappa = 0.460, intraobserver kappa = 0.541). The Magerl classification produced low to negligible reproducibility for classification groups and subgroups (intra- and interobserver kappa = 0.138 to 0.0343). Discussion: Because of its low to negligible reproducibility, the Magerl classification would be difficult to use in clinical practice to make coherent therapeutic decisions or for scientific research to analyze series of fractures treated using this classification. The reproducibility of the F. Denis classification was good for groups but low for subgroups that include fractures resulting from different mechanisms requiring radically different treatment strategies. This is a good classification system for descriptive work but can lead to treatments poorly adapted to the causal mechanism of the fracture. The reproducibility of the J. Laulan classification is moderate but each group in this classification corresponds to fractures caused by the same mechanism. Therapeutic indications determined with this system would be more coherent.


Orthopaedic Proceedings
Vol. 96-B, Issue SUPP_11 | Pages 315 - 315
1 Jul 2014
Dhooge Y Wentink N Theelen L van Hemert W Senden R
Full Access

Summary. The ankle X-ray has moderate diagnostic power to identify syndesmotic instability, showing large sensitivity ranges between observers. Classification systems and radiographic measurements showed moderate to high interobserver agreement, with extended classifications performing worse. Introduction. There is no consensus regarding the diagnosis and treatment of ankle fractures with respect to syndesmotic injury. The diagnosis of syndesmotic injury is currently based on intraoperative findings. Surgical indication is mainly made by ankle X-ray assessment, using several classification systems and radiographic measurements. Misdiagnosis of the injury results in suboptimal treatment, which may lead to chronic complaints such as instability and osteoarthritis. This study investigates the diagnostic power and interobserver agreement of three classification methods and two radiographic measures currently used to assess ankle X-rays and to identify syndesmotic injury. Patients and Methods. Twenty patients (43.2 ± 15.3 yrs) with an ankle fracture, indicated for surgery, were prospectively included. All patients received a preoperative ankle X-ray, which was assessed by several observers: two orthopaedic surgeons, one trauma surgeon and two radiologists. The ankle X-ray was assessed for syndesmotic injury/stability and presence of fractures (fibula, medial/tertius malleolus). Three classification systems were used: Weber, AO-Müller (short version n=3 options; extended version n=27 options) and Lauge-Hansen (short version n=5 options; extended version n=17 options). Two radiographic measurements were made: tibiofibular overlap (TFO) and the ratio of medial clear space to superior clear space (MCS/SCS). All observers were instructed about the assessments before the measurements. During surgery, a proper intraoperative description of the syndesmosis was noted. Agreement (%), Intraclass Correlation Coefficients (ICC) and Kappa were calculated to determine interobserver agreement. Kappa statistics were interpreted according to Landis and Koch. To test the diagnostic power of ankle X-rays to identify syndesmotic instability, sensitivity and specificity were calculated with intraoperative findings serving as the gold standard. Results. Six of 20 ankles showed syndesmotic instability intraoperatively. An overall sensitivity of 43% (specificity: 78%) was found for X-rays in identifying syndesmotic instability, showing a wide range in sensitivity between observers (17–83%), with radiologists performing better (range 50–83%) than surgeons (range 17–33%). Overall, substantial to perfect interobserver agreement (range 70–100%) was found for all short classification systems, showing an average kappa ≥0.60. The agreement reduced for more extended classification systems. For example, observer agreement for the AO-Müller classification with 3, 9 and 27 options was 85% (kappa 0.66), 68% (kappa 0.57) and 55% (kappa 0.51), respectively. One observer deviated slightly from the others in all classification assessments. Removing this observer resulted in excellent agreement for all classification systems (>90%). Radiographic measurements showed moderate to high interobserver agreement, with TFO performing best (avg. ICC 0.88). Discussion/Conclusion. In ankle fractures, a preoperative X-ray has low sensitivity in detecting syndesmotic instability, showing large sensitivity ranges between observers. Further study is needed to investigate the contribution of classification systems in determining the best treatment method for syndesmotic injury. Ankle X-ray assessment using the three classification systems and radiographic measures was consistent among observers. Disagreement between observers can be attributed to intrinsic differences among the systems (e.g. stepwise classification vs. single assessment). No preference for one specific classification was found, as all showed comparable interobserver agreement. However, classification systems with few options are recommended, as observer agreement reduced with more extended classifications.
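Sensitivity and specificity against the intraoperative finding can be sketched as below. With only six unstable ankles, each observer's sensitivity can only move in steps of one sixth; the reported per-observer values of 17%, 33%, 50% and 83% are consistent with 1, 2, 3 and 5 of the six being detected, which is an arithmetic inference and not a figure given in the abstract. The function name and the counts in the example are hypothetical.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp/fn: unstable ankles (per the reference standard) that were or were
    not flagged on X-ray; tn/fp: stable ankles correctly or wrongly flagged.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical observer: 6 unstable and 14 stable ankles, as in the cohort,
# with made-up detection counts.
sens, spec = sensitivity_specificity(tp=1, fn=5, tn=11, fp=3)
print(f"sensitivity {sens:.0%}, specificity {spec:.0%}")
```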


Orthopaedic Proceedings
Vol. 88-B, Issue SUPP_I | Pages 187 - 187
1 Mar 2006
Maguire M Mohil R Ng A Hodgson S
Full Access

The AO, Frykman, Mayo and Fernandez classification systems for distal radius fractures were evaluated for interobserver reliability and intraobserver reproducibility using plain radiographs. Five orthopaedic consultants, five orthopaedic registrars and five orthopaedic senior house officers classified 20 sets of distal radius fractures on two separate occasions. There were 2400 individual observations. Kappa statistics were used to establish a relative level of agreement between observers for the two readings and between separate readings by the same observer. Our results for intraobserver reproducibility showed a kappa value of 0.49 for Fernandez, 0.47 for Frykman, 0.45 for Mayo and 0.33 for AO. A value of 0.4 shows good consistency according to well-recognised statistical boundaries and is significant; that is, reproducibility occurred at a level greater than chance. Interobserver kappa values were poor in all classification systems. We also sought to look at variables within grade of surgeon and developed kappa values for these also.


Orthopaedic Proceedings
Vol. 106-B, Issue SUPP_11 | Pages 12 - 12
4 Jun 2024
Chapman J Choudhary Z Gupta S Airey G Mason L
Full Access

Introduction

Treatment pathways of 5th metatarsal fractures are commonly directed by fracture classification, with Jones types, for example, requiring closer observation and possibly more aggressive management.

Primary objective

To investigate the reliability of assessment of subtypes of 5th metatarsal fractures by different observers.


Orthopaedic Proceedings
Vol. 92-B, Issue SUPP_I | Pages 27 - 27
1 Mar 2010
Cunningham MR Quirno M Bendo J Steiber J
Full Access

Purpose: Facet joint arthrosis is an entity that can have a key role in the etiology of low back pain, especially with hyperextension, and is a key component of surgical planning, especially when considering disc arthroplasty. Plain films and MRI are most commonly utilized as the initial imaging of choice for low back pain, but these methods may not truly allow an accurate assessment of facet arthrosis. Our purpose was to observe the inter- and intraobserver reliability of utilizing CT and MRI to evaluate facet arthrosis, the inter- and intraobserver reliability of the facet grading system, and the agreement of surgeons as to when to perform disc arthroplasty after the lumbar facets are evaluated. Method: A power analysis was performed which showed we would need 6 reviewers and 43 images to have 80% power to show excellent reliability. 102 CT and the corresponding MRI images of lumbar facets were obtained from patients who were to undergo lumbar spine surgery of any type. 10 spine surgeons and 3 spine fellows reviewed the randomized images at 2 time points, 3 months apart, graded the facet arthrosis and indicated whether they would choose to perform a disc arthroplasty based on the amount of facet arthrosis. Both interobserver and intraobserver kappa values were calculated by comparing results between observers at the two time points and between CT and MRI images from the same patient. Results: Interobserver reliability for MRI was 0.21 and 0.07 (fair to slight agreement), and for CT was 0.33 and 0.27 (fair agreement), for the spine surgeons and spine fellows respectively. The mean intraobserver reliability for MRI was 0.36 and 0.26 (fair agreement) and for CT was 0.52 and 0.51 (moderate agreement). The kappa value for agreement on whether to perform a disc arthroplasty after grading the facet arthrosis utilizing MRI was 0.22 (fair agreement) and utilizing CT was 0.33 (fair agreement) among the senior spine surgeons. 
Conclusion: The existing grading system for facet arthrosis and of whether to perform a disc arthroplasty utilizing the grading system has at best only fair agreement. CT is more reliable for grading facet arthrosis


Orthopaedic Proceedings
Vol. 94-B, Issue SUPP_XXIV | Pages 16 - 16
1 May 2012
Rajan R Chandrasenan J Metcalfe J Konstantoulakis C
Full Access

The purpose of our study was to independently assess the modified Herring lateral pillar classification (LPC). Methods and results. 35 standardised true antero-posterior radiographs of children in various stages of fragmentation were independently assessed by 6 senior observers on 2 separate occasions (6 weeks apart). Kappa analysis was used to assess the inter- and intraobserver agreement between observations made. Intraobserver analysis revealed at best only moderate agreement for two observers. 3 observers showed fair consistency, whilst the 1 remaining observer showed poor consistency between repeated observations (p<0.01). The highest scores for interobserver agreement, varying between moderate and good, could only be established between 2 observers. For the remaining observers results were just fair (p<0.01). Conclusion. This study highlights the lack of agreement between senior clinicians when applying the modified LPC. This clearly has clinical implications. To our knowledge this is the first time the modified lateral pillar classification has been independently tested for its reproducibility by a specialist orthopaedic unit.


Orthopaedic Proceedings
Vol. 94-B, Issue SUPP_XXXVII | Pages 207 - 207
1 Sep 2012
Chandrasenan J Rajan R Price K
Full Access

The lateral pillar classification (LPC) is a widely used tool in determining prognosis and planning treatment in patients who are in the fragmentation stage of Perthes disease. The original classification has been modified to help increase the accuracy of the classification system by the Herring group. The purpose of our study was to independently assess this modified Herring classification. 35 standardized true antero-posterior radiographs of children in various stages of fragmentation were independently assessed by 6 senior observers on 2 separate occasions (6 weeks apart). Kappa analysis was used to assess the inter and intraobserver agreement between observations made. The degrees of agreement were as follows: poor, fair, moderate, good and very good. Intraobserver analysis revealed at best only moderate agreement for two observers. 3 observers showed fair consistency, whilst 1 remaining observer showed poor consistency between repeated observations (p<0.01). The highest scores for interobserver agreement varying between moderate to good could only be established between 2 observers. For the remaining observers results were just fair (p<0.01). This study highlights the lack of agreement between senior clinicians when applying the modified LPC. This has clinical implications when applying the classification to the decision making process in treating patients at risk of developing adverse outcomes from the disease. To our knowledge, this is the first time the modified LPC has been independently tested for its reproducibility by another specialist paediatric orthopaedic unit


Orthopaedic Proceedings
Vol. 85-B, Issue SUPP_II | Pages 151 - 151
1 Feb 2003
Al-lami M Fourie B Koreli A Finn P Wilson S Gregg P
Full Access

The Department of Health and the Public Health Laboratory Service established the Nosocomial Infection National Surveillance Scheme (NINSS) in response to the need to standardise the collection of information about infections acquired in hospital. This would provide national data that could be used as a ‘benchmark’ by hospitals to measure their own performance. The definition of superficial incisional infection (skin and subcutaneous tissue), set by the Centers for Disease Control (CDC), requires at least one of the following criteria to be met: I: Purulent drainage from the superficial incision. II: The superficial incision yields organisms from the culture of aseptically aspirated fluid or tissue, or from a swab, and pus cells are present. III: At least two of the following symptoms and signs of inflammation: pain or tenderness, localized swelling, redness or heat, and a. the superficial incision is deliberately opened by a surgeon to manage the infection, unless the incision is culture-negative or b. clinician’s diagnosis of superficial incisional infection. This study assessed the interobserver reliability of the superficial incisional infection criteria, set by the CDC, in current practice. The incisional sites of 50 consecutive patients who underwent elective primary joint arthroplasty (hips and knees) were evaluated independently by four observers. The most significant results of the study were: I: All four observers achieved absolute agreement (kappa=1) for purulent wound discharge and clinical diagnosis of wound infection. II: The four observers obtained good agreement for the pain criterion (kappa=0.76). III: There was significant disagreement (fair to poor) between all four observers for the following criteria: localized swelling (kappa=0.34), redness (kappa=0.33) and tenderness (kappa=0.05).
This is the first study to assess the reliability of the criteria, as set by the CDC and recommended by NINSS, for the diagnosis of superficial incisional infection. It shows that Criterion III is not reliable, and we recommend that it be revised. Failure to do so could lead to inaccurate statistics regarding hospital wound infection and a detrimental effect on hospital trusts in the setting of league tables.
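The abstract above reports a single kappa per criterion for four observers but does not say how it was obtained. One common way to summarise agreement among more than two raters is Fleiss' kappa; the minimal Python sketch below assumes that statistic purely for illustration. The function name and the example counts are hypothetical and are not data from the study.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa.

    counts[i][j] is the number of raters who assigned subject i to category j.
    Every subject must be rated by the same number of raters (at least two).
    """
    if not counts:
        raise ValueError("at least one subject is required")
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each subject needs the same number (>= 2) of ratings")
    n_categories = len(counts[0])

    # Per-subject agreement: share of rater pairs that agree on that subject.
    per_subject = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    mean_agreement = sum(per_subject) / n_subjects

    # Chance agreement from the overall proportion of ratings per category.
    proportions = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(n_categories)
    ]
    chance = sum(p * p for p in proportions)
    if chance == 1.0:
        return 1.0  # every rating fell in one category
    return (mean_agreement - chance) / (1.0 - chance)


# Hypothetical: 5 wounds, 4 observers, columns = [sign present, sign absent].
redness_counts = [[4, 0], [0, 4], [2, 2], [3, 1], [0, 4]]
print(round(fleiss_kappa(redness_counts), 3))
```

When all observers agree on every wound, as reported for purulent discharge, the function returns 1.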


Orthopaedic Proceedings
Vol. 87-B, Issue SUPP_II | Pages 120 - 120
1 Apr 2005
Tourraine D Poilbout N Racineux P Toulemonde J Massin P
Full Access

Purpose: We tested the reliability of a digitalised x-ray reading system, Imagika®, used to measure linear wear of total hip arthroplasty on the AP view of the pelvis. Material and methods: Wear measurements were taken for total hip arthroplasties without cement (n=20) and with cement (n=19) using the distance between the centre of the acetabular cup and the femoral ball. The system delivered measures in hundredths of millimetres that were rounded off to the nearest tenth of a millimetre. For non-cemented implants, the centre of the acetabular cup was found automatically on the digitalised radiograms using the contour of the metal socket. For cemented cups, the centre of the cup was determined from five points situated on the metallic ellipse included in the polyethylene circumference. The software placed the point clicked by the reader on the adjacent intermediary zone showing the greatest contrast. Five observers read the radiograms twice at a 15-day interval. The observers were a young resident, a senior traumatology surgeon, and a senior surgeon specialised in hip surgery. Results were compared to determine inter- and intra-observer variability. Results: Intra-observer variability was low, since the standard deviation (with alpha error set at 5%) ranged from one tenth of a millimetre to six tenths of a millimetre for four observers. It was higher (2 millimetres) for the remaining observer. The younger observers achieved the best reproducibility, of the order of a tenth of a millimetre. Conversely, interobserver variability was high, with a standard deviation of several millimetres for an alpha risk of 5%. Comparing the two observers who achieved the best performances, the standard deviation of the measures was in the 3 to 4 millimetre range. Discussion: Measurement precision was greater for cemented cups.
Conversely, for press-fit cups, the contour of the head was sometimes difficult to distinguish even with optimal contrast, and measurement deviations were of the order of one millimetre. Conclusion: The reproducibility of the Imagika® system is insufficient to measure wear of total hip arthroplasty, where the precision must be of the order of a tenth of a millimetre.


Orthopaedic Proceedings
Vol. 105-B, Issue SUPP_12 | Pages 27 - 27
23 Jun 2023
Chen K Wu J Xu L Han X Chen X
Full Access

To propose a modified approach to measuring femoro-epiphyseal acetabular roof (FEAR) index while still abiding by its definition and biomechanical basis, and to compare the reliabilities of the two methods. To propose a classification for medial sourcil edges.

We retrospectively reviewed a consecutive series of patients treated with periacetabular osteotomy and/or hip arthroscopy. A modified FEAR index was defined. Lateral center-edge angle, Sharp's angle, Tonnis angle on all hips, as well as FEAR index with original and modified approaches were measured. Intra- and inter-observer reliability were calculated as intraclass correlation coefficients (ICC) for FEAR index with both approaches and other alignments. A classification was proposed to categorize medial sourcil edges. ICC for the two approaches across different sourcil groups were also calculated.

After reviewing 411 patients, 49 were finally included. Thirty-two patients (40 hips) were identified as having borderline dysplasia defined by an LCEA of 18 to 25 degrees. Intra-observer ICC for the modified method were good to excellent for borderline hips; poor to excellent for DDH; moderate to excellent for normal hips. As for inter-observer reliability, the modified approach outperformed the original approach, with moderate to good inter-observer reliability (DDH group, ICC=0.636; borderline dysplasia group, ICC=0.813; normal hip group, ICC=0.704). The medial sourcils were classified into 3 groups according to their morphology. Type II (39.0%) and III (43.9%) sourcils were the dominant patterns. The sourcil classification had substantial intra-observer agreement (observer 4, kappa=0.68; observer 1, kappa=0.799) and moderate inter-observer agreement (kappa=0.465). The modified approach to the FEAR index showed greater inter-observer reliability in all medial sourcil patterns.

The modified FEAR index has better intra- and inter-observer reliability compared with the original approach. Type II and III sourcils account for the majority, to which only the modified approach is applicable.
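For readers unfamiliar with the intraclass correlation coefficient used in the abstract above, here is a minimal Python sketch of one common form, ICC(2,1) (two-way random effects, absolute agreement, single measurement). The abstract does not state which ICC form was used, so this choice is an assumption, as is the function name. The ratings matrix is illustrative only and is not data from the study.

```python
def icc_2_1(ratings):
    """ICC(2,1) from a subjects x raters matrix with no missing values."""
    n = len(ratings)       # number of subjects (e.g. hips)
    k = len(ratings[0])    # number of raters
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a complete matrix with >= 2 subjects and raters")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares: subjects, raters and residual error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Illustrative matrix: 6 subjects (rows) measured by 4 raters (columns).
example = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
print(round(icc_2_1(example), 2))
```

In this form, systematic differences between raters lower the coefficient, because it measures absolute agreement rather than consistency alone.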


Bone & Joint Research
Vol. 9, Issue 5 | Pages 242 - 249
1 May 2020
Bali K Smit K Ibrahim M Poitras S Wilkin G Galmiche R Belzile E Beaulé PE

Aims

The aim of the current study was to assess the reliability of the Ottawa classification for symptomatic acetabular dysplasia.

Methods

In all, 134 consecutive hips that underwent periacetabular osteotomy were categorized using a validated software (Hip2Norm) into four categories of normal, lateral/global, anterior, or posterior. A total of 74 cases were selected for reliability analysis, and these included 44 dysplastic and 30 normal hips. A group of six blinded fellowship-trained raters, provided with the classification system, looked at these radiographs at two separate timepoints to classify the hips using standard radiological measurements. Thereafter, a consensus meeting was held where a modified flow diagram was devised, before a third reading by four raters using a separate set of 74 radiographs took place.


Orthopaedic Proceedings
Vol. 98-B, Issue SUPP_4 | Pages 25 - 25
1 Jan 2016
Stevens A Wilson C Shunmugam M Ranawat V Krishnan J
Full Access

Inter- and intra-observer variation has been noted in the analysis of radiographic examinations with regard to experience of surgeons, and the monitors used for conducting the evaluations. The aim of this study is to evaluate inter/intra observer variation in the measurement of mechanical alignment from long-leg radiographs.

40 patients from the elective waiting list for TKA underwent long leg radiographs pre-operatively and 6 months post-operatively (total of 80 radiographs). The x-rays were analysed by 5 observers ranging in experience from medical student to head orthopaedic surgeon. Two observers re-analysed their results 6 months later to determine intraobserver correlation, and one observer re-measured the alignment on a different monitor. These measurements were all conducted blindly and none of the observers had access to the others’ results.

80 radiographs were analysed in total, 40 pre-op and 40 post-op. The mechanical alignment was analysed using Pearson's correlation (r = 0 no agreement, r = 1 perfect agreement) and revealed that experience as an orthopaedic surgeon has little effect on the measurement of mechanical alignment from long leg radiographs. The results for the different monitor analysis were also analysed using Pearson's correlation of long leg alignment. Monitor quality does seem to affect the correlation between alignment measurements when reviewing both intra- and inter-observer correlation on different computer monitors.

Surgical experience has little impact on the measurement of alignment on long leg radiographs. Of greater concern is that monitors of different resolution can affect measurement of mechanical alignment. As there might be a range of monitors in use in different institutions, and from outpatient clinics to surgical theatres, close attention should be paid to the implications of these results.
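The Pearson correlation used in the abstract above can be sketched in a few lines of Python. The function name and the alignment angles are hypothetical and are not data from the study. The example also reports the mean absolute difference between observers, because Pearson's r measures linear association and is unaffected by a constant offset between two observers.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equally long numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Hypothetical mechanical alignment angles (degrees) for five radiographs.
observer_a = [178.0, 181.5, 175.0, 183.0, 179.5]
# A second observer who reads every radiograph exactly 2 degrees higher.
observer_b = [angle + 2.0 for angle in observer_a]

r = pearson_r(observer_a, observer_b)
mean_abs_diff = sum(abs(a - b) for a, b in zip(observer_a, observer_b)) / len(observer_a)
print(round(r, 3), mean_abs_diff)
```

Here r is 1 to within rounding even though the two observers differ by 2 degrees on every radiograph, which is why a difference-based summary is often reported alongside the correlation.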