
Aims. Classifying trochlear dysplasia (TD) is useful to determine the treatment options for patients suffering from patellofemoral instability (PFI). There is no consensus on which classification system is more reliable and reproducible for the purpose of guiding clinicians’ management of PFI. There are also concerns about the validity of the Dejour Classification (DJC), which is the most widely used classification for TD, having only a fair reliability score. The Oswestry-Bristol Classification (OBC) is a recently proposed system of classification of TD, and the authors report a fair-to-good interobserver agreement and good-to-excellent intraobserver agreement in the assessment of TD. The aim of this study was to compare the reliability and reproducibility of these two classifications. Methods. In all, six assessors (four consultants and two registrars) independently evaluated 100 axial MRIs of the patellofemoral joint (PFJ) for TD and classified them according to OBC and DJC. These assessments were again repeated by all raters after four weeks. The inter- and intraobserver reliability scores were calculated using Cohen’s kappa and Cronbach’s α. Results. Both classifications showed good to excellent interobserver reliability with high α scores. The OBC classification showed a substantial intraobserver agreement (mean kappa 0.628; p < 0.005) whereas the DJC showed a moderate agreement (mean kappa 0.572; p < 0.005). There was no significant difference in the kappa values when comparing the assessments by consultants with those by registrars, in either classification system. Conclusion. This large study from a non-founding institute shows both classification systems to be reliable for classifying TD based on axial MRIs of the PFJ, with the simple-to-use OBC having a higher intraobserver reliability score than that of the DJC. Cite this article: Bone Jt Open 2023;4(7):532–538


The Bone & Joint Journal
Vol. 100-B, Issue 5 | Pages 596 - 602
1 May 2018
Bock P Pittermann M Chraim M Rois S

Aims. Various radiological parameters are used to evaluate a flatfoot deformity and their measurements may differ. The aims of this study were to answer the following questions: 1) Which of the 11 parameters have the best inter- and intraobserver reliability in a standardized radiological setting? 2) Are pre- and postoperative assessments equally reliable? 3) What are the identifiable sources of variation? Patients and Methods. Measurements of the 11 parameters were recorded on anteroposterior and lateral weight-bearing radiographs of 38 feet before and after surgery for flatfoot, by three observers with different experience in foot surgery (A, ten years; B, three years; C, third-year orthopaedic resident). The inter- and intraobserver reliability was calculated. Results. Preoperative interobserver reliability was high for four, moderate for five, and low for two parameters. Postoperative interobserver reliability was high for four, moderate for five, and low for two parameters. Intraobserver reliability was excellent for all parameters preoperatively as recorded by observer A (PB) and B (MP), and for eight parameters as recorded by observer C (SR). Intraobserver reliability was excellent for ten parameters postoperatively as recorded by observer A and B, and for eight parameters as recorded by observer C. Conclusion. The following parameters can be recommended for preoperative and postoperative evaluation of flatfoot: anteroposterior, talonavicular coverage angle; lateral, talometatarsal I angle, calcaneal pitch angle, and cuneiform-medial height (high interobserver reliability); and anteroposterior, talometatarsal II angle; lateral, talocalcaneal angle, tibiocalcaneal angle (moderate interobserver reliability). For more experienced observers, we also recommend the anteroposterior talometatarsal I angle (moderate reliability). The inter- and intraobserver reliability for most parameters was similar pre- and postoperatively. 
The experience of the observer and the definition and ability to measure the parameters themselves were sources of variation. Cite this article: Bone Joint J 2018;100-B:596–602


The Journal of Bone & Joint Surgery British Volume
Vol. 91-B, Issue 6 | Pages 766 - 771
1 Jun 2009
Brunner A Honigmann P Treumann T Babst R

We evaluated the impact of stereo-visualisation of three-dimensional volume-rendering CT datasets on the inter- and intraobserver reliability assessed by kappa values on the AO/OTA and Neer classifications in the assessment of proximal humeral fractures. Four independent observers classified 40 fractures according to the AO/OTA and Neer classifications using plain radiographs, two-dimensional CT scans and with stereo-visualised three-dimensional volume-rendering reconstructions. Both classification systems showed moderate interobserver reliability with plain radiographs and two-dimensional CT scans. Three-dimensional volume-rendered CT scans improved the interobserver reliability of both systems to good. Intraobserver reliability was moderate for both classifications when assessed by plain radiographs. Stereo visualisation of three-dimensional volume rendering improved intraobserver reliability to good for the AO/OTA method and to excellent for the Neer classification. These data support our opinion that stereo visualisation of three-dimensional volume-rendering datasets is of value when analysing and classifying complex fractures of the proximal humerus


Orthopaedic Proceedings
Vol. 87-B, Issue SUPP_I | Pages 69 - 69
1 Mar 2005
Viehweger E Hélix M Jacquemier M Scavarda D Rohon MA Scorsone-Pagny S
Full Access

Introduction: With the evolution and the complexity of the treatments in cerebral palsy (CP) patients, it is essential to assess their outcome using validated tools. Technical analysis offers objective data which may be associated with more subjective functional evaluation and health-related quality of life tests. Simplified visual tests were proposed as an alternative to the complex and expensive instrumented three-dimensional gait analysis. The Edinburgh Visual Gait Score (EVGS) was proposed for routine clinical use when complete technical analysis is not available, or as part of a global patient evaluation. The purposes of our study were: 1) to apply a French translation of the EVGS to standard video recordings of a group of independently walking spastic diplegic CP patients, 2) to evaluate the intraobserver and interobserver reliability, and 3) to compare the results of gait analysis with experienced and inexperienced observers. Material & methods: A series of ten standard video recordings of spastic diplegic CP patients, acquired during routine clinical gait analysis, were examined twice by eight observers, with two weeks between the assessments. Observers were selected from the following specialties: three paediatric orthopaedic surgeons, one resident in orthopaedic surgery, one neurosurgeon, one physiatrist and two physiotherapists. Observers were separated into two groups according to their experience with gait analysis interpretation. Kappa statistics and intraclass correlation coefficients were calculated. Results: Better intraobserver and interobserver reliability was observed for foot and knee scores, with a significant difference between stance and swing phase results. Pelvis, hip and trunk score results were significantly lower. The interobserver reliability for segment scores and the global EVGS showed better results than the intraobserver reliability. 
The observer group experienced in gait analysis showed significantly higher intraobserver and interobserver reliability. Discussion & conclusion: Our reliability results for the EVGS are close to those of Read et al. Interestingly, we found a significant difference between the two observer groups: observers familiar with gait analysis obtained better reliability results. This shows the importance of either being accustomed to interpreting clinical gait analysis, including learning to visualise the different gait phases, or receiving video analysis training before using the visual score as a standard clinical evaluation tool. For this study we did not use the patient preparation recommendations that the initial authors proposed to improve scoring accuracy, because we wanted to test whether historic standard videos could be used. The poor score reliability for the pelvis and hip may be improved. Further studies of multilevel surgery outcome evaluation by observers trained in visual analysis are needed to explore clinical changes in CP patients over time


Orthopaedic Proceedings
Vol. 88-B, Issue SUPP_II | Pages 314 - 314
1 May 2006
Elkinson I Crawford H Barnes M Boxch P Ferguson J
Full Access

The aim was to evaluate the Intraobserver and Interobserver reliability of Pelvic Incidence as a fundamental parameter of sagittal spino-pelvic balance in patients with spondylolisthesis compared to controls with Idiopathic Adolescent Scoliosis. A blinded test retest study including multi-surgeon assessment of Pelvic Incidence in patients with spondylolisthesis and Idiopathic Adolescent Scoliosis was carried out. We assessed the agreement between the pelvic incidence measurements using the Bland and Altman method and mean differences (95% confidence interval) are reported. Forty patients seen at Starship Children’s Hospital between 1992 – 2003 by two spinal surgeons were retrospectively identified. The main group had 20 patients with spondylolisthesis (Isthmic and/or Dysplastic types) and the control group consisted of 20 patients with Idiopathic Adolescent Scoliosis. Five observers with different levels of experience included the two orthopaedic surgeons, one fellow, one senior trainee and one non-trainee registrar. Prior to the initial test phase, a consensus-building session was carried out. All five observers arrived at a standardised method for measuring the Pelvic Incidence. In the test phase randomly ordered lateral lumbosacral radiographs were independently evaluated by the five observers and pelvic incidence was measured. Assessment of the Pelvic Incidence was repeated one week later in the re-test phase. The radiographs were presented in a randomly pre-assigned order. Bland and Altman plots were constructed and mean differences (95% confidence interval) reported to evaluate the agreement between the Pelvic Incidence measurements among the five independent observers. All analysis was performed on the statistical software package SAS. P-value of 0.05 was considered statistically significant. The spondylolisthesis group had 11 (55%) males and 9 (45%) females with an average age of 14 ± 4.2. 
Two patients had high-grade (Meyerding Class III, IV, V) and 16 had low-grade (Meyerding Class I, II) spondylolisthesis. Two patients were post-reduction of spondylolisthesis. In the scoliosis group there were 2 (10%) males and 18 (90%) females with an average age of 15 ± 2.9. There was no significant difference between male and female pelvic incidence measurements (60° ± 18.7° vs. 57° ± 14.6°, p=0.540) or age (15 ± 2.9 vs. 14 ± 3.8, p=0.181). There was no difference in pelvic incidence across the Meyerding groups, p=0.257. There was a significant difference between spondylolisthesis and scoliosis pelvic incidence measurements (65° ± 15.6° vs. 51° ± 12.8°, p=0.003). In the spondylolisthesis group, the interobserver reliability between five clinicians, expressed as the mean difference in pelvic incidence measurement, was 0.6° (95% CI −0.81, 1.91) and was not significantly different from zero (p=0.423). The agreement limits were from −12.8° to 13.9°. The intraobserver reliability of pelvic incidence showed the mean difference ranging from −2.1° to 1.4° (p=0.129 and 0.333 with 95% CI). One observer showed marginal evidence of a significant difference of 3.3° (95% CI 0.05° to 6.55°, p=0.047). In the scoliosis group, the interobserver reliability was 0.3° (95% CI −0.81, 1.49) and was not significantly different from zero (p=0.726). The agreement limits were from −11.0° to 11.6°. The intraobserver reliability among four observers ranged from −1.7° to 0.5° (p=0.178 and 0.661). One observer had a significant difference in readings of 4.1° (95% CI 0.70° to 7.40°, p=0.020). Scoliosis patients had a significantly smaller pelvic incidence than spondylolisthesis patients. The interobserver reliability of the pelvic incidence measurement was excellent across both groups. The intraobserver reliability was good, with only one observer in each group demonstrating a marginally significant difference. 
Pelvic incidence is therefore a reliable measurement which can be used as a predictor in progression of spondylolisthesis
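
The Bland and Altman agreement analysis named in this abstract can be illustrated with a minimal Python sketch. It assumes the conventional definition (bias = mean of the paired differences, 95% limits of agreement = bias ± 1.96 standard deviations of the differences); the abstract does not give the authors' exact computation, and the function name and sample angles below are hypothetical, not study data.

```python
from statistics import mean, stdev

def bland_altman(first_reading, second_reading):
    """Return (bias, lower_limit, upper_limit) for paired measurements.

    Assumes the conventional Bland-Altman definition: the bias is the mean
    of the paired differences and the 95% limits of agreement are
    bias +/- 1.96 * SD of those differences.
    """
    if len(first_reading) != len(second_reading) or len(first_reading) < 2:
        raise ValueError("need two equal-length series with at least 2 pairs")
    diffs = [a - b for a, b in zip(first_reading, second_reading)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)  # sample SD (n - 1 denominator)
    return bias, bias - spread, bias + spread

# Hypothetical pelvic incidence readings in degrees (illustrative only).
test_phase = [60, 55, 70, 48]
retest_phase = [58, 57, 69, 50]
bias, lower, upper = bland_altman(test_phase, retest_phase)
print(f"bias={bias:.2f}, limits of agreement=({lower:.2f}, {upper:.2f})")
```

A bias close to zero with narrow limits indicates good agreement; whether a given width is acceptable is a clinical judgement rather than a statistical one.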


Orthopaedic Proceedings
Vol. 92-B, Issue SUPP_I | Pages 27 - 27
1 Mar 2010
Cunningham MR Quirno M Bendo J Steiber J
Full Access

Purpose: Facet joint arthrosis is an entity that can have a key role in the etiology of low back pain, especially with hyperextension, and is a key component of surgical planning, especially when considering disc arthroplasty. Plain films and MRI are most commonly utilized as the initial imaging of choice for low back pain, but these methods may not truly allow an accurate assessment of facet arthrosis. Our purpose was to observe the inter- and intraobserver reliability of utilizing CT and MRI to evaluate facet arthrosis, the inter- and intraobserver reliability of the facet grading system, and the agreement of surgeons as to when to perform disc arthroplasty after the lumbar facets are evaluated. Method: A power analysis showed that we would need 6 reviewers and 43 images to have 80% power to show excellent reliability. 102 CT and the corresponding MRI images of lumbar facets were obtained from patients who were to undergo lumbar spine surgery of any type. 10 spine surgeons and 3 spine fellows reviewed the randomized images at 2 time points, 3 months apart, graded the facet arthrosis, and indicated whether they would choose to perform a disc arthroplasty based on the amount of facet arthrosis. Both interobserver and intraobserver kappa values were calculated by result comparison between observers at the two time points and between CT and MRI images from the same patient. Results: Interobserver reliability for MRI was 0.21 and 0.07 (fair to slight agreement), and for CT was 0.33 and 0.27 (fair agreement), for the spine surgeons and spine fellows respectively. The mean intraobserver reliability for MRI was 0.36 and 0.26 (fair agreement) and for CT was 0.52 and 0.51 (moderate agreement). The kappa value for agreement on whether to perform a disc arthroplasty after grading the facet arthrosis utilizing MRI was 0.22 (fair agreement) and utilizing CT was 0.33 (fair agreement) among the senior spine surgeons. 
Conclusion: The existing grading system for facet arthrosis and of whether to perform a disc arthroplasty utilizing the grading system has at best only fair agreement. CT is more reliable for grading facet arthrosis


Orthopaedic Proceedings
Vol. 88-B, Issue SUPP_I | Pages 171 - 171
1 Mar 2006
Sanchez R Salcedo C Martinez M Molina J Vera F Villarreal J
Full Access

Introduction and objectives: The purpose of the research is to show the agreement and reproducibility among 5 observers when they are questioned about 51 open fractures using two open fracture classifications for long bones (Gustilo and Aybar), interpreting the results obtained between both classifications. Material and Method: A classification protocol is established for open fractures. The fractures are graded independently using each of the systems being evaluated (Gustilo and Aybar), by visualising slides with clinical and radiologic images in addition to a report of the data in the clinical history. The survey is conducted twice with a time difference of one to eight weeks. 5 members of the Orthopedic and Traumatologic Surgery Department (OTSD) were questioned (1 Professor, 2 Specialists and 2 Residents). The statistical methods used to analyse the results were the interobserver agreement percentage and the inter- and intraobserver kappa index. Results: The interobserver agreement percentage was 58.82% for the Gustilo classification and 39.21% for the Aybar classification. The kappa index for the interobserver agreement was 0.51 for the Gustilo classification and 0.54 for the Aybar classification. The kappa index for the intraobserver reproducibility was 0.69 for the Gustilo classification and 0.58 for the Aybar one. Conclusions: The interobserver agreement was considered moderate-poor for the Gustilo and Aybar classifications. The intraobserver reproducibility was considered substantial for the Gustilo classification and moderate for the Aybar one. We conclude that this agreement shows too much variability to accept just one classification as the only valid method for making therapeutic decisions or for comparing results. Therefore, it is necessary to create a more detailed and careful classification, which is quick to use, reliable and reproducible, and which contains more objective criteria


Orthopaedic Proceedings
Vol. 94-B, Issue SUPP_XXIV | Pages 16 - 16
1 May 2012
Rajan R Chandrasenan J Metcalfe J Konstantoulakis C
Full Access

The purpose of our study was to independently assess the modified Herring lateral pillar classification. Methods and results. 35 standardised true antero-posterior radiographs of children in various stages of fragmentation were independently assessed by 6 senior observers on 2 separate occasions (6 weeks apart). Kappa analysis was used to assess the inter- and intraobserver agreement between observations made. Intraobserver analysis revealed at best only moderate agreement for two observers. 3 observers showed fair consistency, whilst 1 remaining observer showed poor consistency between repeated observations (p<0.01). The highest scores for interobserver agreement, varying between moderate and good, could only be established between 2 observers. For the remaining observers results were just fair (p<0.01). Conclusion. This study highlights the lack of agreement between senior clinicians when applying the modified LPC. This clearly has clinical implications. To our knowledge this is the first time the modified lateral pillar classification has been independently tested for its reproducibility by a specialist orthopaedic unit


Orthopaedic Proceedings
Vol. 94-B, Issue SUPP_XXXVII | Pages 207 - 207
1 Sep 2012
Chandrasenan J Rajan R Price K
Full Access

The lateral pillar classification (LPC) is a widely used tool in determining prognosis and planning treatment in patients who are in the fragmentation stage of Perthes disease. The original classification has been modified to help increase the accuracy of the classification system by the Herring group. The purpose of our study was to independently assess this modified Herring classification. 35 standardized true antero-posterior radiographs of children in various stages of fragmentation were independently assessed by 6 senior observers on 2 separate occasions (6 weeks apart). Kappa analysis was used to assess the inter and intraobserver agreement between observations made. The degrees of agreement were as follows: poor, fair, moderate, good and very good. Intraobserver analysis revealed at best only moderate agreement for two observers. 3 observers showed fair consistency, whilst 1 remaining observer showed poor consistency between repeated observations (p<0.01). The highest scores for interobserver agreement varying between moderate to good could only be established between 2 observers. For the remaining observers results were just fair (p<0.01). This study highlights the lack of agreement between senior clinicians when applying the modified LPC. This has clinical implications when applying the classification to the decision making process in treating patients at risk of developing adverse outcomes from the disease. To our knowledge, this is the first time the modified LPC has been independently tested for its reproducibility by another specialist paediatric orthopaedic unit


The Journal of Bone & Joint Surgery British Volume
Vol. 82-B, Issue 5 | Pages 636 - 642
1 Jul 2000
Wainwright AM Williams JR Carr AJ

We assessed the inter- and intraobserver variation in classification systems for fractures of the distal humerus. Three orthopaedic trauma consultants, three trauma registrars and three consultant musculoskeletal radiologists independently classified 33 sets of radiographs of such fractures on two occasions, each using three separate systems. For interobserver variation, the Riseborough and Radin system produced ‘moderate’ agreement (kappa = 0.513), but half of the fractures were not classifiable by this system. For the complete AO system, agreement was ‘fair’ (kappa = 0.343), but if only AO type and group or AO type alone was used, agreement improved to ‘moderate’ and ‘substantial’, respectively (kappa = 0.52 and 0.66). Agreement for the system of Jupiter and Mehne was ‘fair’ (kappa = 0.295). Similar levels of intraobserver variation were found. Systems of classification are useful in decision-making and evaluation of outcome only if there is agreement and consistency among observers. Our study casts doubt on these aspects of the systems currently available for fractures of the distal humerus
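
The kappa values and verbal grades quoted in this abstract can be illustrated with a short Python sketch. It assumes unweighted Cohen's kappa for a single pair of raters and the Landis and Koch descriptive bands, which match the labels quoted above (0.343 'fair', 0.513 'moderate', 0.66 'substantial'). The abstract does not state how agreement was pooled across the nine observers, so the two-rater function, its name and the sample labels are illustrative assumptions only.

```python
from collections import Counter

def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two raters over the same cases."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Proportion of cases on which the two raters agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1.0 - expected)

def landis_koch_label(kappa):
    """Map a kappa value to the Landis and Koch descriptive band."""
    if kappa < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"

# Hypothetical fracture types assigned by two raters (not study data).
rater_1 = ["A", "A", "B", "B"]
rater_2 = ["A", "A", "B", "A"]
k = cohen_kappa(rater_1, rater_2)
print(k, landis_koch_label(k))
```

Kappa corrects raw percentage agreement for the agreement expected by chance, which is why a classification with few categories can show high raw agreement but only a modest kappa.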


Orthopaedic Proceedings
Vol. 88-B, Issue SUPP_I | Pages 187 - 187
1 Mar 2006
Maguire M Mohil R Ng A Hodgson S
Full Access

The AO, Frykman, Mayo and Fernandez classification systems for distal radius fractures were evaluated for interobserver reliability and intraobserver reproducibility using plain radiographs. Five orthopaedic consultants, five orthopaedic registrars and five orthopaedic senior house officers classified 20 sets of distal radius fractures on two separate occasions. There were 2400 individual observations. Kappa statistics were used to establish a relative level of agreement between observers for the two readings and between separate readings by the same observer. Our results for intraobserver reproducibility showed a kappa value of 0.49 for Fernandez, 0.47 for Frykman, 0.45 for Mayo and 0.33 for AO. A result of 0.4 shows good consistency according to well-recognised statistical boundaries and is significant; that is, reproducibility occurred at a level greater than chance. Interobserver kappa values were poor in all classification systems. We also sought to look at variables within grade of surgeon and developed kappa values for these also


Introduction: The purpose of this study was to evaluate the impact of volume rendering 3D computed tomography reconstructions on the inter- and intraobserver reliability of the OTA/AO and Neer classifications in the assessment of proximal humerus fractures. Material and Methods: Four observers with different levels of clinical training classified forty proximal humerus fractures according to the OTA/AO and Neer classifications. Three rounds of evaluation were performed and compared. First, fractures were classified on the basis of plain radiographs alone. Then, four weeks later, the combination of plain radiographs and computed tomography scans with conventional 3D SSD reconstructions was evaluated. Finally, four weeks later, the combination of plain radiographs, computed tomography scans, and 3D volume rendering reconstructions was assessed. These readings were repeated in a newly randomized order after an interval of twelve weeks to evaluate intraobserver reliability. Results: Interobserver reliability of the AO/ASIF classification was good with plain radiographs (k=0.65) and with two-dimensional CT scans with conventional three-dimensional (SSD) reconstructions (k=0.71). It improved to excellent when the fractures were classified on the basis of 3D volume rendering reconstructions (k=0.84). Intraobserver reliability of the OTA/AO classification was good with plain radiographs (k=0.70) and improved to excellent after adding three-dimensional SSD reconstructions (k=0.80) and three-dimensional VR reconstructions (k=0.88). Interobserver reliability of the Neer classification was poor with plain radiographs (k=0.39), moderate with two-dimensional CT scans and conventional three-dimensional (SSD) reconstructions (k=0.56), and improved to good with the addition of 3D VR scans (k=0.74). 
Intraobserver reliability of the Neer classification was poor with plain radiographs (k=0.34), good with three-dimensional SSD reconstructions (k=0.61), and excellent with three-dimensional VR reconstructions (k=0.80). Conclusion: In this study, three-dimensional volume rendering computed tomography improved the inter- and intraobserver reliability of the AO/OTA and the Neer classifications in the assessment of proximal humerus fractures. In the opinion of the authors, 3D volume rendering CT scans are a helpful tool for preoperative planning and classification of fractures of the proximal humerus


The Journal of Bone & Joint Surgery British Volume
Vol. 84-B, Issue 1 | Pages 15 - 18
1 Jan 2002
Whelan DB Bhandari M McKee MD Guyatt GH Kreder HJ Stephen D Schemitsch EH

The reliability of the radiological assessment of the healing of tibial fractures remains undetermined. We examined the inter- and intraobserver agreement of the healing of such fractures among four orthopaedic trauma surgeons who, on two separate occasions eight weeks apart, independently assessed the radiographs of 30 patients with fractures of the tibial shaft which had been treated by intramedullary fixation. The radiographs were selected from a database to represent fractures at various stages of healing. For each radiograph, the surgeon scored the degree of union, quantified the number of cortices bridged by callus or with a visible fracture line, described the extent and quality of the callus, and provided an overall rating of healing. The interobserver chance-corrected agreement using a quadratically weighted kappa (κ) statistic in which values of 0.61 to 0.80 represented substantial agreement were as follows: radiological union scale (κ = 0.60); number of cortices bridged by callus (κ = 0.75); number of cortices with a visible fracture line (κ = 0.70); the extent of the callus (κ = 0.57); and general impression of fracture healing (κ = 0.67). The intraobserver agreement of the overall impression of healing (κ = 0.89) and the number of cortices bridged by callus (κ = 0.82) or with a visible fracture line (κ = 0.83) was almost perfect. There are no validated scales which allow surgeons to grade fracture healing radiologically. Among those examined, the number of cortices bridged by bone appears to be a reliable, and easily measured radiological variable to assess the healing of fractures after intramedullary fixation
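
The quadratically weighted kappa mentioned in this abstract can be sketched in Python as follows. This is a minimal illustration for two raters scoring on an ordinal scale coded 0 to k−1, using the standard disagreement weights (i − j)² / (k − 1)²; it is not the authors' analysis code, and the function name and sample scores are hypothetical.

```python
def quadratic_weighted_kappa(ratings_a, ratings_b, n_categories):
    """Quadratically weighted kappa for two raters on an ordinal scale.

    Ratings must be integers in range(n_categories). Disagreements are
    penalised by the squared distance between the two categories, so a
    near miss costs far less than a rating at the opposite extreme.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    if n_categories < 2:
        raise ValueError("need at least two categories")
    n, k = len(ratings_a), n_categories
    # Cross-tabulate the paired ratings.
    table = [[0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        table[a][b] += 1
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    observed = expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = (i - j) ** 2 / (k - 1) ** 2
            observed += weight * table[i][j]
            expected += weight * row_totals[i] * col_totals[j] / n
    if expected == 0.0:
        return 1.0  # all ratings fall in a single category
    return 1.0 - observed / expected

# Hypothetical counts of bridged cortices scored by two raters.
print(quadratic_weighted_kappa([0, 1, 2, 2], [0, 1, 2, 1], 3))
```

Weighting suits ordinal scores such as the number of cortices bridged, where disagreeing by one cortex is a smaller error than disagreeing by three.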


Bone & Joint Research
Vol. 9, Issue 5 | Pages 242 - 249
1 May 2020
Bali K Smit K Ibrahim M Poitras S Wilkin G Galmiche R Belzile E Beaulé PE

Aims

The aim of the current study was to assess the reliability of the Ottawa classification for symptomatic acetabular dysplasia.

Methods

In all, 134 consecutive hips that underwent periacetabular osteotomy were categorized using a validated software (Hip2Norm) into four categories of normal, lateral/global, anterior, or posterior. A total of 74 cases were selected for reliability analysis, and these included 44 dysplastic and 30 normal hips. A group of six blinded fellowship-trained raters, provided with the classification system, looked at these radiographs at two separate timepoints to classify the hips using standard radiological measurements. Thereafter, a consensus meeting was held where a modified flow diagram was devised, before a third reading by four raters using a separate set of 74 radiographs took place.


Orthopaedic Proceedings
Vol. 105-B, Issue SUPP_12 | Pages 27 - 27
23 Jun 2023
Chen K Wu J Xu L Han X Chen X
Full Access

To propose a modified approach to measuring femoro-epiphyseal acetabular roof (FEAR) index while still abiding by its definition and biomechanical basis, and to compare the reliabilities of the two methods. To propose a classification for medial sourcil edges.

We retrospectively reviewed a consecutive series of patients treated with periacetabular osteotomy and/or hip arthroscopy. A modified FEAR index was defined. Lateral center-edge angle, Sharp's angle, Tonnis angle on all hips, as well as FEAR index with original and modified approaches were measured. Intra- and inter-observer reliability were calculated as intraclass correlation coefficients (ICC) for FEAR index with both approaches and other alignments. A classification was proposed to categorize medial sourcil edges. ICC for the two approaches across different sourcil groups were also calculated.

Of 411 patients reviewed, 49 were finally included. Thirty-two patients (40 hips) were identified as having borderline dysplasia, defined by an LCEA of 18 to 25 degrees. Intra-observer ICC for the modified method were good to excellent for borderline hips, poor to excellent for DDH, and moderate to excellent for normal hips. For inter-observer reliability, the modified approach outperformed the original approach, with moderate to good inter-observer reliability (DDH group, ICC=0.636; borderline dysplasia group, ICC=0.813; normal hip group, ICC=0.704). The medial sourcils were classified into three groups according to their morphology. Type II (39.0%) and III (43.9%) sourcils were the dominant patterns. The sourcil classification had substantial intra-observer agreement (observer 4, kappa=0.68; observer 1, kappa=0.799) and moderate inter-observer agreement (kappa=0.465). The modified approach to the FEAR index possessed greater inter-observer reliability in all medial sourcil patterns.

The modified FEAR index has better intra- and inter-observer reliability than the original approach. Type II and III sourcils account for the majority of cases, and only the modified approach is applicable to them.


Orthopaedic Proceedings
Vol. 105-B, Issue SUPP_4 | Pages 3 - 3
3 Mar 2023
Roy K Joshi P Ali I Shenoy P Syed A Barlow D Malek I Joshi Y
Full Access

Classifying trochlear dysplasia (TD) is useful to determine the treatment options for patients suffering from patellofemoral instability (PFI). There is no consensus on which classification system is more reliable and reproducible for guiding clinicians in the treatment of PFI. There are also concerns about the validity of the Dejour classification (DJC), which is the most widely used classification for TD, having only a fair reliability score.

The Oswestry-Bristol classification (OBC) is a recently proposed system of classification of TD and the authors report a fair-to-good interobserver agreement and good-to-excellent intra-observer agreement in the assessment of TD. The aim of this study was to compare the reliability and reproducibility of these two classifications.

6 assessors (4 consultants and 2 registrars) independently evaluated 100 magnetic resonance axial images of the patella-femoral joint for TD and classified them according to OBC and DJC. These assessments were again repeated by all raters after 4 weeks. The inter and intra-observer reliability scores were calculated using Cohen's kappa and Cronbach's alpha.

Both classifications showed good to excellent interobserver reliability with high alpha scores. The OBC showed a substantial intra-observer agreement (mean kappa 0.628; p<0.005), whereas the DJC showed a moderate agreement (mean kappa 0.572; p<0.005). There was no significant difference in the kappa values when comparing the assessments by consultants with those by registrars, in either classification system.

This large study from a non-founding institute shows both classification systems to be reliable for classifying TD based on magnetic resonance axial images of the patella-femoral joint, with the simple to use OBC having a higher intra-observer reliability score compared to the DJC.


Orthopaedic Proceedings
Vol. 85-B, Issue SUPP_III | Pages 257 - 257
1 Mar 2003
Hell Anna K Ruehmann O Peters G Lazovic D

Introduction. In Central Europe, developmental dysplasia of the hip (DDH) is diagnosed using the sonographic hip screening described by Graf. To learn the necessary standards, three courses are mandatory. However, little is known about learning curves and measurement errors of doctors at different levels of training and experience.

Material and Methods. Between 1997 and 2002, participants of the basic, advanced and final hip ultrasonography courses were evaluated by a questionnaire and 34 normal and pathological sonograms. They were asked to measure the alpha and beta angles. “Normal” angles of each hip were defined as the mean values measured by two experienced course organizers.

Results. 186 doctors (40% orthopedic surgeons, 60% pediatricians) were evaluated. The group included 20% interns, 60% residents and 20% consultants. An average time of 6.3 months lay between the basic and the advanced course, and of 16.7 months between the advanced and the final course. The evaluation of the sonograms according to Graf showed major inter-observer differences of up to 30°. Participants had more difficulty in determining a correct beta angle than an alpha angle. Sonographic pictures of poor quality and pathological hips produced more difficulties than pictures of Graf type I and II hips. In the basic course all measurements showed an average difference of 3.6°, in the advanced course of 3.1° and in the final course of 4.2°. The number of examinations between courses did not correlate with good measurements.

Conclusion. Even participants of all three courses seem to develop major systemic errors if ultrasonography is regularly applied without supervision. Therefore, regular training and supervision should be mandatory in order to guarantee good quality.


Bone & Joint Research
Vol. 4, Issue 12 | Pages 190 - 194
1 Dec 2015
Kleinlugtenbelt YV Hoekstra M Ham SJ Kloen P Haverlag R Simons MP Bhandari M Goslings JC Poolman RW Scholtes VAB

Objectives

Current studies on the additional benefit of using computed tomography (CT) in order to evaluate the surgeons’ agreement on treatment plans for fracture are inconsistent. This inconsistency can be explained by a methodological phenomenon called ‘spectrum bias’, defined as the bias inherent when investigators choose a population lacking therapeutic uncertainty for evaluation. The aim of the study is to determine the influence of spectrum bias on the intra-observer agreement of treatment plans for fractures of the distal radius.

Methods

Four surgeons evaluated 51 patients with displaced fractures of the distal radius at four time points: T1 and T2: conventional radiographs; T3 and T4: radiographs and additional CT scan (radiograph and CT). Choice of treatment plan (operative or non-operative) and therapeutic certainty (five-point scale: very uncertain to very certain) were rated. To determine the influence of spectrum bias, the intra-observer agreement was analysed, using Kappa statistics, for each degree of therapeutic certainty.


Orthopaedic Proceedings
Vol. 88-B, Issue SUPP_III | Pages 436 - 436
1 Oct 2006
Rajan RA Metcalfe J Konstantoulakis C Jones S Sprigg A

Introduction: The assessment of bone age using the standard Greulich and Pyle chart based on hand and wrist radiographs is usually carried out by Senior Radiologists. We performed a study to look at both intra- and interobserver variability with different grades of clinicians.

Materials and Methods: 30 sets of wrist radiographs were selected at random. The investigators included a Senior Radiographer, a Consultant and Registrar Radiologist, an Orthopaedic Consultant and a Senior Orthopaedic Fellow.

Discussion: The Radiology team appear to be more consistent in their readings for the assessment of skeletal bone age than the Orthopaedic team. However, it is interesting to note that although the Orthopaedic team are less consistent, the inter-observer variability suggests that both teams are equally well equipped to perform the task.

Conclusion: Our study suggests that we should not cross professional boundaries. Render unto Caesar what is Caesar’s!


Bone & Joint Research
Vol. 13, Issue 1 | Pages 19 - 27
5 Jan 2024
Baertl S Rupp M Kerschbaum M Morgenstern M Baumann F Pfeifer C Worlicek M Popp D Amanatullah DF Alt V

Aims. This study aimed to evaluate the clinical application of the PJI-TNM classification for periprosthetic joint infection (PJI) by determining intraobserver and interobserver reliability. To facilitate its use in clinical practice, an educational app was subsequently developed and evaluated. Methods. A total of ten orthopaedic surgeons classified 20 cases of PJI based on the PJI-TNM classification. Subsequently, the classification was re-evaluated using the PJI-TNM app. Classification accuracy was calculated separately for each subcategory (reinfection, tissue and implant condition, non-human cells, and morbidity of the patient). Fleiss’ kappa and Cohen’s kappa were calculated for interobserver and intraobserver reliability, respectively. Results. Overall, interobserver and intraobserver agreements were substantial across the 20 classified cases. Analyses for the variable ‘reinfection’ revealed an almost perfect interobserver and intraobserver agreement with a classification accuracy of 94.8%. The category 'tissue and implant conditions' showed moderate interobserver and substantial intraobserver reliability, while the classification accuracy was 70.8%. For 'non-human cells,' accuracy was 81.0% and interobserver agreement was moderate with an almost perfect intraobserver reliability. The classification accuracy of the variable 'morbidity of the patient' reached 73.5% with a moderate interobserver agreement, whereas the intraobserver agreement was substantial. The application of the app yielded comparable results across all subgroups. Conclusion. The PJI-TNM classification system captures the heterogeneity of PJI and can be applied with substantial inter- and intraobserver reliability. The PJI-TNM educational app aims to facilitate application in clinical practice. A major limitation was the correct assessment of the implant situation. To eliminate this, a re-evaluation according to intraoperative findings is strongly recommended. 
Cite this article: Bone Joint Res 2024;13(1):19–27


The Bone & Joint Journal
Vol. 105-B, Issue 10 | Pages 1123 - 1130
1 Oct 2023
Donnan M Anderson N Hoq M Donnan L

Aims. The aim of this study was to investigate the agreement in interpretation of the quality of the paediatric hip ultrasound examination, the reliability of geometric and morphological assessment, and the relationship between these measurements. Methods. Four investigators evaluated 60 hip ultrasounds and assessed their quality based on the standard plane of Graf et al. They measured geometric parameters, described the morphology of the hip, and assigned the Graf grade of dysplasia. They analyzed one self-selected image and one randomly selected image from the ultrasound series, and repeated the process four weeks later. The intra- and interobserver agreement, and correlations between various parameters were analyzed. Results. In the assessment of quality, there was a moderate to substantial intraobserver agreement for each element investigated, but interobserver agreement was poor. Morphological features showed weak to moderate agreement across all parameters but improved to significant when responses were reduced. The geometric measurements showed nearly perfect agreement, and the relationship between them and the morphological features showed a dose response across all parameters with moderate to substantial correlations. There were strong correlations between geometric measurements. The Graf classification showed a fair to moderate interobserver agreement, and moderate to substantial intraobserver agreement. Conclusion. This investigation into the reliability of the interpretation of hip ultrasound scans identified the difficulties in defining what is a high-quality ultrasound. We confirmed that geometric measurements are reliably interpreted and may be useful as a further measurement of quality. Morphological features are generally poorly interpreted, but a simpler binary classification considerably improves agreement.
As there is a clear dose response relationship between geometric and morphological measurements, the importance of morphology in the diagnosis of hip dysplasia should be questioned. Cite this article: Bone Joint J 2023;105-B(10):1123–1130


Bone & Joint Open
Vol. 1, Issue 7 | Pages 355 - 358
7 Jul 2020
Konrads C Gonser C Ahmad SS

Aims. The Oswestry-Bristol Classification (OBC) was recently described as an MRI-based classification tool for the femoral trochlear. The authors demonstrated better inter- and intraobserver agreement compared to the Dejour classification. As the OBC could potentially provide a very useful MRI-based grading system for trochlear dysplasia, it was the aim to determine the inter- and intraobserver reliability of the classification system from the perspective of the non-founder. Methods. Two orthopaedic surgeons independently assessed 50 MRI scans for trochlear dysplasia and classified each according to the OBC. Both observers repeated the assessments after six weeks. The inter- and intraobserver agreement was determined using Cohen’s kappa statistic and S-statistic nominal and linear weights. Results. The OBC with grading into four different trochlear forms showed excellent inter- and intraobserver agreement with a mean kappa of 0.78. Conclusion. The OBC is a simple MRI-based classification system with high inter- and intraobserver reliability. It could present a useful tool for grading the severity of trochlear dysplasia in daily practice. Cite this article: Bone Joint Open 2020;1-7:355–358


The Bone & Joint Journal
Vol. 106-B, Issue 9 | Pages 964 - 969
1 Sep 2024
Wang YC Song JJ Li TT Yang D Lv ZB Wang ZY Zhang ZM Luo Y

Aims. To propose a new method for evaluating paediatric radial neck fractures and improve the accuracy of fracture angulation measurement, particularly in younger children, and thereby facilitate planning treatment in this population. Methods. Clinical data of 117 children with radial neck fractures in our hospital from August 2014 to March 2023 were collected. A total of 50 children (26 males, 24 females, mean age 7.6 years (2 to 13)) met the inclusion criteria and were analyzed. Cases were excluded for the following reasons: Judet grade I and Judet grade IVb (> 85° angulation) classification; poor radiograph image quality; incomplete clinical information; sagittal plane angulation; severe displacement of the ulna fracture; and Monteggia fractures. For each patient, standard elbow anteroposterior (AP) view radiographs and corresponding CT images were acquired. On radiographs, Angle P (complementary to the angle between the long axis of the radial head and the line perpendicular to the physis), Angle S (complementary to the angle between the long axis of the radial head and the midline through the proximal radial shaft), and Angle U (between the long axis of the radial head and the straight line from the distal tip of the capitellum to the coronoid process) were identified as candidates approximating the true coronal plane angulation of radial neck fractures. On the coronal plane of the CT scan, the angulation of radial neck fractures (CTa) was measured and served as the reference standard for measurement. Inter- and intraobserver reliabilities were assessed by Kappa statistics and intraclass correlation coefficient (ICC). Results. Angle U showed the strongest correlation with CTa (p < 0.001). In the analysis of inter- and intraobserver reliability, Kappa values were significantly higher for Angles S and U compared with Angle P. ICC values were excellent among the three groups. Conclusion. 
Angle U on AP view was the best substitute for CTa when evaluating radial neck fractures in children. Further studies are required to validate this method. Cite this article: Bone Joint J 2024;106-B(9):964–969


The Bone & Joint Journal
Vol. 106-B, Issue 9 | Pages 898 - 906
1 Sep 2024
Kayani B Wazir MUK Mancino F Plastow R Haddad FS

Aims. The primary objective of this study was to develop a validated classification system for assessing iatrogenic bone trauma and soft-tissue injury during total hip arthroplasty (THA). The secondary objective was to compare macroscopic bone trauma and soft-tissue injury in conventional THA (CO THA) versus robotic arm-assisted THA (RO THA) using this classification system. Methods. This study included 30 CO THAs versus 30 RO THAs performed by a single surgeon. Intraoperative photographs of the osseous acetabulum and periacetabular soft tissues were obtained prior to implantation of the acetabular component, which were used to develop the proposed classification system. Interobserver and intraobserver variabilities of the proposed classification system were assessed. Results. The BOne trauma and Soft-Tissue Injury classification system in total Hip arthroplasty (BOSTI Hip) grades osseous acetabular trauma and periarticular muscle damage during THA. The classification system has an intraclass correlation coefficient of 0.90 (95% CI 0.86 to 0.93) for interobserver agreement and 0.89 (95% CI 0.84 to 0.93) for intraobserver agreement. RO THA was associated with improved BOSTI Hip scores (p = 0.002) and more pristine osseous surfaces in the anterior superior (p = 0.001) and posterior superior (p < 0.001) acetabular quadrants compared with CO THA. There were no differences between the groups in relation to injury to the gluteus medius (p = 0.084), obturator internus (p = 0.241), piriformis (p = 0.081), superior gemellus (p = 0.116), inferior gemellus (p = 0.132), quadratus femoris (p = 0.208), and vastus lateralis (p = 0.135), but overall combined muscle injury was reduced in RO THA compared with CO THA (p = 0.023). Discussion. The proposed BOSTI Hip classification provides a reproducible grading system for stratifying iatrogenic bone trauma and soft-tissue injury during THA.
RO THA was associated with improved BOSTI Hip scores, more pristine osseous acetabular surfaces, and reduced combined periarticular muscle injury compared with CO THA. Further research is required to understand if these intraoperative findings translate to differences in clinical outcomes between the treatment groups. Cite this article: Bone Joint J 2024;106-B(9):898–906


The Bone & Joint Journal
Vol. 104-B, Issue 6 | Pages 715 - 720
1 Jun 2022
Dunsmuir RA Nisar S Cruickshank JA Loughenbury PR

Aims. The aim of the study was to determine if there was a direct correlation between the pain and disability experienced by patients and size of their disc prolapse, measured by the disc’s cross-sectional area on T2 axial MRI scans. Methods. Patients were asked to prospectively complete visual analogue scale (VAS) and Oswestry Disability Index (ODI) scores on the day of their MRI scan. All patients with primary disc herniation were included. Exclusion criteria included recurrent disc herniation, cauda equina syndrome, or any other associated spinal pathology. T2-weighted MRI scans were reviewed on picture archiving and communications software. The T2 axial image showing the disc protrusion with the largest cross-sectional area was used for measurements. The area of the disc and canal were measured at this level. The size of the disc was measured as a percentage of the cross-sectional area of the spinal canal on the chosen image. The VAS leg pain and ODI scores were each correlated with the size of the disc using the Pearson correlation coefficient (PCC). Intraobserver reliability for MRI measurement was assessed using the intraclass correlation coefficient (ICC). We assessed if the position of the disc prolapse (central, lateral recess, or foraminal) altered the symptoms described by the patient. The VAS and ODI scores from central and lateral recess disc prolapses were compared. Results. A total of 56 patients (mean age 41.1 years (22.8 to 70.3)) were included. A high degree of intraobserver reliability was observed for MRI measurement: single measure ICC was 0.99 (95% confidence interval (CI) 0.97 to 0.99; p < 0.001). The PCC comparing VAS leg scores with canal occupancy for herniated disc was 0.056. The PCC comparing ODI scores with canal occupancy for herniated disc was 0.070. We found 13 disc prolapses centrally and 43 lateral recess prolapses. There were no foraminal prolapses in this group.
The position of the prolapse was not found to be related to the mean VAS score or ODI experienced by the patients (VAS, p = 0.251; ODI, p = 0.093). Conclusion. The results of the statistical analysis show that there is no direct correlation between the size or position of the disc prolapse and a patient’s symptoms. The symptoms experienced by patients should be the primary concern in deciding to perform discectomy. Cite this article: Bone Joint J 2022;104-B(6):715–720


The Bone & Joint Journal
Vol. 102-B, Issue 1 | Pages 102 - 107
1 Jan 2020
Sharma N Brown A Bouras T Kuiper JH Eldridge J Barnett A

Aims. Trochlear dysplasia is a significant risk factor for patellofemoral instability. The Dejour classification is currently considered the standard for classifying trochlear dysplasia, but numerous studies have reported poor reliability on both plain radiography and MRI. The severity of trochlear dysplasia is important to establish in order to guide surgical management. We have developed an MRI-specific classification system to assess the severity of trochlear dysplasia, the Oswestry-Bristol Classification (OBC). This is a four-part classification system comprising normal, mild, moderate, and severe to represent a normal, shallow, flat, and convex trochlear, respectively. The purpose of this study was to assess the inter- and intraobserver reliability of the OBC and compare it with that of the Dejour classification. Methods. Four observers (two senior and two junior orthopaedic surgeons) independently assessed 32 CT and axial MRI scans for trochlear dysplasia and classified each according to the OBC and the Dejour classification systems. Assessments were repeated following a four-week interval. The inter- and intraobserver agreement was determined by using Fleiss’ generalization of Cohen’s kappa statistic and S-statistic nominal and linear weights. Results. The OBC showed fair-to-good interobserver agreement and good-to-excellent intraobserver agreement (mean kappa 0.68). The Dejour classification showed poor interobserver agreement and fair-to-good intraobserver agreement (mean kappa 0.52). Conclusion. The OBC can be used to assess the severity of trochlear dysplasia. It can be applied in clinical practice to simplify and standardize surgical decision-making in patients with recurrent patella instability. Cite this article: Bone Joint J 2020;102-B(1):102–107


Bone & Joint Research
Vol. 8, Issue 8 | Pages 357 - 366
1 Aug 2019
Zhang B Sun H Zhan Y He Q Zhu Y Wang Y Luo C

Objectives. CT-based three-column classification (TCC) has been widely used in the treatment of tibial plateau fractures (TPFs). In its updated version (updated three-column concept, uTCC), a fracture morphology-based injury mechanism was proposed for effective treatment guidance. In this study, the injury mechanism of TPFs is further explained, and its inter- and intraobserver reliability is evaluated to perfect the uTCC. Methods. The radiological images of 90 consecutive TPF patients were collected. A total of 47 men (52.2%) and 43 women (47.8%) with a mean age of 49.8 years (SD 12.4; 17 to 77) were enrolled in our study. Among them, 57 fractures were on the left side (63.3%) and 33 were on the right side (36.7%); no bilateral fracture existed. Four observers were chosen to classify or estimate independently these randomized cases according to the Schatzker classification, TCC, and injury mechanism. With two rounds of evaluation, the kappa values were calculated to estimate the inter- and intraobserver reliability. Results. The overall inter- and intraobserver agreements of the injury mechanism were substantial (interobserver κ = 0.699, intraobserver κ = 0.749). The initial position and the force direction, which are two components of the injury mechanism, had substantial agreement for both inter- and intraobserver reliability. The inter- and intraobserver agreements were lower in high-energy fractures (Schatzker types IV to VI; interobserver κ = 0.605, intraobserver κ = 0.721) compared with low-energy fractures (Schatzker types I to III; interobserver κ = 0.81, intraobserver κ = 0.832). The inter- and intraobserver agreements were relatively higher in one-column fractures (interobserver κ = 0.759, intraobserver κ = 0.801) compared with two-column and three-column fractures. Conclusion. The complete theory of injury mechanism of TPFs was first put forward to make the TCC consummate. It demonstrates substantial inter- and intraobserver agreement generally.
Furthermore, the injury mechanism can be promoted clinically. Cite this article: B-B. Zhang, H. Sun, Y. Zhan, Q-F. He, Y. Zhu, Y-K. Wang, C-F. Luo. Reliability and repeatability of tibial plateau fracture assessment with an injury mechanism-based concept. Bone Joint Res 2019;8:357–366. DOI: 10.1302/2046-3758.88.BJR-2018-0331.R1


The Bone & Joint Journal
Vol. 103-B, Issue 8 | Pages 1345 - 1350
1 Aug 2021
Czubak-Wrzosek M Nitek Z Sztwiertnia P Czubak J Grzelecki D Kowalczewski J Tyrakowski M

Aims. The aim of the study was to compare two methods of calculating pelvic incidence (PI) and pelvic tilt (PT), either by using the femoral heads or acetabular domes to determine the bicoxofemoral axis, in patients with unilateral or bilateral primary hip osteoarthritis (OA). Methods. PI and PT were measured on standing lateral radiographs of the spine in two groups: 50 patients with unilateral (Group I) and 50 patients with bilateral hip OA (Group II), using the femoral heads or acetabular domes to define the bicoxofemoral axis. Agreement between the methods was determined by intraclass correlation coefficient (ICC) and the standard error of measurement (SEm). The intraobserver reproducibility and interobserver reliability of the two methods were analyzed on 31 radiographs in both groups to calculate ICC and SEm. Results. In both groups, excellent agreement between the two methods was obtained, with ICC of 0.99 and SEm 0.3° for Group I, and ICC 0.99 and SEm 0.4° for Group II. The intraobserver reproducibility was excellent for both methods in both groups, with an ICC of at least 0.97 and SEm not exceeding 0.8°. The study also revealed excellent interobserver reliability for both methods in both groups, with ICC 0.99 and SEm 0.5° or less. Conclusion. Either the femoral heads or acetabular domes can be used to define the bicoxofemoral axis on the lateral standing radiographs of the spine for measuring PI and PT in patients with idiopathic unilateral or bilateral hip OA. Cite this article: Bone Joint J 2021;103-B(8):1345–1350


The Bone & Joint Journal
Vol. 103-B, Issue 8 | Pages 1339 - 1344
1 Aug 2021
Jain S Mohrir G Townsend O Lamb JN Palan J Aderinto J Pandit H

Aims. The aim of this study was to assess the reliability and validity of the Unified Classification System (UCS) for postoperative periprosthetic femoral fractures (PFFs) around cemented polished taper-slip (PTS) stems. Methods. Radiographs of 71 patients with a PFF admitted consecutively at two centres between 25 February 2012 and 19 May 2020 were collated by an independent investigator. Six observers (three hip consultants and three trainees) were familiarized with the UCS. Each PFF was classified on two separate occasions, with a mean time between assessments of 22.7 days (16 to 29). Interobserver reliability for more than two observers was assessed using percentage agreement and Fleiss’ kappa statistic. Intraobserver reliability between two observers was calculated with the Cohen kappa statistic. Validity was tested on surgically managed UCS type B PFFs where stem stability was documented in operation notes (n = 50). Validity was assessed using percentage agreement and the Cohen kappa statistic between radiological assessment and intraoperative findings. Kappa statistics were interpreted using Landis and Koch criteria. All six observers were blinded to operation notes and postoperative radiographs. Results. Interobserver reliability percentage agreement was 58.5% and the overall kappa value was 0.442 (moderate agreement). Lowest kappa values were seen for type B fractures (0.095 to 0.360). The mean intraobserver reliability kappa value was 0.672 (0.447 to 0.867), indicating substantial agreement. Validity percentage agreement was 65.7% and the mean kappa value was 0.300 (0.160 to 0.440), indicating only fair agreement. Conclusion. This study demonstrates that the UCS is unsatisfactory for the classification of PFFs around PTS stems, and that it has considerably lower reliability and validity than previously described for other stem types.
Radiological PTS stem loosening in the presence of PFF is poorly defined and formal intraoperative testing of stem stability is recommended. Cite this article: Bone Joint J 2021;103-B(8):1339–1344
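Fleiss' kappa, named in the methods above for agreement among more than two observers, can be sketched in a few lines of Python. This is a generic textbook formulation under the assumption that every case is rated by the same number of observers; it is not the authors' analysis code, and the function name and sample counts are hypothetical.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table where counts[i][j] is the number of
    observers who assigned case i to category j."""
    n_cases = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("every case needs the same number (>= 2) of ratings")
    n_categories = len(counts[0])
    # Per-case agreement: fraction of observer pairs that chose the same category.
    per_case = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(per_case) / n_cases
    # Chance agreement from the overall share of ratings in each category.
    shares = [
        sum(row[j] for row in counts) / (n_cases * n_raters)
        for j in range(n_categories)
    ]
    p_expected = sum(s * s for s in shares)
    if p_expected == 1.0:
        return 1.0  # all ratings fell in a single category
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical table: two cases, six observers, two categories (not study data).
split_votes = fleiss_kappa([[3, 3], [3, 3]])
full_agreement = fleiss_kappa([[6, 0], [0, 6]])
```

With six observers, as in this study, each case contributes 15 observer pairs to the per-case agreement term.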


The Bone & Joint Journal
Vol. 106-B, Issue 1 | Pages 19 - 27
1 Jan 2024
Tang H Guo S Ma Z Wang S Zhou Y

Aims. The aim of this study was to evaluate the reliability and validity of a patient-specific algorithm which we developed for predicting changes in sagittal pelvic tilt after total hip arthroplasty (THA). Methods. This retrospective study included 143 patients who underwent 171 THAs between April 2019 and October 2020 and had full-body lateral radiographs preoperatively and at one year postoperatively. We measured the pelvic incidence (PI), the sagittal vertical axis (SVA), pelvic tilt, sacral slope (SS), lumbar lordosis (LL), and thoracic kyphosis to classify patients into types A, B1, B2, B3, and C. The change of pelvic tilt was predicted according to the normal range of SVA (0 mm to 50 mm) for types A, B1, B2, and B3, and based on the absolute value of one-third of the PI-LL mismatch for type C patients. The reliability of the classification of the patients and the prediction of the change of pelvic tilt were assessed using kappa values and intraclass correlation coefficients (ICCs), respectively. Validity was assessed using the overall mean error and mean absolute error (MAE) for the prediction of the change of pelvic tilt. Results. The kappa values were 0.927 (95% confidence interval (CI) 0.861 to 0.992) and 0.945 (95% CI 0.903 to 0.988) for the inter- and intraobserver reliabilities, respectively, and the ICCs ranged from 0.919 to 0.997. The overall mean error and MAE for the prediction of the change of pelvic tilt were -0.3° (SD 3.6°) and 2.8° (SD 2.4°), respectively. The overall absolute change of pelvic tilt was 5.0° (SD 4.1°). Pre- and postoperative values and changes in pelvic tilt, SVA, SS, and LL varied significantly among the five types of patient. Conclusion. We found that the proposed algorithm was reliable and valid for predicting the standing pelvic tilt after THA. Cite this article: Bone Joint J 2024;106-B(1):19–27
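The two validity measures named above, overall mean error and mean absolute error (MAE), differ in how they treat the sign of each prediction error. The following Python sketch shows the distinction under the assumption that error is defined as predicted minus measured change; the function name and sample values are hypothetical and are not study data.

```python
def prediction_errors(predicted, measured):
    """Return (mean error, mean absolute error) for paired values in degrees."""
    if len(predicted) != len(measured) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [p - m for p, m in zip(predicted, measured)]
    mean_error = sum(errors) / len(errors)  # signed: over- and under-predictions cancel
    mean_abs_error = sum(abs(e) for e in errors) / len(errors)  # magnitude only
    return mean_error, mean_abs_error


# Hypothetical predicted vs measured changes in pelvic tilt (degrees).
mean_error, mae = prediction_errors([5.0, -2.0, 3.0], [4.0, 0.0, 2.0])
```

A mean error near zero alongside a larger MAE, as reported here (-0.3° versus 2.8°), indicates little systematic bias while individual predictions still deviate by a few degrees.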


Orthopaedic Proceedings
Vol. 104-B, Issue SUPP_11 | Pages 34 - 34
1 Nov 2022
Haleem S Malik M Azzopardi C Botchu R Marks D

Abstract. Purpose. Intracanal rib head penetration is a well-known entity in dystrophic scoliotic curves in neurofibromatosis type 1. There is potential for spinal cord injury if this is not recognised and managed appropriately. No CT-based classification system is currently in use to quantify rib head penetration. This study aims to propose and evaluate a novel CT-based classification for rib head penetration, primarily for neurofibromatosis but which can also be utilised in other conditions of rib head penetration. Materials and methods. The grading was developed as four grades: Grade 0, normal rib head (RH) position; Grade 1, subluxed extracanal RH position; Grade 2, RH at pedicle; Grade 3, intracanal RH. Grade 3 was further classified depending on the head position in the canal divided into thirds: Grade 3A, rib head penetration into the proximal third (from the ipsilateral side); Grade 3B, into the middle third; Grade 3C, into the distal third. Seventy-five axial CT images of neurofibromatosis type 1 patients in the paediatric age group were reviewed by a radiologist and a spinal surgeon independently to assess interobserver and intraobserver agreement of the novel CT classification. Agreement analysis was performed using the weighted Kappa statistic. Results. There was substantial interobserver correlation with mean Kappa score (k = 0.8, 95% CI 0.7–0.9) and near perfect intraobserver Kappa of 1.0 (95% CI 0.9–1.0) and 0.9 (95% CI 0.9–1.0) for the two readers. Conclusion. The novel CT-based classification quantifies rib head penetration, which aids in management planning.


Orthopaedic Proceedings
Vol. 105-B, Issue SUPP_14 | Pages 8 - 8
10 Oct 2023
Leow J Oliver W Bell K Molyneux S Clement N Duckworth A

The aim was to develop a reliable and effective radiological score to assess the healing of isolated ulnar shaft fractures (IUSF), the Radiographic Union Score for Ulna fractures (RUSU). Initially, 20 patients with radiographs six weeks following a non-operatively managed ulnar shaft fracture were selected and scored by three blinded observers. After intraclass correlation (ICC) analysis, a second group of 54 patients with radiographs six weeks after injury (18 who developed a nonunion and 36 who united) were scored by the same observers. In the initial study, interobserver and intraobserver ICC were 0.89 and 0.93, respectively. In the validation study the interobserver ICC was 0.85. The median score for patients who united was significantly higher than for those who developed a nonunion (11 vs 7, p < 0.001). An ROC curve demonstrated that a RUSU ≤8 had a sensitivity of 88.9% and specificity of 86.1% in identifying patients at risk of nonunion. Patients with a RUSU ≤8 (n = 21) were more likely to develop a nonunion (n = 16/21) than those with a RUSU ≥9 (n = 2/33; OR 49.6, 95% CI 8.6–284.7). Based on a PPV of 76%, if all patients with a RUSU ≤8 underwent fixation at six weeks, the number of procedures needed to avoid one nonunion would be 1.3. The RUSU shows good interobserver and intraobserver reliability and is effective in identifying patients at risk of nonunion six weeks after fracture. This tool requires external validation but may enhance the management of patients with isolated ulnar shaft fractures.
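The diagnostic figures in this abstract can be rebuilt from the 2×2 counts it reports: 16 of 21 patients with a RUSU ≤8 and 2 of 33 with a RUSU ≥9 developed a nonunion. The Python sketch below does that arithmetic. Two assumptions are not stated in the abstract: the confidence interval uses the log odds ratio (Woolf) method, and the number of procedures per nonunion avoided is taken as 1/PPV, which presumes fixation prevents every nonunion. The function and variable names are hypothetical.

```python
import math


def two_by_two_summary(tp, fp, fn, tn, z=1.96):
    """Diagnostic summary from 2x2 counts; z=1.96 gives a 95% interval."""
    odds_ratio = (tp * tn) / (fp * fn)
    # Woolf (log odds ratio) standard error: an assumed method, not confirmed by the source.
    se_log_or = math.sqrt(1 / tp + 1 / fp + 1 / fn + 1 / tn)
    log_or = math.log(odds_ratio)
    return {
        "sensitivity": tp / (tp + fn),  # nonunions flagged by RUSU <= 8
        "specificity": tn / (tn + fp),  # unions with RUSU >= 9
        "ppv": tp / (tp + fp),          # nonunions among RUSU <= 8
        "odds_ratio": odds_ratio,
        "or_ci": (math.exp(log_or - z * se_log_or),
                  math.exp(log_or + z * se_log_or)),
    }


# Counts derived from the abstract: RUSU <= 8 is the "test positive" group.
rusu = two_by_two_summary(tp=16, fp=5, fn=2, tn=31)
# Assumes every fixation in a patient heading for nonunion prevents it.
procedures_per_nonunion_avoided = 1 / rusu["ppv"]
```

Under these assumptions the sketch reproduces the reported sensitivity, specificity, PPV, odds ratio, confidence interval and 1.3 procedures per nonunion avoided, to rounding.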


The Bone & Joint Journal
Vol. 105-B, Issue 6 | Pages 696 - 701
1 Jun 2023
Kurisunkal V Morris G Kaneuchi Y Bleibleh S James S Botchu R Jeys L Parry MC

Aims. Intra-articular (IA) tumours around the knee are treated with extra-articular (EA) resection, which is associated with poor functional outcomes. We aim to evaluate the accuracy of MRI in predicting IA involvement around the knee. Methods. We identified 63 cases of high-grade sarcomas in or around the distal femur that underwent an EA resection from a prospectively maintained database (January 1996 to April 2020). Suspicion of IA disease was noted in 52 cases, six had IA pathological fracture, two had an effusion, two had prior surgical intervention (curettage/IA intervention), and one had an osseous metastasis in the proximal tibia. To ascertain validity, two musculoskeletal radiologists (R1, R2) reviewed the preoperative imaging (MRI) of 63 consecutive cases on two occasions six weeks apart. The radiological criteria for IA disease comprised evidence of tumour extension within the suprapatellar pouch, intercondylar notch, extension along medial/lateral retinaculum, and presence of IA fracture. The radiological predictions were then confirmed with the final histopathology of the resected specimens. Results. The resection histology revealed 23 cases (36.5%) showing IA disease involvement compared with 40 cases without (63.5%). The intraobserver agreement for R1 was κ = 0.85 (p < 0.001), compared with κ = 0.21 (p = 0.007) for R2. The interobserver agreement was κ = 0.264 (p = 0.003). Knee effusion was found to be the most sensitive indicator of IA involvement, with a sensitivity of 91.3% but specificity of only 35%. However, when combined with a pathological fracture, specificity rose to 97.5%, and to 100% when disease was visible in Hoffa’s fat pad. Conclusion. MRI can sometimes overestimate IA joint involvement and needs to be correlated with clinical signs. In the light of our findings, we would recommend EA resections when imaging shows effusion combined with either disease in Hoffa’s fat pad or retinaculum, or pathological fractures.
Cite this article: Bone Joint J 2023;105-B(6):696–701
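
The sensitivity and specificity quoted for knee effusion can be reproduced with a short calculation. The sketch below is illustrative only: the abstract does not report the underlying counts, so the four values passed in are assumptions back-calculated from the 23 cases with IA disease, the 40 cases without, and the reported 91.3% and 35%.

```python
def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int) -> tuple[float, float]:
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one diseased and one non-diseased case")
    sensitivity = tp / (tp + fn)  # true positives / all cases with disease
    specificity = tn / (tn + fp)  # true negatives / all cases without disease
    return sensitivity, specificity


# Assumed counts (not stated in the abstract): 21 of 23 IA-positive cases
# showed an effusion, and 14 of 40 IA-negative cases showed none.
sens, spec = sensitivity_specificity(tp=21, fn=2, tn=14, fp=26)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```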


Orthopaedic Proceedings
Vol. 106-B, Issue SUPP_16 | Pages 82 - 82
19 Aug 2024
Courington R Ferreira R Shaath MK Green C Langford J Haidukewych G
Full Access

When treating periprosthetic femur fractures (PPFFs) around total hip arthroplasty (THA), determining implant fixation status preoperatively is important, since this guides treatment regarding ORIF versus revision. The purpose of this study was to determine the accuracy of preoperative implant fixation status determination utilizing plain films and CT scans. Twenty-four patients who underwent surgery for Vancouver B type PPFF were included in the study. Two joint surgeons and two traumatologists reviewed plain films alone and made a judgment on fixation status. They then reviewed CT scans and fixation status was reassessed. Concordance and discordance were recorded. Interobserver reliability was assessed using Kendall's W and intraobserver reliability was assessed using Cohen's Kappa. Ultimately, the “correct” response was determined by intraoperative findings, as we routinely test the component intraoperatively. Fifteen implants were found to be well-fixed (63%) and 9 were loose. Plain radiographs alone predicted correct fixation status in 53% of cases. When adding the CT data, the correct prediction only improved to 55%. Interestingly, concordance between plain radiographs and CT was noted in 82%. In concordant cases, the fixation status was found to be correct in 55% of cases. Of the 18% of cases with discordance, plain films were correct in 43% of cases, and the CT was correct in 57%. Interobserver reliability demonstrated poor agreement on plain films and moderate agreement on CT. Intraobserver reliability demonstrated moderate agreement on both plain films and CT. The ability to determine fixation status for proximal PPFFs around uncemented femoral components remains challenging. The addition of routine CT scanning did not significantly improve accuracy. We recommend careful intraoperative testing of femoral component fixation with surgical dislocation if necessary, and the surgeon should be prepared to revise or fix the fracture based on those findings.


Bone & Joint Open
Vol. 4, Issue 4 | Pages 262 - 272
11 Apr 2023
Batailler C Naaim A Daxhelet J Lustig S Ollivier M Parratte S

Aims. The impact of a diaphyseal femoral deformity on knee alignment varies according to its severity and localization. The aims of this study were to determine a method of assessing the impact of diaphyseal femoral deformities on knee alignment for the varus knee, and to evaluate the reliability and the reproducibility of this method in a large cohort of osteoarthritic patients. Methods. All patients who underwent a knee arthroplasty from 2019 to 2021 were included. Exclusion criteria were genu valgus, flexion contracture (> 5°), previous femoral osteotomy or fracture, total hip arthroplasty, and femoral rotational disorder. A total of 205 patients met the inclusion criteria. The mean age was 62.2 years (SD 8.4). The mean BMI was 33.1 kg/m² (SD 5.5). The radiological measurements were performed twice by two independent reviewers, and included hip knee ankle (HKA) angle, mechanical medial distal femoral angle (mMDFA), anatomical medial distal femoral angle (aMDFA), femoral neck shaft angle (NSA), femoral bowing angle (FBow), the distance between the knee centre and the top of the FBow (DK), and the angle representing the FBow impact on the knee (C’KS angle). Results. The FBow impact on the mMDFA can be measured by the C’KS angle. The C’KS angle took the localization (length DK) and the importance (FBow angle) of the FBow into consideration. The mean FBow angle was 4.4° (SD 2.4; 0 to 12.5). The mean C’KS angle was 1.8° (SD 1.1; 0 to 5.8). Overall, 84 knees (41%) had a severe FBow (> 5°). The radiological measurements showed very good to excellent intraobserver and interobserver agreements. The C’KS increased significantly when the length DK decreased and the FBow angle increased (p < 0.001). Conclusion. The impact of the diaphyseal femoral deformity on the mechanical femoral axis is measured by the C’KS angle, a reliable and reproducible measurement. Cite this article: Bone Jt Open 2023;4(4):262–272


The Bone & Joint Journal
Vol. 102-B, Issue 8 | Pages 1041 - 1047
1 Aug 2020
Hamoodi Z Singh J Elvey MH Watts AC

Aims. The Wrightington classification system of fracture-dislocations of the elbow divides these injuries into six subtypes depending on the involvement of the coronoid and the radial head. The aim of this study was to assess the reliability and reproducibility of this classification system. Methods. This was a blinded study using radiographs and CT scans of 48 consecutive patients managed according to the Wrightington classification system between 2010 and 2018. Four trauma and orthopaedic consultants, two post CCT fellows, and one speciality registrar based in the UK classified the injuries. The seven observers reviewed preoperative radiographs and CT scans twice, with a minimum four-week interval. Radiographs and CT scans were reviewed separately. Inter- and intraobserver reliability were calculated using Fleiss and Cohen kappa coefficients. The Landis and Koch criteria were used to interpret the strength of the kappa values. Validity was assessed by calculating the percentage agreement against intraoperative findings. Results. Of the 48 patients, three (6%) had type A injury, 11 (23%) type B, 16 (33%) type B+, 16 (33%) Type C, two (4%) type D+, and none had a type D injury. All 48 patients had anteroposterior (AP) and lateral radiographs, 44 had 2D CT scans, and 39 had 3D reconstructions. The interobserver reliability kappa value was 0.52 for radiographs, 0.71 for 2D CT scans, and 0.73 for a combination of 2D and 3D reconstruction CT scans. The median intraobserver reliability was 0.75 (interquartile range (IQR) 0.62 to 0.79) for radiographs, 0.77 (IQR 0.73 to 0.94) for 2D CT scans, and 0.89 (IQR 0.77 to 0.93) for the combination of 2D and 3D reconstruction. Validity analysis showed that accuracy significantly improved when using CT scans (p = 0.018 and p = 0.028 respectively). Conclusion. The Wrightington classification system is a reliable and valid method of classifying fracture-dislocations of the elbow. 
CT scans are significantly more accurate than radiographs when identifying the pattern of injury, with good intra- and interobserver reproducibility. Cite this article: Bone Joint J 2020;102-B(8):1041–1047
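
Several abstracts in this listing report kappa coefficients interpreted with the Landis and Koch criteria. As a minimal sketch, assuming two readings of the same cases by one observer (the intraobserver setting), Cohen's kappa and its Landis and Koch label could be computed as follows. The function names and the example ratings are hypothetical; the multi-rater Fleiss kappa used for interobserver agreement is not covered here.

```python
from collections import Counter


def cohen_kappa(reading_1, reading_2):
    """Cohen's kappa for two equally long sequences of categorical ratings."""
    if len(reading_1) != len(reading_2) or not reading_1:
        raise ValueError("readings must be non-empty and of equal length")
    n = len(reading_1)
    # Observed agreement: share of cases given the same category twice.
    observed = sum(a == b for a, b in zip(reading_1, reading_2)) / n
    # Chance agreement: product of marginal proportions, summed over categories.
    counts_1, counts_2 = Counter(reading_1), Counter(reading_2)
    expected = sum(counts_1[c] * counts_2[c] for c in counts_1) / (n * n)
    if expected == 1.0:
        return 1.0  # both readings use a single identical category
    return (observed - expected) / (1.0 - expected)


def landis_koch(kappa):
    """Landis and Koch strength-of-agreement label for a kappa value."""
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical ratings of four cases on two occasions (not study data).
first = ["A", "A", "B", "B"]
second = ["A", "B", "B", "B"]
k = cohen_kappa(first, second)
print(k, landis_koch(k))
```

Applied to the values quoted above, the same labelling gives "moderate" for 0.52 and "substantial" for 0.71 and 0.73.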


Bone & Joint Open
Vol. 3, Issue 5 | Pages 423 - 431
1 May 2022
Leong JWY Singhal R Whitehouse MR Howell JR Hamer A Khanduja V Board TN

Aims. The aim of this modified Delphi process was to create a structured Revision Hip Complexity Classification (RHCC) which can be used as a tool to help direct multidisciplinary team (MDT) discussions of complex cases in local or regional revision networks. Methods. The RHCC was developed with the help of a steering group and an invitation through the British Hip Society (BHS) to members to apply, forming an expert panel of 35. We ran a mixed-method modified Delphi process (three rounds of questionnaires and one virtual meeting). Round 1 consisted of identifying the factors that govern the decision-making and complexities, with weighting given to factors considered most important by experts. Participants were asked to identify classification systems where relevant. Rounds 2 and 3 focused on grouping each factor into H1, H2, or H3, creating a hierarchy of complexity. This was followed by a virtual meeting in an attempt to achieve consensus on the factors which had not achieved consensus in preceding rounds. Results. The expert group achieved strong consensus in 32 out of 36 factors following the Delphi process. The RHCC used the existing Paprosky (acetabulum and femur), Unified Classification System, and American Society of Anesthesiologists (ASA) classification systems. Patients with ASA grade III/IV are recognized with a qualifier of an asterisk added to the final classification. The classification has good intraobserver and interobserver reliability with Kappa values of 0.88 to 0.92 and 0.77 to 0.85, respectively. Conclusion. The RHCC has been developed through a modified Delphi technique. RHCC will provide a framework to allow discussion of complex cases as part of a local or regional hip revision MDT. We believe that adoption of the RHCC will provide a comprehensive and reproducible method to describe each patient’s case with regard to surgical complexity, in addition to medical comorbidities that may influence their management. 
Cite this article: Bone Jt Open 2022;3(5):423–431


Orthopaedic Proceedings
Vol. 104-B, Issue SUPP_1 | Pages 31 - 31
1 Jan 2022
Haleem S Malik M Guduri V Azzopardi C James S Botchu R
Full Access

Abstract. Purpose. No clinical CT based classification system is currently in use for Lumbar Foraminal Stenosis. MRI scanners are not easily available, are expensive and may be contraindicated in an increasing number of patients. This study aims to propose and evaluate the reproducibility of a novel CT based classification for lumbar foraminal stenosis. Materials and Methods. The grading was developed as 4 grades. Normal foramen – Grade 0, Anteroposterior (AP)/Superoinferior (SI) (single plane) fat compression – Grade 1, Both AP/SI compression (two planes) – Grade 2 (both AP and SI) without distortion of nerve root, Grade 2 with distortion of nerve root – Grade 3. 800 lumbar foramina of a cohort of 100 random patients over the age of 60 who had undergone both CT and MRI scans were reviewed by two radiologists independently to assess agreement of the novel CT classification against the MRI based grading system of Lee et al. Interobserver (n=400) and intraobserver (n=160) agreement was also evaluated. Agreement analysis was performed using the Weighted Kappa statistic. Results. 100 patients (M:F = 45:55) with a mean age of 68.5 years (range 60 – 83 years) were included in the study. The duration between CT and MRI scans was 98 days (range 0 – 540, SD 108). There was good correlation between CT and MRI with Kappa scores (k=0.81) and intraobserver Kappa of 0.89 and 0.98 for the two readers. Conclusion. The novel CT based classification correlates well with the MRI grading system and can safely and accurately replace it where required.


Bone & Joint Open
Vol. 2, Issue 10 | Pages 858 - 864
18 Oct 2021
Guntin J Plummer D Della Valle C DeBenedetti A Nam D

Aims. Prior studies have identified that malseating of a modular dual mobility liner can occur, with previous reported incidences between 5.8% and 16.4%. The aim of this study was to determine the incidence of malseating in dual mobility implants at our institution, assess for risk factors for liner malseating, and investigate whether liner malseating has any impact on clinical outcomes after surgery. Methods. We retrospectively reviewed the radiographs of 239 primary and revision total hip arthroplasties with a modular dual mobility liner. Two independent reviewers assessed radiographs for each patient twice for evidence of malseating, with a third observer acting as a tiebreaker. Univariate analysis was conducted to determine risk factors for malseating with Youden’s index used to identify cut-off points. Cohen’s kappa test was used to measure interobserver and intraobserver reliability. Results. In all, 12 liners (5.0%), including eight Stryker (6.8%) and four Zimmer Biomet (3.3%), had radiological evidence of malseating. Interobserver reliability was found to be 0.453 (95% confidence interval (CI) 0.26 to 0.64), suggesting weak inter-rater agreement, with strong agreement being greater than 0.8. We found component size of 50 mm or less to be associated with liner malseating on univariate analysis (p = 0.031). Patients with malseated liners appeared to have no associated clinical consequences, and none required revision surgery at a mean of 14 months (1.4 to 99.2) postoperatively. Conclusion. The incidence of liner malseating was 5.0%, which is similar to other reports. Component size of 50 mm or smaller was identified as a risk factor for malseating. Surgeons should be aware that malseating can occur and implant design changes or changes in instrumentation should be considered to lower the risk of malseating. Although further follow-up is needed, it remains to be seen if malseating is associated with any clinical consequences. 
Cite this article: Bone Jt Open 2021;2(10):858–864


Orthopaedic Proceedings
Vol. 106-B, Issue SUPP_16 | Pages 81 - 81
19 Aug 2024
Angelomenos V Shareghi B Itayem R Mohaddes M
Full Access

Early micromotion of hip implants measured with radiostereometric analysis (RSA) is a predictor for late aseptic loosening. Computed Tomography Radiostereometric Analysis (CT-RSA) can be used to determine implant micro-movements using low-dose CT scans. CT-RSA enables a non-invasive measurement of implants. We evaluated the precision of CT-RSA in measuring early stem migration. Standard marker-based RSA was used as reference. We hypothesised that CT-RSA can be used as an alternative to RSA in assessing implant micromotions. We included 31 patients undergoing Total Hip Arthroplasty (THA). Distal femoral stem migration at 1 year was measured with both RSA and CT-RSA. Comparison of the two methods was performed with paired-analysis and Bland-Altman plots. Furthermore, the inter- and intraobserver reliability of the CT-RSA method was evaluated. No statistical difference was found between RSA and CTMA measurements. The Bland-Altman plots showed good agreement between marker-based RSA and CT-RSA. The intra- and interobserver reliability of the CT-RSA method was found to be excellent (≥0.992). CT-RSA is comparable to marker-based RSA in measuring distal femoral stem migration. CTMA can be used as an alternative method to detect early implant migration
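
The Bland-Altman comparison mentioned above summarises paired differences as a mean bias with 95% limits of agreement (bias ± 1.96 SD of the differences). A minimal sketch follows, assuming paired migration measurements in millimetres; the function name and the numbers are hypothetical and are not taken from the study.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)   # systematic difference between the two methods
    sd = stdev(diffs)    # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical stem migration values (mm) for the same patients.
rsa = [1.0, 2.0, 3.0, 4.0]
ct_rsa = [0.9, 2.1, 2.9, 4.1]
bias, lower, upper = bland_altman(rsa, ct_rsa)
print(f"bias = {bias:.3f} mm, limits of agreement = {lower:.3f} to {upper:.3f} mm")
```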


Bone & Joint Research
Vol. 6, Issue 9 | Pages 530 - 534
1 Sep 2017
Krakow L Klockow A Roehner E Brodt S Eijer H Bossert J Matziolis G

Objectives. The determination of the volumetric polyethylene wear on explanted material requires complicated equipment, which is not available in many research institutions. Our aim in this study was to present and validate a method that only requires a set of polyetheretherketone balls and a laboratory balance to determine wear. Methods. The insert to be measured was placed on a balance, and a ball of the appropriate diameter was inserted. The cavity remaining between the ball and insert caused by wear was filled with contrast medium and the weight of the contrast medium was recorded. The volume was calculated from the known density of the liquid. The precision, inter- and intraobserver reliability, were determined by four investigators on four days using nine inserts with specified wear (0.094 ml to 1.626 ml), and the intra-class correlation coefficient was calculated. The feasibility of using this method in routine clinical practice and the time required for measurement were tested on 84 explanted inserts by one investigator. Results. In order to get the mean for all investigators and determinations, the deviation between the measured and specified wear was -0.08 ml (SD 0.12; -0.21 to 0.11). The interobserver reliability was 0.989 ml (95% confidence interval (CI) 0.964 to 0.997) and the intraobserver reliability was 0.941 for observer 1 (95% CI 0.846 to 0.985), 0.983 for observer 2 (95% CI 0.956 to 0.995), 0.939 for observer 3 (95% CI 0.855 to 0.984), and 0.934 for observer 4 (95% CI 0.790 to 0.984). The mean time required to examine the samples was two minutes (SD 2; 1 to 5). Conclusion. The method presented here was shown to be sufficiently precise for many settings and is a cost-effective and quick method of determining the volumetric wear of explanted acetabular components. However, the measurement of wear for scientific purposes will probably continue to involve more accurate and dedicated laboratory equipment. 
Cite this article: Bone Joint Res 2017;6:530–534
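
The volume calculation in this method is the mass of contrast medium divided by its density. A minimal sketch of that step is shown below; the function name and the density value are assumptions for illustration, since the abstract does not state which contrast medium was used or its density.

```python
def wear_volume_ml(contrast_mass_g: float, density_g_per_ml: float) -> float:
    """Volume of the wear cavity from the weighed contrast medium."""
    if density_g_per_ml <= 0:
        raise ValueError("density must be positive")
    if contrast_mass_g < 0:
        raise ValueError("mass cannot be negative")
    return contrast_mass_g / density_g_per_ml


# Assumed density of the contrast medium (hypothetical, not from the abstract).
ASSUMED_DENSITY_G_PER_ML = 1.35

volume = wear_volume_ml(contrast_mass_g=1.35, density_g_per_ml=ASSUMED_DENSITY_G_PER_ML)
print(f"wear volume = {volume:.3f} ml")
```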


Orthopaedic Proceedings
Vol. 104-B, Issue SUPP_7 | Pages 9 - 9
1 Jul 2022
Fleming T Torrie A Murphy T Dodds A Engelke D Curwen C Gosal H Pegrum J
Full Access

Abstract. INTRODUCTION. COVID-19 reduced availability of cross-sectional imaging, prompting the need to clinically justify pre-operative computed tomography (CT) in tibial plateau fractures (TPF). The study purpose was to establish to what extent does a CT alter the pre-operative plan in TPF compared to radiographs. There is a current paucity of evidence assessing its impact on surgical planning. METHODOLOGY. 50 consecutive TPF with preoperative CT were assessed by 4 consultant surgeons. Anonymised radiographs were assessed defining the column classification, planned setup, approach, and fixation technique. At a 1-month interval, randomised matched CT scans were assessed and the same data collected. A tibial plateau disruption score (TPDS) was derived for all 4 quadrants (no injury=0,split=1,split/depression=2 and depression=3). Radiograph and CT TPDS were assessed using an unpaired T-test. RESULTS. 26 female and 24 male patients, mean age 50.3, were included. Mean TPDS on radiographs and CT scans were 2.77 and 3.17 respectively. A significant higher net CT TPDS was observed of 0.4 (95%CI 0.10-0.71)[P=0.0093]. Both radiograph and CT TPDS ANOVA were significant (P<0.0001), showing high intraobserver variability for TPF classification. Fracture apex requiring fixation changed in 34% of cases between the radiographs and CT, whilst set-up and surgical approach changed in 27% and 28.5% of cases respectively. All surgeons agreed no CT was required in only 11 out of 50 cases. CONCLUSION. CT scanning in TPF significantly affects the classification, setup, approach and fixation technique when compared to radiographs alone and can justifiably be requested as part of pre-operative planning


Orthopaedic Proceedings
Vol. 104-B, Issue SUPP_6 | Pages 4 - 4
1 Jun 2022
Hoban K Downie S Adamson D MacLean J Cool P Jariwala AC
Full Access

Mirels’ score predicts the likelihood of sustaining pathological fractures using pain, lesion site, size and morphology. The aim is to investigate its reproducibility, reliability and accuracy in upper limb bony metastases and validate its use in pathological fracture prediction. A retrospective cohort study of patients with upper limb metastases, referred to an Orthopaedic Trauma Centre (2013–18). Mirels’ was calculated in 32 patients; plain radiographs at presentation scored by 6 raters. Radiological aspects were scored twice by each rater, 2-weeks apart. Inter- and intra-observer reliability were calculated (Fleiss’ kappa test). Bland-Altman plots compared variances of individual score components & total Mirels’ score. Mirels’ score of ≥9 did not accurately predict lesions that would fracture (11% 5/46 vs 65.2% Mirels’ score ≤8, p<0.0001). Sensitivity was 14.3% & specificity was 72.7%. When Mirels’ cut-off was lowered to ≥7, patients were more likely to fracture (48% 22/46 versus 28% 13/46, p=0.045). Sensitivity rose to 62.9%, specificity fell to 54.6%. Kappa values for interobserver variability were 0.358 (fair, 0.288–0.429) for lesion size, 0.107 (poor, 0.02–0.193) for radiological appearance and 0.274 (fair, 0.229–0.318) for total Mirels’ score. Values for intraobserver variability were 0.716 (good, 95% CI 0.432–0.999) for lesion size, 0.427 (moderate, 95% CI 0.195–0.768) for radiological appearance and 0.580 (moderate, 0.395–0.765) for total Mirels’ score. We showed moderate to substantial agreement between & within raters using Mirels’ score on upper limb radiographs. Mirels’ has poor sensitivity & specificity predicting upper limb fractures - we recommend the cut-off score for prophylactic surgery should be lower than for lower limb lesions.


Orthopaedic Proceedings
Vol. 102-B, Issue SUPP_7 | Pages 7 - 7
1 Jul 2020
Schaeffer E Teo T Cherukupalli A Cooper A Aroojis A Sankar W Upasani V Carsen S Mulpuri K Bone J Reilly CW
Full Access

The Gartland extension-type supracondylar humerus fracture is the most common elbow fracture in the paediatric population. Depending on fracture classification, treatment options range from nonoperative treatment such as taping, splinting or casting to operative treatments such as closed reduction and percutaneous pinning or open reduction. Classification variability between surgeons is a potential contributing factor to existing controversy over nonoperative versus operative treatment for Type II supracondylar fractures. The purpose of this study was to investigate levels of agreement in classification of extension-type supracondylar humerus fractures using the Gartland classification system. A retrospective chart review was conducted on patients aged 2–12 years who had sustained an extension-type supracondylar fracture and received either operative or nonoperative treatment at a tertiary children's hospital. De-identified baseline anteroposterior (AP) and lateral plain elbow radiographs were provided along with a brief summary of the modified Gartland classification system to surgeons across Canada, United States, Australia, United Kingdom and India. Each surgeon was blinded to patient treatment and asked to classify the fractures as Type I, IIA, IIB or III according to the classification system provided. A total of 21 paediatric orthopaedic surgeons completed one round of classification, of these, 15 completed a second round using the same radiographs in a reshuffled order. Kappa values using pre-determined weighted kappa coefficients were calculated to assess interobserver and intraobserver levels of agreement. In total, 60 sets of baseline elbow radiographs were provided to survey respondents. Interobserver agreement for classification based on the Gartland criteria between surgeons was a mean of 0.68, 95% CI [0.67, 0.69] (0.61–0.80 considered substantial agreement). Intraobserver agreement was a mean of 0.80 [0.75, 0.84]. 
(0.61–0.80 substantial agreement, 0.81–1 almost perfect agreement). Radiographic classification of extension-type supracondylar humerus fractures at baseline demonstrated substantial agreement both between and within surgeon raters. Levels of agreement are substantial enough to suggest that classification variability is not a major contributing factor to variability in treatment between surgeons for Type II supracondylar fractures. Further research is needed to compare patient outcomes between nonoperative and operative treatment for these fractures, so as to establish consensus and a standardized treatment protocol for optimal patient care across centres


The Bone & Joint Journal
Vol. 101-B, Issue 9 | Pages 1042 - 1049
1 Sep 2019
Murphy MP Killen CJ Ralles SJ Brown NM Hopkinson WJ Wu K

Aims. Several radiological methods of measuring anteversion of the acetabular component after total hip arthroplasty (THA) have been described. These are limited by low reproducibility, are less accurate than CT 3D reconstruction, and are cumbersome to use. These methods also partly rely on the identification of obscured radiological borders of the component. We propose two novel methods, the Area and Orthogonal methods, which have been designed to maximize use of readily identifiable points while maintaining the same trigonometric principles. Patients and Methods. A retrospective study of plain radiographs was conducted on 160 hips of 141 patients who had undergone primary THA. We compared the reliability and accuracy of the Area and Orthogonal methods with two of the current leading methods: those of Widmer and Lewinnek, respectively. Results. The 160 anteroposterior pelvis films revealed that the proposed Area method was statistically different from those described by Widmer and Lewinnek (p < 0.001 and p = 0.004, respectively). They gave the highest inter- and intraobserver reliability (0.992 and 0.998, respectively), and took less time (27.50 seconds (SD 3.19); p < 0.001) to complete. In addition, 21 available CT 3D reconstructions revealed the Area method achieved the highest Pearson’s correlation coefficient (r = 0.956; p < 0.001) and least statistical difference (p = 0.704) from CT with a mean within 1° of CT-3D reconstruction between ranges of 1° to 30° of measured radiological anteversion. Conclusion. Our results support the proposed Area method to be the most reliable, accurate, and speedy. They did not support any statistical superiority of the proposed Orthogonal method to that of the Widmer or Lewinnek method. Cite this article: Bone Joint J 2019;101-B:1042–1049


Orthopaedic Proceedings
Vol. 99-B, Issue SUPP_20 | Pages 17 - 17
1 Dec 2017
Knez D Mohar J Cirman RJ Likar B Pernuš F Vrtovec T
Full Access

We present an analysis of manual and computer-assisted preoperative pedicle screw placement planning. Preoperative planning of 256 pedicle screws was performed manually twice by two experienced spine surgeons (M1 and M2) and automatically once by a computer-assisted method (C) on three-dimensional computed tomography images of 17 patients with thoracic spinal deformities. Statistical analysis was performed to obtain the intraobserver and interobserver variability for the pedicle screw size (i.e. diameter and length) and insertion trajectory (i.e. pedicle crossing point, sagittal and axial inclination, and normalized screw fastening strength). In our previous study, we showed that the differences among both manual plannings (M1 and M2) and computer-assisted planning (C) are comparable to the differences between manual plannings, except for the pedicle screw inclination in the sagittal plane. In this study, however, we obtained also the intraobserver variability for both manual plannings (M1 and M2), which revealed that larger differences occurred again for the sagittal screw inclination, especially in the case of manual planning M2 with average differences of up to 18.3°. On the other hand, the interobserver variability analysis revealed that the intraobserver variability for each pedicle screw parameter was, in terms of magnitude, comparable to the interobserver variability among both manual and computer-assisted plannings. The results indicate that computer-assisted pedicle screw placement planning is not only more reproducible and faster than, but also as reliable as manual planning


The Bone & Joint Journal
Vol. 101-B, Issue 12 | Pages 1578 - 1584
1 Dec 2019
Batailler C Weidner J Wyatt M Pfluger D Beck M

Aims. A borderline dysplastic hip can behave as either stable or unstable and this makes surgical decision making challenging. While an unstable hip may be best treated by acetabular reorientation, stable hips can be treated arthroscopically. Several imaging parameters can help to identify the appropriate treatment, including the Femoro-Epiphyseal Acetabular Roof (FEAR) index, measured on plain radiographs. The aim of this study was to assess the reliability and the sensitivity of FEAR index on MRI compared with its radiological measurement. Patients and Methods. The technique of measuring the FEAR index on MRI was defined and its reliability validated. A retrospective study assessed three groups of 20 patients: an unstable group of ‘borderline dysplastic hips’ with lateral centre edge angle (LCEA) less than 25° treated successfully by periacetabular osteotomy; a stable group of ‘borderline dysplastic hips’ with LCEA less than 25° treated successfully by impingement surgery; and an asymptomatic control group with LCEA between 25° and 35°. The following measurements were performed on both standardized radiographs and on MRI: LCEA, acetabular index, femoral anteversion, and FEAR index. Results. The FEAR index showed excellent intraobserver and interobserver reliability on both MRI and radiographs. The FEAR index was more reliable on radiographs than on MRI. The FEAR index on MRI was lower in the stable borderline group (mean -4.2° (SD 9.1°)) compared with the unstable borderline group (mean 7.9° (SD 6.8°)). With a FEAR index cut-off value of 2°, 90% of patients were correctly identified as stable or unstable using the radiological FEAR index, compared with 82.5% using the FEAR index on MRI. The FEAR index was a better predictor of instability on plain radiographs than on MRI. Conclusion. The FEAR index measured on MRI is less reliable and less sensitive than the FEAR index measured on radiographs. 
The cut-off value of 2° for radiological FEAR index predicted hip stability with 90% probability. Cite this article: Bone Joint J 2019;101-B:1578–1584


The Bone & Joint Journal
Vol. 102-B, Issue 5 | Pages 593 - 599
1 May 2020
Amanatullah DF Cheng RZ Huddleston III JI Maloney WJ Finlay AK Kappagoda S Suh GA Goodman SB

Aims. To establish the utility of adding the laboratory-based synovial alpha-defensin immunoassay to the traditional diagnostic work-up of a prosthetic joint infection (PJI). Methods. A group of four physicians evaluated 158 consecutive patients who were worked up for PJI, of which 94 underwent revision arthroplasty. Each physician reviewed the diagnostic data and decided on the presence of PJI according to the 2014 Musculoskeletal Infection Society (MSIS) criteria (yes, no, or undetermined). Their initial randomized review of the available data before or after surgery was blinded to each alpha-defensin result and a subsequent randomized review was conducted with each result. Multilevel logistic regression analysis assessed the effect of having the alpha-defensin result on the ability to diagnose PJI. Alpha-defensin was correlated to the number of synovial white blood cells (WBCs) and percentage of polymorphonuclear cells (%PMN). Results. Intraobserver reliability and interobserver agreement did not change when the alpha-defensin result was available. Positive alpha-defensin results had greater synovial WBCs (mean 31,854 cells/μL, SD 32,594) and %PMN (mean 93.0%, SD 5.5%) than negative alpha-defensin results (mean 974 cells/μL, SD 3,988; p < 0.001 and mean 39.4% SD 28.6%; p < 0.001). Adding the alpha-defensin result did not alter the diagnosis of a PJI using preoperative (odds ratio (OR) 0.52, 95% confidence interval (CI) 0.14 to 1.88; p = 0.315) or operative (OR 0.52, CI 0.18 to 1.55; p = 0.242) data when clinicians already decided that PJI was present or absent with traditionally available testing. However, when undetermined with traditional preoperative testing, alpha-defensin helped diagnose (OR 0.44, CI 0.30 to 0.64; p < 0.001) or rule out (OR 0.41, CI 0.17 to 0.98; p = 0.044) PJI. Of the 27 undecided cases with traditional testing, 24 (89%) benefited from the addition of alpha-defensin testing. Conclusion. 
The laboratory-based synovial alpha-defensin immunoassay did not help diagnose or rule out a PJI when added to routine serologies and synovial fluid analyses except in cases where the diagnosis of PJI was unclear. We recommend against the routine use of alpha-defensin and suggest using it only when traditional testing is indeterminate. Cite this article: Bone Joint J 2020;102-B(5):593–599


Introduction. Patient-specific cutting guides entered clinical practice a few years ago, first in total knee replacement and more recently for other joint replacements. The advantages claimed are improved accuracy and repeatability in implant placement. New patient-specific guides to perform an accurate femoral neck resection and provide a precise alignment reference for acetabular reaming in total hip arthroplasty (THA) were recently developed by Medacta International: MyHip Technology. To date, femoral guides can be designed for both anterior and posterior approaches, whereas acetabular guides are available only for the posterior approach. Evaluation of the repeatability and reproducibility of MyHip guide placement on cadavers is performed using a navigation system. Accuracy of femoral MyHip guides is also evaluated through one author's clinical experience (RP). Materials and Methods. During each cadaveric session one body (2 hips) was available. A pre-operative CT scan was obtained and used to create the 3D bone model of the pelvis and proximal femurs. Afterwards, surgical planning for THA was performed for each case and, once it was approved by the surgeons, the designed patient-specific blocks were made. Intraobserver and interobserver agreement in positioning the guides was assessed by measuring femoral head resection height (mm), femoral head plane inclination/anteversion (°) and acetabular reaming axis orientation (°). 9 surgeons, through 2 cadaveric sessions, positioned each guide, removed it and re-positioned it 5 times alternately. The system is judged as accurate if all measures differ by less than 3 mm and 5° for lengths and angles respectively from the average among all the acquisitions. Clinical experience includes 68 THA which were performed between March 2014 and April 2015. 
Anterior femoral MyHip guides were used for the femoral head resection, while the acetabular side was prepared using the standard metal instrumentation for the minimally invasive anterior approach. Intra-operative complications, as well as post-operative leg length difference and implant positioning, were assessed. Results. During cadaveric sessions, all measures met the acceptance criteria with the exception of two measures, −5.98° and −5.57°, in femoral head plane anteversion and inclination respectively, both with femoral anterior guides. Looking at intraobserver variation, MyHip Femoral anterior guide positioning average deviation was between −0.91 mm and 1.44 mm (resection height), −1.25° and 1.41° (anteversion), and −0.85° and 0.82° (inclination); MyHip Femoral posterior guide positioning average deviation was between −0.47 mm and 0.67 mm (resection height), −1.33° and 1.50° (anteversion), and −0.66° and 1.50° (inclination); MyHip Acetabular posterior guide had an average z-axis deviation from the mean value between −0.91° and 0.91°. All surgeries were successfully performed. The surgeon reported good fit and stability of the guide during each surgery. A preliminary analysis suggests optimal outcomes in terms of accurate prosthetic component positioning and reduced occurrence of leg length inequality. Conclusion. Cadaveric sessions show intraobserver and interobserver agreement, demonstrating reproducibility and repeatability in placement of MyHip patient-specific cutting guides. Clinical experience confirms the advantages claimed for this technique, suggesting a possible reduction of complications usually linked to implant malpositioning, such as wear, impingement, risk of luxation
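
The acceptance criterion described in this abstract (every measure within 3 mm or 5° of the average of all acquisitions) can be illustrated with a minimal sketch. The function names and the sample values below are hypothetical and are not taken from the study; the strict "less than" comparison is an assumption based on the wording of the abstract.

```python
from statistics import mean


def deviations_from_mean(values):
    """Return each value's signed deviation from the average of all values."""
    avg = mean(values)
    return [v - avg for v in values]


def within_tolerance(values, tolerance):
    """True if every value differs from the average by less than `tolerance`."""
    return all(abs(d) < tolerance for d in deviations_from_mean(values))


# Hypothetical repeated acquisitions (NOT data from the study).
heights_mm = [12.0, 12.5, 11.8, 12.2, 12.5]   # resection height, mm
angles_deg = [40.0, 41.0, 39.0, 46.0, 34.0]   # plane anteversion, degrees

heights_ok = within_tolerance(heights_mm, 3.0)  # length tolerance: 3 mm
angles_ok = within_tolerance(angles_deg, 5.0)   # angle tolerance: 5 degrees
print(heights_ok, angles_ok)
```

In this made-up example the lengths pass, while the angles fail because two acquisitions sit 6° from the average.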


The Journal of Bone & Joint Surgery British Volume
Vol. 80-B, Issue 2 | Pages 321 - 324
1 Mar 1998
Bar-On E Meyer S Harati G Porat S

Ultrasonography of the hip was performed sequentially by two different examiners in 75 infants. The ultrasound strips were reviewed twice by three paediatric orthopaedic surgeons and classified by the Graf method. The intraobserver and interobserver agreement between the interpretations was analysed using simple and weighted kappa coefficients calculated for agreement on the Graf classification and for grouping as normal (types 1A to 2A), and abnormal requiring treatment (types 2B to 4). When examining the same ultrasound strip, intraobserver agreement for the Graf classification was substantial (mean kappa 0.61), but interobserver agreement was only moderate (kappa 0.50). For the grouping into normal and abnormal, the mean kappa value for intraobserver agreement was 0.67 and for interobserver agreement 0.57. Because of the significant differences in agreement between normal and abnormal hips, we analysed a subgroup of those with at least one abnormal interpretation. Intraobserver agreement within this subgroup showed moderate reliability (kappa 0.41), but interobserver agreement was only fair (kappa 0.28). Interpretations of two different strips performed sequentially showed significantly lower agreement with an intraobserver kappa value of 0.29 and an interobserver value of 0.28. In the subgroup with at least one abnormal reading, the intraobserver kappa was 0.09 and the interobserver 0.1. Our findings suggest that both the technique of performing ultrasonography and the interpretation of the image may influence the result
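
For readers unfamiliar with the statistic, the following is a minimal sketch of simple and linearly weighted Cohen's kappa for two sets of ratings. It is an illustration only: the abstract does not state which weighting scheme the authors used, so the linear weights here are an assumption, and the function name and sample labels are hypothetical.

```python
def cohen_kappa(ratings_a, ratings_b, categories, weighted=False):
    """Cohen's kappa for two equal-length rating lists.

    `categories` lists the possible labels in their ordinal order.
    With weighted=True, linear disagreement weights |i - j| / (k - 1)
    are used (an assumed scheme); otherwise any disagreement counts as 1.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}

    # Observed proportion of cases in each (rater A, rater B) cell.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[index[a]][index[b]] += 1.0 / n

    # Marginal proportions for each rater.
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def weight(i, j):
        if weighted:
            return abs(i - j) / (k - 1)
        return 0.0 if i == j else 1.0

    observed_dis = sum(weight(i, j) * observed[i][j]
                       for i in range(k) for j in range(k))
    expected_dis = sum(weight(i, j) * row[i] * col[j]
                       for i in range(k) for j in range(k))
    if expected_dis == 0:
        raise ValueError("kappa is undefined when no disagreement is expected")
    return 1.0 - observed_dis / expected_dis


# Hypothetical two-category readings (normal / abnormal), not study data.
first_reading = ["normal", "normal", "abnormal", "abnormal"]
second_reading = ["normal", "abnormal", "abnormal", "abnormal"]
kappa = cohen_kappa(first_reading, second_reading, ["normal", "abnormal"])
print(kappa)
```

Kappa corrects the raw agreement for the agreement expected by chance, which is why the values reported above can be low even when most readings coincide.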