Aims. Classifying trochlear dysplasia (TD) is useful to determine the treatment options for patients suffering from patellofemoral instability (PFI). There is no consensus on which classification system is more reliable and reproducible for the purpose of guiding clinicians’ management of PFI. There are also concerns about the validity of the Dejour Classification (DJC), which is the most widely used classification for TD, having only a fair reliability score. The Oswestry-Bristol Classification (OBC) is a recently proposed system of classification of TD, and the authors report a fair-to-good interobserver agreement and good-to-excellent intraobserver agreement in the assessment of TD. The aim of this study was to compare the reliability and reproducibility of these two classifications. Methods. In all, six assessors (four consultants and two registrars) independently evaluated 100 axial MRIs of the patellofemoral joint (PFJ) for TD and classified them according to OBC and DJC. These assessments were again repeated by all raters after four weeks. The inter- and intraobserver reliability scores were calculated using Cohen’s kappa and Cronbach’s α. Results. Both classifications showed good to excellent interobserver reliability with high α scores. The OBC classification showed a substantial intraobserver agreement (mean kappa 0.628; p < 0.005) whereas the DJC showed a moderate agreement (mean kappa 0.572; p < 0.005). There was no significant difference in the kappa values when comparing the assessments by consultants with those by registrars, in either classification system. Conclusion. This large study from a non-founding institute shows both classification systems to be reliable for classifying TD based on axial MRIs of the PFJ, with the simple-to-use OBC having a higher intraobserver reliability score than that of the DJC. Cite this article: Bone Jt Open 2023;4(7):532–538
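The intraobserver kappa values quoted above compare each assessor's first read with the repeat read four weeks later. As a minimal sketch of how Cohen's kappa is computed for one such pair of reads, assuming one categorical grade per image (the function name and the example grades are hypothetical, not data from the study):

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two sets of categorical ratings of the same cases."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two equally long, non-empty rating lists")
    n = len(ratings_a)
    # Observed agreement: share of cases given the same grade in both reads.
    p_observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the marginal proportions, summed over grades.
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    p_chance = sum(counts_a[g] * counts_b[g] for g in counts_a) / (n * n)
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical grades for eight images read twice by the same assessor.
first_read = ["normal", "mild", "mild", "severe", "normal", "moderate", "mild", "severe"]
second_read = ["normal", "mild", "moderate", "severe", "normal", "moderate", "mild", "mild"]
print(round(cohen_kappa(first_read, second_read), 3))
```

A mean intraobserver kappa, as reported in the abstract, would then be the average of this value over all assessors.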


Bone & Joint Research
Vol. 12, Issue 5 | Pages 313 - 320
8 May 2023
Saiki Y Kabata T Ojima T Kajino Y Kubo N Tsuchiya H

Aims. We aimed to assess the reliability and validity of OpenPose, a posture estimation algorithm, for measurement of knee range of motion after total knee arthroplasty (TKA), in comparison to radiography and goniometry. Methods. In this prospective observational study, we analyzed 35 primary TKAs (24 patients) for knee osteoarthritis. We measured the knee angles in flexion and extension using OpenPose, radiography, and goniometry. We assessed the test-retest reliability of each method using intraclass correlation coefficient (1,1). We evaluated the ability to estimate other measurement values from the OpenPose value using linear regression analysis. We used intraclass correlation coefficients (2,1) and Bland–Altman analyses to evaluate the agreement and error between radiography and the other measurements. Results. OpenPose had excellent test-retest reliability (intraclass correlation coefficient (1,1) = 1.000). The R² of all regression models indicated large correlations (0.747 to 0.927). In the flexion position, the intraclass correlation coefficients (2,1) of OpenPose indicated excellent agreement (0.953) with radiography. In the extension position, the intraclass correlation coefficients (2,1) indicated good agreement of OpenPose and radiography (0.815) and moderate agreement of goniometry with radiography (0.593). OpenPose had no systematic error in the flexion position, and a 2.3° fixed error in the extension position, compared to radiography. Conclusion. OpenPose is a reliable and valid tool for measuring flexion and extension positions after TKA. It has better accuracy than goniometry, especially in the extension position. Accurate measurement values can be obtained with low error, high reproducibility, and no contact, independent of the examiner’s skills. Cite this article: Bone Joint Res 2023;12(5):313–320
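The fixed error mentioned above is the kind of quantity a Bland–Altman analysis reports. The following is a minimal sketch of the standard calculation (mean difference plus 95% limits of agreement at 1.96 standard deviations), not the authors' own analysis code. The function name and the angle values are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need at least two paired measurements")
    # Paired differences between the two methods, case by case.
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)      # fixed (systematic) error
    spread = stdev(diffs)   # sample standard deviation of the differences
    return bias, bias - 1.96 * spread, bias + 1.96 * spread


# Hypothetical knee extension angles in degrees (not study data).
pose_estimate = [3.0, 5.5, 1.0, 8.0, 4.5]
radiograph = [1.0, 3.0, 0.5, 5.0, 2.0]
bias, lower, upper = bland_altman(pose_estimate, radiograph)
print(f"bias {bias:.2f}, limits of agreement {lower:.2f} to {upper:.2f}")
```

A bias close to zero with narrow limits would correspond to "no systematic error"; a non-zero bias corresponds to a fixed error such as the 2.3° reported for extension.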


The Bone & Joint Journal
Vol. 102-B, Issue 8 | Pages 1041 - 1047
1 Aug 2020
Hamoodi Z Singh J Elvey MH Watts AC

Aims. The Wrightington classification system of fracture-dislocations of the elbow divides these injuries into six subtypes depending on the involvement of the coronoid and the radial head. The aim of this study was to assess the reliability and reproducibility of this classification system. Methods. This was a blinded study using radiographs and CT scans of 48 consecutive patients managed according to the Wrightington classification system between 2010 and 2018. Four trauma and orthopaedic consultants, two post-CCT fellows, and one speciality registrar based in the UK classified the injuries. The seven observers reviewed preoperative radiographs and CT scans twice, with a minimum four-week interval. Radiographs and CT scans were reviewed separately. Inter- and intraobserver reliability were calculated using Fleiss and Cohen kappa coefficients. The Landis and Koch criteria were used to interpret the strength of the kappa values. Validity was assessed by calculating the percentage agreement against intraoperative findings. Results. Of the 48 patients, three (6%) had type A injury, 11 (23%) type B, 16 (33%) type B+, 16 (33%) type C, two (4%) type D+, and none had a type D injury. All 48 patients had anteroposterior (AP) and lateral radiographs, 44 had 2D CT scans, and 39 had 3D reconstructions. The interobserver reliability kappa value was 0.52 for radiographs, 0.71 for 2D CT scans, and 0.73 for a combination of 2D and 3D reconstruction CT scans. The median intraobserver reliability was 0.75 (interquartile range (IQR) 0.62 to 0.79) for radiographs, 0.77 (IQR 0.73 to 0.94) for 2D CT scans, and 0.89 (IQR 0.77 to 0.93) for the combination of 2D and 3D reconstruction. Validity analysis showed that accuracy significantly improved when using CT scans (p = 0.018 and p = 0.028 respectively). Conclusion. The Wrightington classification system is a reliable and valid method of classifying fracture-dislocations of the elbow. CT scans are significantly more accurate than radiographs when identifying the pattern of injury, with good intra- and interobserver reproducibility. Cite this article: Bone Joint J 2020;102-B(8):1041–1047


Bone & Joint Research
Vol. 8, Issue 8 | Pages 357 - 366
1 Aug 2019
Zhang B Sun H Zhan Y He Q Zhu Y Wang Y Luo C

Objectives. CT-based three-column classification (TCC) has been widely used in the treatment of tibial plateau fractures (TPFs). In its updated version (updated three-column concept, uTCC), a fracture morphology-based injury mechanism was proposed for effective treatment guidance. In this study, the injury mechanism of TPFs is further explained, and its inter- and intraobserver reliability is evaluated to refine the uTCC. Methods. The radiological images of 90 consecutive TPF patients were collected. A total of 47 men (52.2%) and 43 women (47.8%) with a mean age of 49.8 years (SD 12.4; 17 to 77) were enrolled in our study. Among them, 57 fractures were on the left side (63.3%) and 33 were on the right side (36.7%); no bilateral fracture existed. Four observers independently classified or estimated these randomized cases according to the Schatzker classification, TCC, and injury mechanism. With two rounds of evaluation, the kappa values were calculated to estimate the inter- and intraobserver reliability. Results. The overall inter- and intraobserver agreements of the injury mechanism were substantial (κ_inter = 0.699, κ_intra = 0.749, respectively). The initial position and the force direction, which are two components of the injury mechanism, had substantial agreement for both inter- and intraobserver reliability. The inter- and intraobserver agreements were lower in high-energy fractures (Schatzker types IV to VI; κ_inter = 0.605, κ_intra = 0.721) compared with low-energy fractures (Schatzker types I to III; κ_inter = 0.81, κ_intra = 0.832). The inter- and intraobserver agreements were relatively higher in one-column fractures (κ_inter = 0.759, κ_intra = 0.801) compared with two-column and three-column fractures. Conclusion. A complete theory of the injury mechanism of TPFs is put forward for the first time to complete the TCC. It demonstrates substantial inter- and intraobserver agreement overall. Furthermore, the injury mechanism can be promoted clinically. Cite this article: B-B. Zhang, H. Sun, Y. Zhan, Q-F. He, Y. Zhu, Y-K. Wang, C-F. Luo. Reliability and repeatability of tibial plateau fracture assessment with an injury mechanism-based concept. Bone Joint Res 2019;8:357–366. DOI: 10.1302/2046-3758.88.BJR-2018-0331.R1


The Journal of Bone & Joint Surgery British Volume
Vol. 94-B, Issue 1 | Pages 32 - 36
1 Jan 2012
Nho J Lee Y Kim HJ Ha Y Suh Y Koo K

A variety of radiological methods of measuring version of the acetabular component after total hip replacement (THR) have been described. The aim of this study was to evaluate the reliability and validity of six methods (those of Lewinnek; Widmer; Hassan et al; Ackland, Bourne and Uhthoff; Liaw et al; and Woo and Morrey) that are currently in use. In 36 consecutive patients who underwent THR, version of the acetabular component was measured by three independent examiners on plain radiographs using these six methods and compared with measurements using CT scans. The intra- and interobserver reliabilities of each measurement were estimated. All measurements on both radiographs and CT scans had excellent intra- and interobserver reliability and the results from each of the six methods correlated well with the CT measurements. However, measurements made using the methods of Widmer and of Ackland, Bourne and Uhthoff were significantly different from the CT measurements (both p < 0.001), whereas measurements made using the remaining four methods were similar to the CT measurements. With regard to reliability and convergent validity, we recommend the use of the methods described by Lewinnek, Hassan et al, Liaw et al and Woo and Morrey for measurement of version of the acetabular component


Bone & Joint Research
Vol. 5, Issue 8 | Pages 347 - 352
1 Aug 2016
Nuttall J Evaniew N Thornley P Griffin A Deheshi B O’Shea T Wunder J Ferguson P Randall RL Turcotte R Schneider P McKay P Bhandari M Ghert M

Objectives. The diagnosis of surgical site infection following endoprosthetic reconstruction for bone tumours is frequently a subjective diagnosis. Large clinical trials use blinded Central Adjudication Committees (CACs) to minimise the variability and bias associated with assessing a clinical outcome. The aim of this study was to determine the level of inter-rater and intra-rater agreement in the diagnosis of surgical site infection in the context of a clinical trial. Materials and Methods. The Prophylactic Antibiotic Regimens in Tumour Surgery (PARITY) trial CAC adjudicated 29 non-PARITY cases of lower extremity endoprosthetic reconstruction. The CAC members classified each case according to the Centers for Disease Control (CDC) criteria for surgical site infection (superficial, deep, or organ space). Combinatorial analysis was used to calculate the smallest CAC panel size required to maximise agreement. A final meeting was held to establish a consensus. Results. Full or near consensus was reached in 20 of the 29 cases. The Fleiss kappa value was calculated as 0.44 (95% confidence interval (CI) 0.35 to 0.53), or moderate agreement. The greatest statistical agreement was observed in the outcome of no infection, 0.61 (95% CI 0.49 to 0.72, substantial agreement). Panelists reached a full consensus in 12 of 29 cases and near consensus in five of 29 cases when CDC criteria were used (superficial, deep or organ space). A stable maximum Fleiss kappa of 0.46 (95% CI 0.50 to 0.35) at CAC sizes greater than three members was obtained. Conclusions. There is substantial agreement among the members of the PARITY CAC regarding the presence or absence of surgical site infection. Agreement on the level of infection, however, is more challenging. Additional clinical information routinely collected by the prospective PARITY trial may improve the discriminatory capacity of the CAC in the parent study for the diagnosis of infection. Cite this article: J. Nuttall, N. Evaniew, P. Thornley, A. Griffin, B. Deheshi, T. O’Shea, J. Wunder, P. Ferguson, R. L. Randall, R. Turcotte, P. Schneider, P. McKay, M. Bhandari, M. Ghert. The inter-rater reliability of the diagnosis of surgical site infection in the context of a clinical trial. Bone Joint Res 2016;5:347–352. DOI: 10.1302/2046-3758.58.BJR-2016-0036.R1


The Bone & Joint Journal
Vol. 102-B, Issue 4 | Pages 478 - 484
1 Apr 2020
Daniels AM Wyers CE Janzing HMJ Sassen S Loeffen D Kaarsemaker S van Rietbergen B Hannemann PFW Poeze M van den Bergh JP

Aims

Besides conventional radiographs, the use of MRI, CT, and bone scintigraphy is frequent in the diagnosis of a fracture of the scaphoid. However, which technique gives the best results remains unknown. The investigation of a new imaging technique initially requires an analysis of its precision. The primary aim of this study was to investigate the interobserver agreement of high-resolution peripheral quantitative CT (HR-pQCT) in the diagnosis of a scaphoid fracture. A secondary aim was to investigate the interobserver agreement for the presence of other fractures and for the classification of scaphoid fracture.

Methods

Two radiologists and two orthopaedic trauma surgeons evaluated HR-pQCT scans of 31 patients with a clinically-suspected scaphoid fracture. The observers were asked to determine the presence of a scaphoid or other fracture and to classify the scaphoid fracture based on the Herbert classification system. Fleiss kappa statistics were used to calculate the interobserver agreement for the diagnosis of a fracture. Intraclass correlation coefficients (ICCs) were used to assess the agreement for the classification of scaphoid fracture.
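With four observers rating each scan, agreement is summarised with Fleiss' kappa rather than Cohen's. The sketch below shows the standard calculation, assuming every case is rated by the same number of observers. The function name and the example ratings are hypothetical and do not come from the study.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa; `ratings` holds one list of category labels per case."""
    n_cases = len(ratings)
    n_raters = len(ratings[0]) if ratings else 0
    if n_raters < 2 or any(len(row) != n_raters for row in ratings):
        raise ValueError("every case needs the same number (>= 2) of ratings")
    category_totals = Counter()
    agreement_sum = 0.0
    for row in ratings:
        counts = Counter(row)
        category_totals.update(counts)
        # Share of rater pairs that agree on this case.
        pairs_agreeing = sum(c * c for c in counts.values()) - n_raters
        agreement_sum += pairs_agreeing / (n_raters * (n_raters - 1))
    p_observed = agreement_sum / n_cases
    # Chance agreement from the overall proportion of each category.
    total = n_cases * n_raters
    p_chance = sum((t / total) ** 2 for t in category_totals.values())
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical fracture / no-fracture calls by four observers on five scans.
calls = [
    ["fracture", "fracture", "fracture", "fracture"],
    ["none", "none", "fracture", "none"],
    ["none", "none", "none", "none"],
    ["fracture", "fracture", "none", "fracture"],
    ["fracture", "none", "none", "fracture"],
]
print(round(fleiss_kappa(calls), 3))
```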


Orthopaedic Proceedings
Vol. 106-B, Issue SUPP_6 | Pages 38 - 38
2 May 2024
Buadooh KJ Holmes B Ng A

The Revision Hip Complexity Classification (RHCC) was developed by a modified Delphi system in 2022 to provide a comprehensive, reproducible framework for the multidisciplinary discussion of complex revision hip surgery. The aim of this study was to assess the validity, intra-rater and inter-rater reliability of the RHCC. Radiographs and clinical vignettes of 20 consecutive patients who had undergone revision of Total Hip Arthroplasty (THA) at our unit during the previous 12-month period were provided to observers. Five observers, comprising 3 revision hip consultants, 1 hip fellow and 1 ST3-8 registrar, were familiarised with the RHCC. Each revision THA case was classified on two separate occasions by each observer, with a mean time between assessments of 42.6 days (24–57). Inter-observer reliability was assessed using the Fleiss' Kappa statistic and percentage agreement. Intra-observer reliability was assessed using the Cohen Kappa statistic. Validity was assessed using percentage agreement and Cohen Kappa comparing observers to the RHCC web-based application result. All observers were blinded to patient notes, operation notes and post-operative radiographs throughout the process. Inter-observer reliability showed fair agreement in both rounds 1 and 2 of the survey (0.296 and 0.353 respectively), with a percentage agreement of 69% and 75%. Inter-observer reliability was highest in H3-type revisions with kappa values of 0.577 and 0.441. Mean intra-observer reliability showed moderate agreement with a kappa value of 0.446 (0.369 to 0.773). Validity percentage agreement was 44% and 39% respectively, with mean kappa values of 0.125 and 0.046 representing only slight agreement. This study demonstrates that classification using the RHCC without utilisation of the web-based application is unsatisfactory, showing low validity and reliability. Reliability was higher for more complex H3-type cases. The use of the RHCC web app is recommended to ensure the accurate and reliable classification of revision THA cases.


Orthopaedic Proceedings
Vol. 104-B, Issue SUPP_14 | Pages 12 - 12
1 Dec 2022
Maggini E Bertoni G Guizzi A Vittone G Manni F Saccomanno M Milano G

Glenoid and humeral head bone defects have long been recognized as major determinants in recurrent shoulder instability as well as main predictors of outcomes after surgical stabilization. However, a universally accepted method to quantify them is not available yet. The purpose of the present study is to describe a new CT method to quantify bipolar bone defects volume on a virtually generated 3D model and to evaluate its reproducibility. A cross-sectional observational study has been conducted. Forty CT scans of both shoulders were randomly selected from a series of exams previously acquired on patients affected by anterior shoulder instability. Inclusion criterion was unilateral anterior shoulder instability with at least one episode of dislocation. Exclusion criteria were: bilateral shoulder instability; posterior or multidirectional instability, previous fractures and/or surgery to both shoulders; congenital or acquired inflammatory, neurological, or degenerative diseases. For all patients, CT exams of both shoulders were acquired at the same time following a standardized imaging protocol. The CT data sets were analysed on a standard desktop PC using the software 3D Slicer. Computer-based reconstruction of the Hill-Sachs and glenoid bone defect were performed through Boolean subtraction of the affected side from the contralateral one, resulting in a virtually generated bone fragment accurately fitting the defect. The volume of the bone fragments was then calculated. All measurements were conducted by two fellowship-trained orthopaedic shoulder surgeons. Each measurement was performed twice by one observer to assess intra-observer reliability. Inter and intra-observer reliability were calculated. Intraclass Correlation Coefficients (ICC) were calculated using a two-way random effect model and evaluation of absolute agreement. Confidence intervals (CI) were calculated at 95% confidence level for reliability coefficients. 
Reliability values range from 0 (no agreement) to 1 (maximum agreement). The study included 34 males and 6 females. Mean age (± SD) of patients was 36.7 ± 10.10 years (range: 25 – 73 years). A bipolar bone defect was observed in all cases. Reliability of humeral head bone fragment measurements showed excellent intra-observer agreement (ICC: 0.92, CI 95%: 0.85 – 0.96) and very good interobserver agreement (ICC: 0.89, CI 95%: 0.80 – 0.94). Similarly, glenoid bone loss measurement resulted in excellent intra-observer reliability (ICC: 0.92, CI 95%: 0.85 – 0.96) and very good inter-observer agreement (ICC: 0.84, CI 95%: 0.72 – 0.91). In conclusion, matching affected and intact contralateral humeral head and glenoid by reconstruction on a computer-based virtual model allows identification of bipolar bone defects and enables quantitative determination of bone loss.
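The abstract specifies a two-way random-effects model with absolute agreement. For single measurements this is commonly written ICC(2,1) in the Shrout and Fleiss notation. Whether the authors used the single-measure or average-measure form is not stated, so the sketch below assumes the single-measure form. The function name and the volumes are hypothetical.

```python
def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measure.

    `scores` is a list of rows, one per subject, each holding one
    measurement per rater (same raters, same order, in every row).
    """
    n = len(scores)                    # subjects
    k = len(scores[0]) if n else 0     # raters
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need a complete table with >= 2 subjects and raters")
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    # Two-way ANOVA sums of squares without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical defect volumes (cubic millimetres) from two observers.
volumes = [[410.0, 425.0], [130.0, 118.0], [560.0, 590.0], [275.0, 260.0], [90.0, 101.0]]
print(round(icc_2_1(volumes), 3))
```

Because the rater (column) variance appears in the denominator, a systematic offset between observers lowers this coefficient, which is what "absolute agreement" means here.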


Orthopaedic Proceedings
Vol. 104-B, Issue SUPP_13 | Pages 36 - 36
1 Dec 2022
Benavides B Cornell D Schneider P Hildebrand K

Heterotopic ossification (HO) is a well-known complication of traumatic elbow injuries. The reported rates of post-traumatic HO formation vary from less than 5% with simple elbow dislocations, to greater than 50% in complex fracture-dislocations. Previous studies have identified fracture-dislocations, delayed surgical intervention, and terrible triad injuries as risk factors for HO formation. There is, however, a paucity of literature regarding the accuracy of diagnosing post-traumatic elbow HO. Therefore, the purpose of our study was to determine the inter-rater reliability of HO diagnosis using standard radiographs of the elbow at 52 weeks post-injury, as well as to report on the rate of mature compared with immature HO. We hypothesized inter-rater reliability would be poor among raters for HO formation. Prospectively collected data from a large clinical trial was reviewed by three independent reviewers (one senior orthopedic resident, one senior radiology resident, and one expert upper extremity orthopedic surgeon). Each reviewer examined anonymized 52-week post-injury radiographs of the elbow and recorded: 1. the presence or absence of HO, 2. the location of HO, 3. the size of the HO (in cm, if present), and 4. the maturity of the HO formation. Maturity was defined by consensus prior to image review and defined as an area of well-defined cortical and medullary bone outside the cortical borders of the humerus, ulna, or radius. Immature lesions were defined as an area of punctate calcification with an ill-defined cloud-like density outside the cortical borders of the humerus, ulna or radius. Data were collected using a standardized online data collection form (CognizantMD, Toronto, ON, CA). Inter-rater reliability was calculated using Fleiss’ Kappa statistic and a multivariate logistic regression analysis was performed to identify risk factors for HO formation in general, as well as mature HO at 52 weeks post injury. 
Statistical analysis was performed using RStudio (version 1.4, RStudio, Boston, MA, USA). A total of 79 radiographs at the 52-week follow-up were reviewed (54% male, mean age 50, age SD 14, 52% operatively treated). Inter-rater reliability using Fleiss’ Kappa was κ = 0.571 (p = 0.0004), indicating moderate inter-rater reliability among the three reviewers. The rate of immature HO at 52 weeks was 56%. The multivariate logistic regression analysis identified male sex as a significant risk factor for HO development (OR 5.29, CI 1.55 to 20.59, p = 0.011), but not for HO maturity at 52 weeks. Age, time to surgery, and operative intervention were not found to be significant predictors for either HO formation or maturity of the lesion in this cohort. Our study demonstrates moderate inter-rater reliability in determining the presence of HO at 52 weeks post-elbow injury. There was a high rate (56%) of immature HO at 52-week follow-up. We also report the finding of male sex as a significant risk factor for post-traumatic HO development. Future research directions could include investigation into possible male predominance for traumatic HO formation, as well as improving inter-rater reliability through developing a standardized and validated classification system for reporting the radiographic features of HO formation around the elbow.


Orthopaedic Proceedings
Vol. 105-B, Issue SUPP_7 | Pages 62 - 62
4 Apr 2023
Rashid M Islam R Marsden S Trompeter A Teoh K

A number of classification systems exist for posterior malleolus fractures of the ankle. The reliability of these classification systems remains unclear. The primary aim of this study was to evaluate the reliability of three commonly utilised fracture classification systems of the posterior malleolus. 60 patients across 2 hospitals sustaining an unstable ankle fracture with a posterior malleolus fragment were identified. All patients underwent radiographs and computed tomography of their injured ankle. 9 surgeons including pre-ST3 level, ST3-8 level, and consultant level applied the Haraguchi, Rammelt, and Mason & Molloy classifications to these patients, at two timepoints, at least 4 weeks apart. The order was randomised between assessments. Inter-rater reliability was assessed using Fleiss’ kappa and 95% confidence intervals (CI). Intra-rater reliability was assessed using Cohen's Kappa and standard error (SE). Inter-rater reliability (Fleiss’ Kappa) was calculated for the Haraguchi classification as 0.522 (95% CI 0.490 – 0.553), for the Rammelt classification as 0.626 (95% CI 0.600 – 0.652), and the Mason & Molloy classification as 0.541 (95% CI 0.514 – 0.569). Intra-rater reliability (Cohen's Kappa) was 0.764 (SE 0.034) for the Haraguchi, 0.763 (SE 0.031) for the Rammelt, 0.688 (SE 0.035) for the Mason & Molloy classification. This study reports the inter-rater and intra-rater reliability for three classification systems for posterior malleolus fractures. Based on definitions by Landis & Koch (1977), inter-rater reliability was rated as ‘moderate’ for the Haraguchi and Mason & Molloy classifications; and ‘substantial’ for the Rammelt classification. Similarly, the intra-rater reliability was rated as ‘substantial’ for all three classifications
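The verbal labels above follow the Landis & Koch bands. A small lookup makes the mapping explicit. The band limits are the commonly cited ones (0.20, 0.40, 0.60, 0.80, 1.00). Treating each upper limit as inclusive is an assumption of this sketch, and the function name is hypothetical.

```python
def landis_koch(kappa):
    """Map a kappa value to its Landis & Koch (1977) agreement label."""
    if kappa > 1.0:
        raise ValueError("kappa cannot exceed 1")
    if kappa < 0.0:
        return "poor"
    # Upper limits are treated as inclusive (an assumption of this sketch).
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label


# Inter-rater values reported in the abstract above.
for name, value in [("Haraguchi", 0.522), ("Rammelt", 0.626), ("Mason & Molloy", 0.541)]:
    print(name, value, landis_koch(value))
```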


Bone & Joint Open
Vol. 3, Issue 11 | Pages 913 - 920
18 Nov 2022
Dean BJF Berridge A Berkowitz Y Little C Sheehan W Riley N Costa M Sellon E

Aims. The evidence demonstrating the superiority of early MRI has led to increased use of MRI in clinical pathways for acute wrist trauma. The aim of this study was to describe the radiological characteristics and the inter-observer reliability of a new MRI based classification system for scaphoid injuries in a consecutive series of patients. Methods. We identified 80 consecutive patients with acute scaphoid injuries at one centre who had presented within four weeks of injury. The radiographs and MRI scans were assessed by four observers, two radiologists, and two hand surgeons, using both pre-existing classifications and a new MRI based classification tool, the Oxford Scaphoid MRI Assessment Rating Tool (OxSMART). The OxSMART was used to categorize scaphoid injuries into three grades: contusion (grade 1); unicortical fracture (grade 2); and complete bicortical fracture (grade 3). Results. In total there were 13 grade 1 injuries, 11 grade 2 injuries, and 56 grade 3 injuries in the 80 consecutive patients. The inter-observer reliability of the OxSMART was substantial (Kappa = 0.711). The inter-observer reliability of detecting an obvious fracture was moderate for radiographs (Kappa = 0.436) and MRI (Kappa = 0.543). Only 52% (29 of 56) of the grade 3 injuries were detected on plain radiographs. There were two complications of delayed union, both of which occurred in patients with grade 3 injuries, who were promptly treated with cast immobilization. There were no complications in the patients with grade 1 and 2 injuries and the majority of these patients were treated with early mobilization as pain allowed. Conclusion. This MRI based classification tool, the OxSMART, is reliable and clinically useful in managing patients with acute scaphoid injuries. Cite this article: Bone Jt Open 2022;3(11):913–920


Orthopaedic Proceedings
Vol. 105-B, Issue SUPP_12 | Pages 27 - 27
23 Jun 2023
Chen K Wu J Xu L Han X Chen X

To propose a modified approach to measuring the femoro-epiphyseal acetabular roof (FEAR) index while still abiding by its definition and biomechanical basis, and to compare the reliabilities of the two methods. To propose a classification for medial sourcil edges. We retrospectively reviewed a consecutive series of patients treated with periacetabular osteotomy and/or hip arthroscopy. A modified FEAR index was defined. Lateral center-edge angle (LCEA), Sharp's angle, and Tönnis angle were measured on all hips, as was the FEAR index with the original and modified approaches. Intra- and inter-observer reliability were calculated as intraclass correlation coefficients (ICC) for the FEAR index with both approaches and for the other alignment measurements. A classification was proposed to categorize medial sourcil edges. ICCs for the two approaches across the different sourcil groups were also calculated. After reviewing 411 patients, 49 were finally included. Thirty-two patients (40 hips) were identified as having borderline dysplasia, defined by an LCEA of 18 to 25 degrees. Intra-observer ICCs for the modified method were good to excellent for borderline hips, poor to excellent for DDH, and moderate to excellent for normal hips. As for inter-observer reliability, the modified approach outperformed the original approach, with moderate to good inter-observer reliability (DDH group, ICC=0.636; borderline dysplasia group, ICC=0.813; normal hip group, ICC=0.704). The medial sourcils were classified into three groups according to their morphology. Type II (39.0%) and III (43.9%) sourcils were the dominant patterns. The sourcil classification had substantial intra-observer agreement (observer 4, kappa=0.68; observer 1, kappa=0.799) and moderate inter-observer agreement (kappa=0.465). The modified approach to the FEAR index possessed greater inter-observer reliability in all medial sourcil patterns. The modified FEAR index has better intra- and inter-observer reliability compared with the original approach. Type II and III sourcils account for the majority, and only the modified approach is applicable to them.


Orthopaedic Proceedings
Vol. 106-B, Issue SUPP_11 | Pages 12 - 12
4 Jun 2024
Chapman J Choudhary Z Gupta S Airey G Mason L

Introduction. Treatment pathways of 5th metatarsal fractures are commonly directed based on fracture classification, with Jones types, for example, requiring closer observation and possibly more aggressive management. Primary objective. To investigate the reliability of assessment of subtypes of 5th metatarsal fractures by different observers. Methods. Patients were identified from our prospectively collected database. We included all patients referred to our virtual fracture clinic with a suspected or confirmed 5th metatarsal fracture. Plain AP radiographs were reviewed by two observers, who were initially trained on the 5th metatarsal classification identification. Zones were defined as Zone 1.1, 1.2, 1.3, 2, 3, diaphyseal shaft (DS), distal metaphysis (DM) and head. An inter-observer reliability analysis using Cohen's Kappa coefficient was carried out, and the degree of observer agreement was described using Landis & Koch's description. All data were analysed using IBM SPSS v.27. Results. 878 patients were identified. The two observers had moderate agreement when identifying fractures in all zones, apart from metatarsal head fractures, which scored substantial agreement (K=0.614). Zones 1.1 (K=0.582), 2 (K=0.536), 3 (K=0.601) and DS (K=0.544) all tended towards but did not achieve substantial agreement. Whilst DS fractures achieved moderate agreement, there was an apparent difficulty with distal DS, resulting in considerable crossover with DM (DS 210 vs 109; DM 76 vs 161). Slight agreement with the next highest adjacent zone was found when injuries were thought to be in zones 1.2, 1.3 and 2 (K=0.17, 0.115 and 0.152 respectively). Conclusions. Reliability of sub-categorising 5th metatarsal fractures using standardised instructions conveys moderate to substantial agreement in most cases. If the region of the fracture is going to be used in an algorithm to guide a management plan and clinical follow up during a virtual clinic review, defining fractures of zones 1–3 needs careful consideration.


Bone & Joint Open
Vol. 4, Issue 5 | Pages 363 - 369
22 May 2023
Amen J Perkins O Cadwgan J Cooke SJ Kafchitsas K Kokkinakis M

Aims. Reimers migration percentage (MP) is a key measure to inform decision-making around the management of hip displacement in cerebral palsy (CP). The aim of this study is to assess validity and inter- and intra-rater reliability of a novel method of measuring MP using a smartphone app (HipScreen (HS) app). Methods. A total of 20 pelvis radiographs (40 hips) were used to measure MP by using the HS app. Measurements were performed by five different members of the multidisciplinary team, with varying levels of expertise in MP measurement. The same measurements were repeated two weeks later. A senior orthopaedic surgeon measured the MP on picture archiving and communication system (PACS) as the gold standard and repeated the measurements using the HS app. Pearson’s correlation coefficient (r) was used to compare PACS measurements and all HS app measurements and assess validity. Intraclass correlation coefficient (ICC) was used to assess intra- and inter-rater reliability. Results. All HS app measurements (from 5 raters at week 0 and week 2 and PACS rater) showed highly significant correlation with the PACS measurements (p < 0.001). Pearson’s correlation coefficient (r) was consistently over 0.9, suggesting high validity. Correlation of all HS app measures from different raters to each other was significant with r > 0.874 and p < 0.001, which also confirms high validity. Both inter- and intra-rater reliability were excellent with ICC > 0.9. In a 95% confidence interval for repeated measurements, the deviation of each specific measurement was less than 4% MP for a single measurer and 5% for different measurers. Conclusion. The HS app provides a valid method to measure hip MP in CP, with excellent inter- and intra-rater reliability across different medical and allied health specialties. This can be used in hip surveillance programmes by interdisciplinary measurers. Cite this article: Bone Jt Open 2023;4(5):363–369


Orthopaedic Proceedings
Vol. 105-B, Issue SUPP_14 | Pages 3 - 3
10 Oct 2023
Verma S Malaviya S Barker S
Full Access

Technological advancements in orthopaedic surgery have mainly focused on increasing precision during the operation; however, there have been few developments in post-operative physiotherapy. We have developed a computer vision program using machine learning that can virtually measure the range of movement of a joint to track progress after surgery. This data can be used by physiotherapists to adjust patients’ exercise regimes more objectively and to help patients visualise the progress that they have made. In this study, we tested our program's reliability and validity to find a benchmark for future use on patients. We compared 150 shoulder joint angles measured using a goniometer with those calculated by our program, called ArmTracking, in a group of 10 participants (5 males and 5 females). Reliability was tested using adjusted R squared and validity was tested using 95% limits of agreement. Our clinically acceptable limit of agreement was ± 10° for ArmTracking to be used interchangeably with goniometry. ArmTracking showed excellent overall reliability of 97.1% when all shoulder movements were combined, but scores were lower for some movements, such as shoulder extension at 75.8%. Validity was moderate when all shoulder movements were combined, at 9.6° overestimation and 18.3° underestimation. Computer vision programs have great potential to be used in telerehabilitation to collect useful information as patients carry out prescribed exercises at home. However, they need to be trained well for precise joint detection to reduce the range of errors in readings.
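
The 95% limits of agreement mentioned in this abstract can be sketched as follows. This is a minimal Python illustration of the usual Bland-Altman calculation (mean difference ± 1.96 standard deviations of the differences), assuming that is the method the authors used; the function names and the example angles are hypothetical.

```python
import statistics


def limits_of_agreement(method_a, method_b, z=1.96):
    """Return (lower, bias, upper) 95% limits of agreement for paired readings."""
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need paired readings (at least 2 pairs)")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = statistics.mean(diffs)   # mean difference between the two methods
    sd = statistics.stdev(diffs)    # sample SD of the differences
    return bias - z * sd, bias, bias + z * sd


def within_acceptable(lower, upper, limit=10.0):
    """True if both limits fall inside the clinically acceptable band ±limit."""
    return lower >= -limit and upper <= limit


# Hypothetical shoulder angles in degrees; NOT data from the study.
program = [92.0, 118.5, 45.0, 160.0, 75.5]
goniometer = [90.0, 120.0, 50.0, 155.0, 80.0]
lower, bias, upper = limits_of_agreement(program, goniometer)
print(lower, bias, upper, within_acceptable(lower, upper))
```

With the figures reported in the abstract (9.6° overestimation, 18.3° underestimation), the lower limit falls outside the ± 10° band, which is consistent with the authors describing validity as only moderate.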


Orthopaedic Proceedings
Vol. 105-B, Issue SUPP_7 | Pages 8 - 8
4 Apr 2023
Fridberg M Ghaffari A Husum H Rahbek O Kold S
Full Access

There is no consensus on how to evaluate and grade pin site infection. A precise, objective and reliable pin site infection score is warranted. The literature was reviewed for pin site infection classification systems, and the Modified Gordon Score (MGS), graded 0-6, was used. The aim was to test the reliability of the MGS. The observed agreement and inter-rater reliability were investigated between a nurse and doctors. The MGS was applied in the outpatient clinic at Aalborg University Hospital, Denmark, on 1472 pin sites in 119 patients by one nurse and one of three orthopaedic surgeons, blinded to each other's judgement. The data were stored in a REDCap database for further statistical analysis. The agreement between the nurse and the three orthopaedic surgeons was evaluated with a one-way random-effects model, using the intraclass correlation coefficient (ICC) with absolute agreement. Furthermore, the observed agreement of each of the three surgeons with the nurse was calculated. The distribution of MGS infection grade in the 1472 pin sites was: Grade 0, n=1372; Grade 1, n=32; Grade 2, n=39; Grade 3, n=24; Grade 4, n=5; Grade 5, n=0; Grade 6, n=0. The observed agreement between the nurse and the surgeons was calculated as 98%. The ICC estimated between the nurse and the surgeons was 0.8943 (ICC > 0.85 = reliable). The grading was done by three different doctors, whose agreement with the nurse was as follows: Rater 1 (n=416), 99.5%; Rater 2 (n=1440), 97.4%; Rater 3 (n=1440), 96.6%. A limitation of this study is that the dataset represents mostly clean pin sites with MGS 0. Only 100 pin sites had signs of superficial infection (MGS 1-4), and none were graded above 4. We found that the MGS infection score is highly reliable for low-grade infections, but we cannot draw conclusions on its reliability in severe infections.
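
The two statistics reported in this abstract, observed agreement and a one-way random-effects ICC, can be sketched in Python as below. This is an illustrative single-measurement ICC computed from a one-way ANOVA decomposition, assuming that form matches the authors' model; the function names and the example grades are hypothetical.

```python
def observed_agreement(pairs):
    """Fraction of sites where both raters gave the same grade."""
    if not pairs:
        raise ValueError("no paired ratings given")
    return sum(1 for a, b in pairs if a == b) / len(pairs)


def icc_one_way(ratings):
    """One-way random-effects, single-measurement ICC.

    ratings: one row per site, each row holding the k grades given to it.
    """
    n = len(ratings)
    k = len(ratings[0])
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need >= 2 sites, each with the same k >= 2 ratings")
    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    site_means = [sum(row) / k for row in ratings]
    # Between-site and within-site mean squares.
    msb = k * sum((m - grand_mean) ** 2 for m in site_means) / (n - 1)
    msw = sum(
        (x - m) ** 2 for row, m in zip(ratings, site_means) for x in row
    ) / (n * (k - 1))
    denominator = msb + (k - 1) * msw
    if denominator == 0:
        raise ValueError("ICC is undefined when all ratings are identical")
    return (msb - msw) / denominator


# Hypothetical (nurse, surgeon) MGS grades for six pin sites; NOT study data.
grades = [(0, 0), (0, 0), (1, 1), (2, 1), (3, 3), (0, 0)]
print(observed_agreement(grades))
print(icc_one_way([list(pair) for pair in grades]))
```

Because most pin sites in the study were grade 0, observed agreement alone can look high even when raters disagree on the infected sites, which is one reason to report an ICC alongside it.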


The Bone & Joint Journal
Vol. 103-B, Issue 8 | Pages 1339 - 1344
1 Aug 2021
Jain S Mohrir G Townsend O Lamb JN Palan J Aderinto J Pandit H

Aims. The aim of this study was to assess the reliability and validity of the Unified Classification System (UCS) for postoperative periprosthetic femoral fractures (PFFs) around cemented polished taper-slip (PTS) stems. Methods. Radiographs of 71 patients with a PFF admitted consecutively at two centres between 25 February 2012 and 19 May 2020 were collated by an independent investigator. Six observers (three hip consultants and three trainees) were familiarized with the UCS. Each PFF was classified on two separate occasions, with a mean time between assessments of 22.7 days (16 to 29). Interobserver reliability for more than two observers was assessed using percentage agreement and Fleiss’ kappa statistic. Intraobserver reliability between two observers was calculated with Cohen’s kappa statistic. Validity was tested on surgically managed UCS type B PFFs where stem stability was documented in operation notes (n = 50). Validity was assessed using percentage agreement and Cohen’s kappa statistic between radiological assessment and intraoperative findings. Kappa statistics were interpreted using Landis and Koch criteria. All six observers were blinded to operation notes and postoperative radiographs. Results. Interobserver reliability percentage agreement was 58.5% and the overall kappa value was 0.442 (moderate agreement). Lowest kappa values were seen for type B fractures (0.095 to 0.360). The mean intraobserver reliability kappa value was 0.672 (0.447 to 0.867), indicating substantial agreement. Validity percentage agreement was 65.7% and the mean kappa value was 0.300 (0.160 to 0.440), indicating only fair agreement. Conclusion. This study demonstrates that the UCS is unsatisfactory for the classification of PFFs around PTS stems, and that it has considerably lower reliability and validity than previously described for other stem types. Radiological PTS stem loosening in the presence of PFF is poorly defined and formal intraoperative testing of stem stability is recommended. Cite this article: Bone Joint J 2021;103-B(8):1339–1344
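
To make the kappa values in this abstract easier to read, the following minimal Python sketch computes Cohen's kappa for two sets of classifications and maps a kappa value to the Landis and Koch bands. The function names and the example labels are hypothetical; the band boundaries are the commonly cited Landis and Koch cut-offs.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa: agreement between two raters corrected for chance."""
    n = len(ratings_a)
    if n == 0 or n != len(ratings_b):
        raise ValueError("need two non-empty rating lists of equal length")
    # Observed proportion of agreement.
    p_observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    p_expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if p_expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_observed - p_expected) / (1 - p_expected)


def landis_koch(kappa):
    """Map a kappa value to the Landis and Koch descriptive band."""
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical classifications of the same six fractures on two occasions.
first_read = ["B1", "B2", "B2", "C", "B1", "B2"]
second_read = ["B1", "B2", "B1", "C", "B1", "B2"]
kappa = cohen_kappa(first_read, second_read)
print(kappa, landis_koch(kappa))
```

Applied to the values reported above, `landis_koch` returns "moderate" for 0.442, "substantial" for 0.672 and "fair" for 0.300, matching the wording of the abstract.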


Orthopaedic Proceedings
Vol. 105-B, Issue SUPP_4 | Pages 3 - 3
3 Mar 2023
Roy K Joshi P Ali I Shenoy P Syed A Barlow D Malek I Joshi Y
Full Access

Classifying trochlear dysplasia (TD) is useful to determine the treatment options for patients suffering from patellofemoral instability (PFI). There is no consensus on which classification system is more reliable and reproducible for guiding clinicians in the treatment of PFI. There are also concerns about the validity of the Dejour classification (DJC), which is the most widely used classification for TD, having only a fair reliability score. The Oswestry-Bristol classification (OBC) is a recently proposed system of classification of TD, and the authors report a fair-to-good interobserver agreement and good-to-excellent intra-observer agreement in the assessment of TD. The aim of this study was to compare the reliability and reproducibility of these two classifications. Six assessors (four consultants and two registrars) independently evaluated 100 magnetic resonance axial images of the patellofemoral joint for TD and classified them according to OBC and DJC. These assessments were again repeated by all raters after four weeks. The inter- and intra-observer reliability scores were calculated using Cohen's kappa and Cronbach's alpha. Both classifications showed good to excellent interobserver reliability with high alpha scores. The OBC classification showed a substantial intra-observer agreement (mean kappa 0.628; p < 0.005) whereas the DJC showed a moderate agreement (mean kappa 0.572; p < 0.005). There was no significant difference in the kappa values when comparing the assessments by consultants with those by registrars, in either classification system. This large study from a non-founding institute shows both classification systems to be reliable for classifying TD based on magnetic resonance axial images of the patellofemoral joint, with the simple-to-use OBC having a higher intra-observer reliability score than the DJC.


Bone & Joint Open
Vol. 5, Issue 6 | Pages 524 - 531
24 Jun 2024
Woldeyesus TA Gjertsen J Dalen I Meling T Behzadi M Harboe K Djuv A

Aims. To investigate if preoperative CT improves detection of unstable trochanteric hip fractures. Methods. A single-centre prospective study was conducted. Patients aged 65 years or older with trochanteric hip fractures admitted to Stavanger University Hospital (Stavanger, Norway) were consecutively included from September 2020 to January 2022. Radiographs and CT images of the fractures were obtained, and surgeons made individual assessments of the fractures based on these. The assessment was conducted according to a systematic protocol including three classification systems (AO/Orthopaedic Trauma Association (OTA), Evans Jensen (EVJ), and Nakano) and questions addressing specific fracture patterns. An expert group provided a gold-standard assessment based on the CT images. Sensitivities and specificities of surgeons’ assessments were estimated and compared in regression models with correlations for the same patients. Intra- and inter-rater reliability were presented as Cohen’s kappa and Gwet’s agreement coefficient (AC1). Results. We included 120 fractures in 119 patients. Compared to radiographs, CT increased the sensitivity of detecting unstable trochanteric fractures from 63% to 70% (p = 0.028) and from 70% to 76% (p = 0.004) using AO/OTA and EVJ, respectively. Compared to radiographs alone, CT increased the sensitivity of detecting a large posterolateral trochanter major fragment or a comminuted trochanter major fragment from 63% to 76% (p = 0.002) and from 38% to 55% (p < 0.001), respectively. CT improved intra-rater reliability for stability assessment using EVJ (AC1 0.68 to 0.78; p = 0.049) and for detecting a large posterolateral trochanter major fragment (AC1 0.42 to 0.57; p = 0.031). Conclusion. A preoperative CT of trochanteric fractures increased detection of unstable fractures using the AO/OTA and EVJ classification systems. Compared to radiographs, CT improved intra-rater reliability when assessing fracture stability and detecting large posterolateral trochanter major fragments. Cite this article: Bone Jt Open 2024;5(6):524–531
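
The sensitivity and specificity figures in this abstract compare surgeons' assessments with the expert gold standard. The minimal Python sketch below shows how the two proportions are obtained from paired binary calls; the function name and example data are hypothetical, and the sketch ignores the within-patient correlation that the study's regression models account for.

```python
def sensitivity_specificity(assessed, gold):
    """Return (sensitivity, specificity) for binary 'unstable' calls.

    assessed: surgeon's calls (True = unstable), one per fracture
    gold:     gold-standard calls for the same fractures
    """
    if len(assessed) != len(gold):
        raise ValueError("assessed and gold must have the same length")
    tp = sum(1 for a, g in zip(assessed, gold) if a and g)
    fn = sum(1 for a, g in zip(assessed, gold) if not a and g)
    tn = sum(1 for a, g in zip(assessed, gold) if not a and not g)
    fp = sum(1 for a, g in zip(assessed, gold) if a and not g)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("gold standard must contain both classes")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical calls for eight fractures; NOT data from the study.
surgeon = [True, True, False, True, False, False, True, False]
experts = [True, True, True, True, False, False, False, False]
print(sensitivity_specificity(surgeon, experts))
```

Here sensitivity is the share of truly unstable fractures that the surgeon flagged as unstable, which is the quantity the abstract reports rising from 63% to 70% under AO/OTA when CT is added.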