Abstract
Introduction: There are a number of classification systems for intertrochanteric fractures of the proximal femur, but none has been universally accepted. For a classification to be successful, it should have excellent reliability and reproducibility among all reviewers in the interpretation of the radiographs. Although the Tronzo classification system is used for intertrochanteric fractures, its reliability has not yet been tested.
Aims: The purpose of this paper is to present the interobserver and intraobserver reliability of the Tronzo classification for intertrochanteric fractures of the femur.
Methods: The radiographs of 50 patients with intertrochanteric fractures were classified by seven observers according to Tronzo’s classification. Three observers were consultant orthopaedic surgeons with a minimum of 12 years’ orthopaedic experience and four were orthopaedic residents. All observers worked independently. The observers repeated the classification three weeks later without reference to their previous assessments. Intra- and inter-observer agreement was evaluated using Cohen’s weighted kappa (κ) coefficient, as calculated by the Stata computer package.
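As an illustration of the statistic named in the Methods, the following is a minimal sketch of a weighted kappa for one pair of observers. It is not the Stata procedure used in the study: the linear weighting scheme, the function name `weighted_kappa`, and the example gradings are all assumptions made for illustration, and the abstract does not state which weights were used or how the seven observers' ratings were combined.

```python
from collections import Counter


def weighted_kappa(ratings_a, ratings_b, categories):
    """Linearly weighted Cohen's kappa for two raters.

    `categories` lists the ordered grades (for example Tronzo types 1 to 5).
    Linear weights are an assumption; the abstract does not specify them.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must grade the same non-empty set of cases")
    k = len(categories)
    if k < 2:
        raise ValueError("at least two categories are required")

    index = {c: i for i, c in enumerate(categories)}
    n = len(ratings_a)

    # Counts of each (grade by A, grade by B) pair and of each rater's grades.
    joint = Counter((index[a], index[b]) for a, b in zip(ratings_a, ratings_b))
    marg_a = Counter(index[a] for a in ratings_a)
    marg_b = Counter(index[b] for b in ratings_b)

    def weight(i, j):
        # Full credit for identical grades, partial credit for near misses.
        return 1.0 - abs(i - j) / (k - 1)

    # Weighted observed agreement.
    p_obs = sum(weight(i, j) * count / n for (i, j), count in joint.items())
    # Weighted agreement expected by chance from the marginal frequencies.
    p_exp = sum(
        weight(i, j) * marg_a[i] * marg_b[j] / (n * n)
        for i in range(k)
        for j in range(k)
    )
    if p_exp == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_obs - p_exp) / (1.0 - p_exp)


# Hypothetical gradings of ten radiographs by two observers (not study data).
observer_1 = [1, 2, 3, 3, 4, 5, 2, 3, 4, 1]
observer_2 = [1, 3, 3, 2, 4, 4, 2, 3, 5, 2]
kappa = weighted_kappa(observer_1, observer_2, categories=[1, 2, 3, 4, 5])
print(f"weighted kappa = {kappa:.2f}")
```

With weights of this kind, two observers who differ by one Tronzo type are penalised less than two who differ by several types, which is the usual reason for preferring a weighted kappa on an ordinal scale.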
Results: At time 1, the inter-observer kappa was 0.19 (95% CI 0.05 to 0.43) and at time 2 it was 0.20 (95% CI 0.06 to 0.44); jointly, the kappa estimate was 0.20 (95% CI 0.09 to 0.36).
For intra-observer reliability, the kappa was slightly higher, as one would expect, although it was still only 0.41 (95% CI 0.25 to 0.55).
Overall, the inter-observer reliability is slight (and at best, fair) and the intra-observer reliability is moderate. For clinical use a kappa of 0.8 is strongly recommended and clearly this was not achieved.
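The verbal labels above ("slight", "fair", "moderate") match the widely used Landis and Koch bands for kappa. The abstract does not name the scale it used, so the mapping below is an assumption, and the function name `interpret_kappa` is hypothetical; it is a small sketch of how the reported estimates and confidence limits translate into those labels.

```python
def interpret_kappa(kappa):
    """Map a kappa value to a Landis and Koch agreement label (assumed scale)."""
    if kappa > 1.0:
        raise ValueError("kappa cannot exceed 1")
    if kappa < 0.0:
        return "poor"
    # Upper bound of each band, in ascending order.
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label


# Values reported in the Results: pooled inter-observer estimate, its upper
# confidence limit, and the intra-observer estimate.
for value in (0.20, 0.36, 0.41):
    print(value, interpret_kappa(value))
```

On this scale the pooled inter-observer estimate falls in the "slight" band, its upper confidence limit in the "fair" band, and the intra-observer estimate in the "moderate" band.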
Discussion: Tronzo’s classification is simple, easy to use and predictive of the method of reduction, unlike the AO/ASIF classification, which is more complicated, with several groups and subgroups. However, our study shows that its interobserver reliability is poor. This suggests that comparison of results between studies using the Tronzo classification is not reliable enough to be of use. It should be stressed that reliability studies are not a measure of the accuracy of the classification. There is no right or wrong response in grading each radiograph. The analysis purely measures the reproducibility of the response between several observers.
Intraobserver reliability was moderate in our series, which suggests that individuals could use the Tronzo classification to document their results over a period of time to monitor long-term outcomes and to compare treatment modalities in the same studies.
These abstracts were prepared by Professor Roger Lemaire. Correspondence should be addressed to EFORT Central Office, Freihofstrasse 22, CH-8700 Küsnacht, Switzerland.