Purpose of study and background
Clinical researchers use the Pfirrmann classification to grade intervertebral disc degeneration radiologically. Basic researchers, who have access to gross disc morphology, instead use the Thompson score. The aim of this study was to assess the inter-observer reliability of both classifications and the correlation between them.

Methods and Results
We obtained T2-weighted MR images of 80 human lumbar intervertebral discs in various stages of degeneration to assess the Pfirrmann score. The discs were then dissected midsagittally to obtain the Thompson score. The observers were typical users of both grading systems: a spine surgeon, a radiology resident, an orthopaedic resident, and a basic scientist, all experts on intervertebral disc degeneration. Cohen's kappa (CK) was used to determine inter-observer reliability, and the intra-class correlation (ICC) was used as a measure of the variation between the outcomes. For the Thompson score, the average CK was 0.366 and the ICC was 0.873. The average inter-observer reliability for the Pfirrmann score was 0.214 (CK) and 0.790 (ICC). Comparing the grading systems, the intra-observer agreement was 0.240 (CK) and 0.685 (ICC).
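
To illustrate how a Cohen's kappa value of the kind reported above is obtained, the following minimal sketch computes unweighted kappa for two observers and averages it over all observer pairs. It rests on assumptions not stated in the abstract: that the unweighted form of kappa was used, and that the "average CK" is a mean over observer pairs. The function names and the grades are hypothetical and are not study data.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(grades_a, grades_b):
    """Unweighted Cohen's kappa for two observers grading the same discs."""
    if len(grades_a) != len(grades_b) or not grades_a:
        raise ValueError("both observers must grade the same non-empty set")
    n = len(grades_a)

    # Observed agreement: fraction of discs given the identical grade.
    p_observed = sum(a == b for a, b in zip(grades_a, grades_b)) / n

    # Chance agreement: product of the two observers' marginal frequencies,
    # summed over all grades that either observer used.
    count_a, count_b = Counter(grades_a), Counter(grades_b)
    p_chance = sum(
        (count_a[g] / n) * (count_b[g] / n) for g in set(count_a) | set(count_b)
    )

    if p_chance == 1.0:
        # Both observers used one and the same grade throughout:
        # kappa is undefined in this case.
        return float("nan")
    return (p_observed - p_chance) / (1.0 - p_chance)


def mean_pairwise_kappa(grades_by_observer):
    """Mean kappa over all observer pairs (assumed averaging scheme)."""
    pairs = list(combinations(grades_by_observer.values(), 2))
    return sum(cohen_kappa(a, b) for a, b in pairs) / len(pairs)


# Hypothetical grades (1-5) for ten discs; NOT data from the study.
observer_1 = [1, 2, 3, 4, 5, 1, 2, 3, 4, 5]
observer_2 = [1, 2, 3, 4, 5, 2, 3, 4, 5, 4]

print(round(cohen_kappa(observer_1, observer_2), 3))  # 0.375
```

In this hypothetical example every disagreement is off by exactly one grade, yet unweighted kappa counts each as a full disagreement. This is one reason a kappa value can be low while a correlation-type measure such as the ICC, which rewards near-misses on an ordinal scale, remains high.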