To assess the reliability of the pre-operative radiological measurements used in the management of hallux valgus deformity, five observers assessed 50 pre-operative standing foot radiographs on two occasions, recording the inter-metatarsal angle (IMA), the hallux valgus angle (HVA) and joint congruency. Five published methods of angle measurement were used (Hawkins; Venning and Hardy; Mitchell; Miller; Nestor). Kappa statistics were used to assess the reliability of the diagnosis of congruency. For IMA and HVA, mean values were compared between methods by one-way ANOVA, and differences between methods and observers were assessed by two-way ANOVA.
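As an illustration of the kappa statistic mentioned above, the following is a minimal sketch of Cohen's kappa for two sets of congruency ratings of the same radiographs (for example, one observer on two occasions). The abstract does not state which kappa variant was used, so the choice of Cohen's kappa, the function name `cohen_kappa` and the ratings below are assumptions made purely for illustration; none of the data come from the study.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two equal-length sequences of categorical ratings."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed agreement: proportion of radiographs rated identically.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each rating set's category frequencies.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both sets use a single identical category throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical ratings: 1 = congruent joint, 0 = incongruent joint.
first_occasion = [1, 1, 0, 0, 1, 0, 1, 1, 0, 1]
second_occasion = [1, 1, 0, 1, 1, 0, 1, 0, 0, 1]
kappa = cohen_kappa(first_occasion, second_occasion)
print(f"kappa = {kappa:.3f}")
```

Kappa corrects the raw proportion of agreement for the agreement expected by chance, which is why it can be much lower than the raw proportion when one category dominates.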
The mean IMA and HVA measurements varied significantly between methods on both occasions (p<0.0001). Mitchell’s method had the lowest and Miller’s the highest mean values. Analysis of variance showed that both method and observer variations were significant for IMA, whereas HVA measurements differed significantly only between observers.
Five different methods of measuring the hallux valgus angle (HVA) and the intermetatarsal angle (IMA), together with the diagnosis of congruency of the first MTP joint, were studied on 50 pre-operative standing foot radiographs to test whether these methods were reliable and the results reproducible enough to be used in a treatment algorithm for hallux valgus. Analysis of variance (ANOVA) was used to examine the differences between the five methods and between the five observers. The kappa statistic was used to measure agreement in diagnosing congruency between the two occasions. The mean IMA and HVA varied significantly (p<0.00001). The ANOVA model showed that method and observer variations were both significant for IMA; there was no significant difference between measurement methods for HVA. Congruency had good (k=0.608) intraobserver and fair (k=0.261) interobserver reliability. A second IMA measurement will lie between 4.2° less and 4.6° more than the first IMA measurement 95% of the time. A second HVA measurement will lie between 6° less and 5.6° more than the first HVA measurement 95% of the time. Overall, no measurement method had an advantage over the others, although some observers were better than others. All methods had considerable inter- and intra-observer variability, which makes these measurements unreliable.
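Intervals of the form "a second measurement will lie between x° less and y° more than the first 95% of the time" are commonly obtained as 95% limits of agreement: the mean of the paired differences plus or minus 1.96 standard deviations. The abstract does not say how its intervals were computed, so the sketch below is an assumption about the approach, and the function name `limits_of_agreement` and the angle values are hypothetical.

```python
from statistics import mean, stdev


def limits_of_agreement(first, second, z=1.96):
    """Return (lower, upper) 95% limits for second-minus-first differences."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [s - f for f, s in zip(first, second)]
    bias = mean(diffs)           # systematic shift between the two occasions
    spread = z * stdev(diffs)    # z times the sample SD of the differences
    return bias - spread, bias + spread


# Hypothetical repeated angle measurements in degrees (not study data).
first_reading = [10.0, 12.0, 14.0, 11.0, 13.0]
second_reading = [11.0, 11.0, 15.0, 12.0, 12.0]
lower, upper = limits_of_agreement(first_reading, second_reading)
print(f"second reading lies between {lower:.2f} and {upper:.2f} degrees of the first")
```

The width of this interval, rather than a correlation coefficient, indicates whether repeated measurements agree closely enough for a given clinical threshold.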