Three radiological methods are commonly used to assess the outcome of total hip replacement (THR). They aim to record the appearance of lucent areas and migration of the prosthesis in a reproducible manner. Two were designed to monitor the implant over time and one to grade the quality of cementing. We measured inter- and intraobserver agreement for all three systems. We randomised 30 patients to receive either finger packing or retrograde gun cementing during Charnley hip replacement. The postoperative departmental radiographs were evaluated in a blinded study by two orthopaedic trainees, two consultants and two experts in THR. The trainees and consultants repeated the exercise at least two weeks later. We used the unweighted kappa statistic to establish the levels of agreement. In general, intraobserver agreement was moderate but interobserver agreement was poor, with levels similar to or less than those expected by chance. Our results indicate that such systems cannot provide reliable data when radiographs are evaluated in centres in different parts of the world, by surgeons of differing seniority, at differing time intervals. We discuss the problem and suggest some methods of improvement.
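For readers unfamiliar with the unweighted kappa statistic, it compares the observed proportion of agreement between two observers, p_o, with the proportion expected by chance from their marginal grade frequencies, p_e, as kappa = (p_o - p_e) / (1 - p_e). A value of 1 indicates perfect agreement, 0 indicates agreement no better than chance, and negative values indicate agreement below chance. The following is a minimal sketch of that calculation for two observers; the function name `cohen_kappa` and the grades A to D are hypothetical illustrations and are not the study's data or grading systems.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two observers grading the same items."""
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)

    # Observed agreement: fraction of items given the same grade by both.
    p_o = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement: for each grade, the product of the two observers'
    # marginal proportions, summed over all grades.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    p_e = sum(count_a[g] * count_b[g] for g in count_a) / (n * n)

    if p_e == 1.0:
        # Both observers used one identical grade throughout; kappa is undefined.
        return float("nan")
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical grades for ten radiographs (illustrative only).
observer_1 = ["A", "B", "B", "C", "A", "D", "B", "C", "A", "B"]
observer_2 = ["A", "B", "C", "C", "B", "D", "B", "A", "A", "C"]
kappa = cohen_kappa(observer_1, observer_2)
print(f"kappa = {kappa:.2f}")
```

The same function can be applied to one observer's first and second readings to obtain an intraobserver value. Because the statistic is unweighted, a disagreement of one grade counts the same as a disagreement of several grades.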