Researchers and clinicians measuring outcomes following total ankle replacement (TAR) are challenged by the wide range of outcome measures used in the literature, with no consensus as to which are valid, reliable, and responsive in this population. This review identifies region- or joint-specific outcome measures used for evaluating TAR outcomes and synthesises the evidence for their measurement properties. A standard search strategy was applied to the electronic databases MEDLINE, EMBASE and CINAHL (to June 2015) to identify foot/ankle measures in use. A best evidence synthesis approach, based on the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN), was taken to critically appraise the measurement properties of the identified measures. The review was restricted to English-language publications and excluded cross-cultural adaptations. Measurement properties collected from each article were coded as validity, reliability, responsiveness, or interpretability. Clinimetric evidence exists for the identified measures tested in non-TAR populations, but this evidence was not the focus of this review. The search identified 14 studies to include in the best evidence synthesis, with 32 articles providing clinimetric evidence for eight of the measures (one clinician-based outcome [CBO], seven patient-reported outcomes [PRO]). However, only five measures were tested in a TAR population: the Foot Function Index (FFI), Ankle Osteoarthritis Scale (AOS), American Orthopaedic Foot and Ankle Society Ankle-Hindfoot Scale (AOFAS), Foot and Ankle Outcome Score (FAOS), and Self-Reported Foot and Ankle Score (SEFAS).
Five studies provided clinimetric evidence in a TAR population, and their methodological quality was assessed: (1) Validity: two good-quality studies examining different measures provide moderate evidence supporting construct validity (FFI, AOS, AOFAS self-reported items; SEFAS); (2) Reliability: two good-quality studies examining different measures provide moderate evidence supporting internal consistency and test-retest reliability (FFI, AOS, AOFAS self-reported items; FAOS, SEFAS); (3) Responsiveness: three poor-quality studies, so the evidence for responsiveness is unknown; (4) Interpretability: two studies provide interpretability values (AOS, FFI, AOFAS self-reported items; AOS). This review offers a basis for choosing the most appropriate instrument for evaluating TAR outcomes. Numerous outcome measures were identified with evidence supporting their use in populations with various foot/ankle conditions, but none has strong evidence supporting its use in a TAR population. Measures must have adequate clinimetric properties in all patient groups in which they are applied. Evidence supporting or critiquing an instrument should not be based on studies with poor-quality methodology, such as those identified by this review. The identified measures would benefit from further testing in a TAR population, with emphasis on adequate sample sizes, testing of a priori hypotheses, and evaluation of their content validity for a TAR population.