Abstract
Over 8000 total hip arthroplasties (THA) in the UK were revised in 2019, half for aseptic loosening. Artificial Intelligence (AI) may be able to identify or predict failing THA, allowing earlier recognition of poorly performing implants and reducing patient suffering.
The aim of this study is to investigate whether AI-based machine learning (ML) and deep learning (DL) techniques can be used to train an algorithm to identify and/or predict failing uncemented THA.
Consent was sought from patients followed up in a surveillance study of a single-design uncemented THA implant (2010–2021). Oxford hip scores and radiographs were collected at yearly intervals. Radiographs were analysed by three observers for the presence of markers of implant loosening or failure: periprosthetic lucency, cortical hypertrophy, and pedestal formation.
A DL model (RGB ResNet 18), with images entered chronologically, was trained according to revision status and radiographic features. Data augmentation and cross validation were used to increase the available training data, reduce bias, and improve verification of results.
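The cross-validation step can be illustrated with a minimal sketch. The abstract does not say how the folds were built, so grouping by patient (all radiographs from one patient stay in the same fold) is an assumption here, as are the function name `patient_kfold`, the fold count, and the toy patient identifiers. The ResNet 18 model and the augmentation are not covered.

```python
import random
from collections import defaultdict


def patient_kfold(patient_ids, k=5, seed=0):
    """Yield (train_idx, test_idx) index lists for k folds.

    Assumption (not stated in the source): folds are split by patient,
    so no patient contributes radiographs to both train and test sets.
    """
    by_patient = defaultdict(list)
    for idx, pid in enumerate(patient_ids):
        by_patient[pid].append(idx)

    patients = sorted(by_patient)
    random.Random(seed).shuffle(patients)

    # Deal the shuffled patients into k roughly equal folds.
    folds = [patients[i::k] for i in range(k)]
    for held_out in folds:
        held = set(held_out)
        test_idx = [i for p in held_out for i in by_patient[p]]
        train_idx = [i for p in patients if p not in held for i in by_patient[p]]
        yield train_idx, test_idx


# Hypothetical toy data: 6 patients with 2 radiographs each.
patient_ids = [pid for pid in "ABCDEF" for _ in range(2)]
splits = list(patient_kfold(patient_ids, k=3))
```

Under this scheme, any augmentation would be applied to the training indices only, so that each test fold contains unmodified radiographs from patients the model has not seen.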
A total of 184 patients consented to inclusion. Six (3.2%) patients were revised for aseptic loosening. In total, 2097 radiographs were analysed; 21 (11.4%) patients had all three radiographic features of failure.
Data from 166 patients were used to test the ML algorithm in three scenarios for detecting patients who were revised. 1) Using revision as the end point was associated with increased variability in accuracy, with an area under the curve (AUC) of 23–97%. 2) Using two of the three radiographic features associated with failure gave improved results, with an AUC of 75–100%. 3) Using all three radiographic features gave less variability and a lower AUC of 73%, but identified 5 of the 6 patients who had been revised (66 identified in total).
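For readers unfamiliar with the AUC values quoted for each scenario, the following is a minimal sketch of how an AUC can be computed from binary labels and model scores. It uses the rank-based definition (the probability that a randomly chosen positive case scores higher than a randomly chosen negative one). The function name `roc_auc` and the toy labels and scores are illustrative assumptions, not study data or the authors' code.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy example: 2 positive and 4 negative cases.
labels = [1, 1, 0, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.3, 0.2, 0.1]
auc = roc_auc(labels, scores)  # 7 of 8 pairs ranked correctly
```

When a test set contains very few positive cases, re-ranking a single case changes the AUC by a large step. This is one possible contributor to wide AUC ranges, although the abstract does not state the cause.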
The best-performing algorithm identified the greatest number of revised hips (5/6) and predicted failure 2–8 years before revision, before all radiographic features were visible and before a significant fall in the Oxford hip score (true-positive rate: 0.77, false-positive rate: 0.29).
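The true-positive and false-positive figures are read here as rates, and can be sketched as follows. The function name `tpr_fpr` and the toy labels and predictions are illustrative assumptions. They are not the study's data and do not reproduce the reported 0.77 and 0.29.

```python
def tpr_fpr(labels, predictions):
    """Return (true-positive rate, false-positive rate) for binary 0/1 inputs."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("need at least one positive and one negative label")
    # TPR = flagged failures / all failures; FPR = false alarms / all non-failures.
    return tp / (tp + fn), fp / (fp + tn)


# Hypothetical toy example: 4 failing and 6 non-failing implants.
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predictions = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]
tpr, fpr = tpr_fpr(labels, predictions)
```

In this toy case three of the four failing implants are flagged and two of the six non-failing implants are false alarms.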
ML algorithms can identify failing THA before features are visible on radiographs and before patient-reported outcome measure (PROM) scores deteriorate. This important finding could allow failing THA to be recognised early.