Abstract
Introduction
Shoulder arthroplasty (SA) has been performed with different types of implants, each requiring a different replacement system. However, data on the previously implanted type are not always available before revision surgery, even though this information is paramount for determining the appropriate equipment and procedure. Therefore, this meta-analysis aimed to evaluate the accuracy of artificial intelligence (AI) models in classifying SA implant types.
Methods
A systematic search was conducted in PubMed, Embase, Scopus, and Web of Science from inception to December 2023, in accordance with PRISMA guidelines. Peer-reviewed studies evaluating the accuracy of AI-based tools for recognizing and categorizing SA implants on upper-limb X-rays were included. In addition to the overall meta-analysis, subgroup analyses were performed according to the type of AI model applied (convolutional neural network (CNN), non-CNN, or a combination of both) and according to whether studies used the same dataset.
Results
Thirteen articles were eligible for inclusion in this meta-analysis, comprising 138 different tests of model performance. The overall sensitivity and specificity for classifying SA implants were 0.891 (95% CI: 0.866-0.912) and 0.549 (95% CI: 0.532-0.566), respectively. The subgroup analyses by model type yielded the following results: CNN subgroup, a sensitivity of 0.898 (95% CI: 0.873-0.919) and a specificity of 0.554 (95% CI: 0.537-0.570); non-CNN subgroup, a sensitivity of 0.809 (95% CI: 0.665-0.900) and a specificity of 0.522 (95% CI: 0.440-0.603); combined subgroup, a sensitivity of 0.891 (95% CI: 0.752-0.957) and a specificity of 0.547 (95% CI: 0.463-0.629).
Studies using the same dataset showed an overall sensitivity and specificity of 0.881 (95% CI: 0.856-0.903) and 0.542 (95% CI: 0.53-0.554), respectively. Studies that used other datasets showed an overall sensitivity and specificity of 0.995 (95% CI: 0.969-0.999) and 0.678 (95% CI: 0.234-0.936), respectively.
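For readers less familiar with these metrics, the following minimal Python sketch illustrates how sensitivity, specificity, and a 95% confidence interval can be computed for a single test from a 2x2 table. It is an illustration only: the counts are hypothetical, the function names are our own, and the Wilson score interval used here is an assumption. It does not reproduce the pooling model used in this meta-analysis.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def sensitivity_specificity(tp, fn, tn, fp):
    """Return (estimate, (lower, upper)) for sensitivity and specificity."""
    sens = tp / (tp + fn)  # proportion of true implant types correctly identified
    spec = tn / (tn + fp)  # proportion of other types correctly ruled out
    return {
        "sensitivity": (sens, wilson_interval(tp, tp + fn)),
        "specificity": (spec, wilson_interval(tn, tn + fp)),
    }


# Hypothetical counts for one test of one model (not data from any included study).
result = sensitivity_specificity(tp=90, fn=10, tn=55, fp=45)
for name, (estimate, (lower, upper)) in result.items():
    print(f"{name}: {estimate:.3f} (95% CI: {lower:.3f}-{upper:.3f})")
```

Pooled estimates across many tests, such as those reported above, generally require a meta-analytic model rather than a simple per-test calculation of this kind.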
Conclusion
AI-based classification of shoulder implant types can be considered a sensitive method. Our findings suggest that CNN-based models and the use of different datasets may enhance accuracy, which should be investigated in future studies.