Despite the large number of published artificial intelligence (AI) algorithms that target trauma and orthopaedic applications, very few progress to inform clinical practice. One key reason is the lack of a clear pathway from development to deployment. To assist with this process, we have developed the Clinical Practice Integration of Artificial Intelligence (CPI-AI) framework, a five-stage approach to the clinical practice adoption of AI in the setting of trauma and orthopaedics, based on the IDEAL principles.
Aims
To examine whether natural language processing (NLP) using a clinically based large language model (LLM) could be used to predict patient selection for total hip or total knee arthroplasty (THA/TKA) from routinely available free-text radiology reports.
Methods
Data pre-processing and analyses were conducted according to the Artificial intelligence to Revolutionize the patient Care pathway in Hip and knEe aRthroplastY (ARCHERY) project protocol. This included use of de-identified Scottish regional clinical data of patients referred for consideration of THA/TKA, held in a secure data environment designed for artificial intelligence (AI) inference. Only preoperative radiology reports were included. NLP algorithms were based on the freely available GatorTron model, an LLM trained on over 82 billion words of de-identified clinical text. Two inference tasks were performed: assessment after model fine-tuning (50 epochs and three cycles of k-fold cross-validation), and external validation.
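The "three cycles of k-fold cross-validation" mentioned in the abstract above can be pictured as repeated k-fold splitting. The following is a minimal sketch only, not the ARCHERY implementation: the function name `repeated_k_fold`, the choice of `k=5`, the random seed and the toy cohort size are all assumptions made for illustration.

```python
import random


def repeated_k_fold(n_samples, k=5, repeats=3, seed=0):
    """Yield (repeat, fold, train_idx, val_idx) for repeated k-fold CV.

    Hypothetical helper: k, repeats and seed are illustrative defaults,
    not values confirmed by the study protocol.
    """
    rng = random.Random(seed)
    for r in range(repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)  # a fresh shuffle for every cycle
        # Deal the shuffled indices into k roughly equal folds.
        folds = [indices[i::k] for i in range(k)]
        for f in range(k):
            val_idx = folds[f]
            train_idx = [i for j, fold in enumerate(folds) if j != f for i in fold]
            yield r, f, train_idx, val_idx


# Toy cohort of 20 documents: 3 cycles x 5 folds = 15 train/validation splits.
splits = list(repeated_k_fold(20, k=5, repeats=3))
print(len(splits))
```

Each document appears in exactly one validation fold per cycle, so every cycle gives a complete, non-overlapping estimate of held-out performance. With imbalanced outcomes such as surgical selection, a stratified split that preserves the class ratio in each fold would usually be preferable; that refinement is omitted here for brevity.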
To examine whether Natural Language Processing (NLP) using a state-of-the-art clinically based Large Language Model (LLM) could predict patient selection for Total Hip Arthroplasty (THA), across a range of routinely available clinical text sources. Data pre-processing and analyses were conducted according to the Ai to Revolutionise the patient Care pathway in Hip and Knee arthroplasty (ARCHERY) project protocol. There were 3911, 1621 and 1503 patient text documents included from referral letters, radiology reports and clinic letters respectively. All letter sources displayed significant class imbalance, with only 15.8%, 24.9%, and 5.9% of patients linked to the respective text source having undergone surgery. Untrained model performance was poor, with F1 scores (harmonic mean of precision and recall) of 0.02, 0.38 and 0.09 respectively. This did, however, improve with model training, with mean scores (range) of 0.39 (0.31–0.47), 0.57 (0.48–0.63) and 0.32 (0.28–0.39) across the 5 folds of cross-validation. Performance deteriorated on external validation across all three groups but remained highest for the radiology report cohort. Even with further training on a large cohort of routinely collected free-text data, a clinical LLM fails to adequately perform clinical inference in NLP tasks regarding identification of those selected to undergo THA. This likely relates to the complexity and heterogeneity of free-text information and the way that patients are determined to be surgical candidates.
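To illustrate why the F1 score is reported for such imbalanced cohorts, the sketch below computes it from first principles on a hypothetical toy cohort. The data, the function name `f1_score` and the 16% surgery rate are invented for illustration and are not taken from the study.

```python
def f1_score(y_true, y_pred):
    """F1 (harmonic mean of precision and recall) for the positive class.

    Here 1 = underwent surgery, 0 = did not. Returns 0.0 when there are
    no true positives, a common convention.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical cohort: 100 patients, 16 of whom underwent surgery.
y_true = [1] * 16 + [0] * 84

# A model that never predicts surgery looks accurate but has no clinical use.
always_negative = [0] * 100
accuracy = sum(t == p for t, p in zip(y_true, always_negative)) / len(y_true)
print(accuracy, f1_score(y_true, always_negative))

# A model that finds 8 of the 16 surgical patients with 8 false positives.
partial = [1] * 8 + [0] * 8 + [1] * 8 + [0] * 76
print(f1_score(y_true, partial))
```

The always-negative model is right for every non-surgical patient yet identifies no surgical candidates, so its F1 score collapses to zero despite high accuracy. This is why accuracy alone would be misleading for the class distributions described above.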
Aims
The extended wait that most patients are now experiencing for hip and knee arthroplasty has raised questions about whether reliance on waiting time as the primary driver for prioritization is ethical, and whether additional factors should be included in determining surgical priority. Our Prioritization of THose aWaiting hip and knee ArthroplastY (PATHWAY) project will explore which perioperative factors are important to consider when prioritizing those on the waiting list for hip and knee arthroplasty, and how these factors should be weighted. The final product will include a weighted benefit score that can be used to aid in surgical prioritization for those awaiting elective primary hip and knee arthroplasty.
Methods
There will be two linked work packages focusing on opinion from key stakeholders (patients and surgeons). First, an online modified Delphi process will determine a consensus set of factors that should be involved in patient prioritization. This will be performed using standard Delphi methodology, consisting of multiple rounds in which initial individual rating is followed by feedback, discussion, and further recommendations towards eventual consensus. The second stage will then consist of a Discrete Choice Experiment (DCE) to allow for priority setting of the factors derived from the Delphi through elicitation of weighted benefit scores. The DCE consists of several choice tasks designed to elicit stakeholder preference regarding included attributes (factors).
There is increasing popularity in the use of artificial intelligence and machine-learning techniques to provide diagnostic and prognostic models for various aspects of Trauma & Orthopaedic surgery. However, correct interpretation of these models is difficult for those without specific knowledge of computing or health data science methodology. Lack of current reporting standards leads to the potential for significant heterogeneity in the design and quality of published studies. We provide an overview of machine-learning techniques for the lay individual, including key terminology and best practice reporting guidelines.
Patients' perspective and experience are heavily modulated by their understanding of their pre-operative disability, along with their overall coping strategy and life philosophy. Given that evidence-based practice relies increasingly on patient-reported outcomes, the orthopaedic community must be diligent in differentiating patients who may have the same objective outcome but vary widely on a patient-reported subjective basis. In clinical practice, patient selection is often a sensitive, experience-based decision process that screens for catastrophization, recognizing that certain patients will not benefit from a simple surgery. It is well appreciated that patients' catastrophization can affect their subjective outcome, but there is little reported literature on this abstract concept. The study set out to determine whether post-operative outcomes correlated with pre-operative catastrophization scales. It examined a cohort of complex consecutive foot and ankle cases and describes the relationship between the Patient Catastrophizing Score (PCS) and multiple functional outcomes commonly used in foot and ankle surgery specifically (SF-12 and FAOS). The PCS has three subcategories: rumination, helplessness and magnification. A single institution recruited consecutive patients from three surgeons' practices. In the end, 46 patients were eligible for the study, with an average age of 54.72 ± 14.41 years, a majority female (30/46, 65.22%), a minority employed at the pre-operative visit (19/46, 41%) and an average BMI of 26.2 ± 5.56. The mental component of the SF-12 had a statistically significant negative association with the rumination score (r=−1.03, p = 0.01) and the helplessness score (r=−1.05, p = 0.001). There was no statistically significant effect for the physical component of the SF-12.
The FAOS Pain component correlated significantly with PCS rumination (multivariate: r=−7.6, p=0.002; univariate: r=−2, p=0.03) and helplessness (multivariate: r=−6.73, p=0.01; univariate: r=−1.5, p=0.03). The FAOS ADL component also correlated with PCS rumination (multivariate: r=−4.67, p=0.02; univariate: r=−1.85, p=0.01), helplessness (multivariate: r=−5.89, p = 0.01; univariate: r=−1.81, p = 0.001) and total score (multivariate: r=3.74, p=0.02; univariate: r=−0.75, p=0.01). The FAOS Quality of Life component was significantly associated with the rumination score (univariate r=−11.59, p < 0.05), the helplessness score (univariate r=−9.65, p = 0.002) and the PCS total (univariate r=8.54, p = 0.0003). As laid out in our hypothesis, this study showed an association between an increased pre-operative patient catastrophizing score and a worse outcome in the following scores: the mental component of the SF-12, FAOS Pain, FAOS ADL and FAOS Quality of Life. This is an association, and no causality can be proven within the limits of this pilot study, but it remains alarming. In elective surgery, catastrophization should be screened for using the PCS form and potentially modulated pre-operatively with the help of an allied health therapist while the patient is on the waitlist.
Introduction
Radiographic assessment of acetabular fragment positioning during periacetabular osteotomy (PAO) is of paramount importance. Plain radiographic examination is time- and resource-intensive. Fluoroscopy-based assessment is increasingly utilized but can introduce distortion. Our purpose was to determine how well intraoperative fluoroscopy-based measurements, made with a fluoroscopic tool that corrects for distortion, correlate with postoperative plain-film measurements.
Methods
We performed a prospective validation study on 32 PAOs (28 patients) performed by a single academic surgeon. Preoperative standing radiographs, intraoperative fluoroscopic images, and postoperative standing radiographs were evaluated with lateral center edge angle (LCEA), acetabular index (AI), posterior wall sign (PWS), and anterior center edge angle (ACEA). Intraoperative fluoroscopy was adjusted to account for pelvic inclination. The fluoroscopic GRID was utilized in all cases (Phantom MSK Hip Preservation, OrthoGrid LLC, Salt Lake City, UT). Intraoperative fluoroscopic measurements were compared to preoperative and postoperative standing radiographs at 6 weeks using linear regression applied in MINITAB.