Natural Language Processing (NLP) offers an automated method to extract data from unstructured free text fields for arthroplasty registry participation. Our objective was to investigate how accurately NLP can be used to extract structured clinical data from unstructured clinical notes when compared with manual data extraction. A group of 1,000 randomly selected clinical and hospital notes from eight different surgeons were collected for patients undergoing primary arthroplasty between 2012 and 2018. In all, 19 preoperative, 17 operative, and two postoperative variables of interest were manually extracted from these notes. A NLP algorithm was created to automatically extract these variables from a training sample of these notes, and the algorithm was tested on a random test sample of notes. Performance of the NLP algorithm was measured in Statistical Analysis System (SAS) by calculating the accuracy of the variables collected, the ability of the algorithm to collect the correct information when it was indeed in the note (sensitivity), and the ability of the algorithm to not collect a certain data element when it was not in the note (specificity).Aims
Methods