Electronic healthcare records (EHRs) are a rich source of information with a range of uses in secondary research. In the United Kingdom, there is no pan-national or nationally accepted marker indicating veteran status across all healthcare services. This presents significant obstacles to determining the healthcare needs of veterans using EHRs. To address this issue, we developed the Military Service Identification Tool (MSIT) using an iterative two-stage approach. In the first stage, a keyword-based rule set, implemented in Structured Query Language, was developed to identify veterans. This informed the second stage, the development of the MSIT using machine learning; when tested, the MSIT obtained an accuracy of 0.97, a positive predictive value of 0.90, a sensitivity of 0.91, and a negative predictive value of 0.98. To further validate the performance of the MSIT, the present study sought to verify the accuracy of the EHRs used to train the MSIT models. To achieve this, we surveyed 902 patients of a local specialist mental healthcare service, of whom 146 (16.2%) were asked whether they had served in the Armed Forces. In total, 112 (76.7%) reported that they had not served, and 34 (23.3%) reported that they had served in the Armed Forces (accuracy: 0.84, sensitivity: 0.82, specificity: 0.91). The MSIT has the potential to identify veterans in the UK from free-text clinical documents, and its future use should be explored.
Keywords: United Kingdom; armed forces; electronic health records; mental health; military service identification tool; national health service; secondary mental healthcare; veterans.