Objectives: Analysis of routinely collected electronic health data is a key tool for research and practice on long-term conditions in hospitalised patients. This requires accurate and complete ascertainment of a broad range of diagnoses, which are not always recorded in the documentation for a single admission. This study aimed to determine how far back in time electronic hospital records need to be interrogated to capture long-term condition diagnoses.
Design: Retrospective observational study of routinely collected hospital electronic health record data.
Setting: Linked data from Queen Elizabeth Hospital Birmingham (UK), held by the PIONEER acute care data hub.
Participants: Patients whose first recorded admission for chronic obstructive pulmonary disease (COPD) exacerbation (n=560) or acute stroke (n=2142) was between January and December 2018 and who had a minimum of 10 years of data prior to the index date.
Outcome measures: We identified the most common diagnoses, coded using the International Classification of Diseases, 10th Revision (ICD-10), separately for patients with COPD and for patients with acute stroke. For each diagnosis, we derived the number of patients with the diagnosis recorded at least once over the full 10-year lookback period, and then compared this with shorter lookback periods from 1 year to 9 years prior to the index admission.
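The completeness measure described above can be illustrated with a minimal Python sketch. It is not the study's code: the function name `lookback_completeness`, the record layout, the toy patients and the use of 365.25 days per year are all assumptions made for illustration, as is the choice to count a diagnosis recorded on the index date itself.

```python
from datetime import date


def lookback_completeness(diagnoses, index_dates, code, max_years=10):
    """Fraction of patients captured by each shorter lookback window.

    diagnoses:   iterable of (patient_id, icd10_code, diagnosis_date)
    index_dates: dict mapping patient_id -> index admission date
    The denominator is the number of patients with `code` recorded at
    least once within `max_years` before the index admission.
    """
    days_per_year = 365.25  # assumption: average year length
    nearest_gap = {}  # patient_id -> days from most recent record to index date
    for patient_id, icd10_code, diagnosis_date in diagnoses:
        if icd10_code != code or patient_id not in index_dates:
            continue
        gap = (index_dates[patient_id] - diagnosis_date).days
        # Assumption: records dated on the index date (gap == 0) are counted.
        if 0 <= gap <= max_years * days_per_year:
            nearest_gap[patient_id] = min(gap, nearest_gap.get(patient_id, gap))

    total = len(nearest_gap)
    if total == 0:
        return {}
    return {
        years: sum(gap <= years * days_per_year for gap in nearest_gap.values()) / total
        for years in range(1, max_years + 1)
    }


# Hypothetical toy data: three patients, one diagnosis code of interest.
index_dates = {
    "p1": date(2018, 6, 1),
    "p2": date(2018, 3, 15),
    "p3": date(2018, 9, 30),
}
diagnoses = [
    ("p1", "I10", date(2017, 12, 1)),  # within 1 year of index
    ("p2", "I10", date(2014, 1, 10)),  # between 4 and 5 years before index
    ("p3", "I10", date(2009, 6, 1)),   # between 9 and 10 years before index
    ("p3", "E11", date(2017, 1, 1)),   # different code, ignored here
]

completeness = lookback_completeness(diagnoses, index_dates, "I10")
for years, fraction in completeness.items():
    print(f"{years:2d}-year lookback: {fraction:.0%}")
```

In this toy example, completeness for the code rises from one in three patients with 1 year of lookback to two in three with 5 years, and reaches 100% only at the full 10 years, which is the reference period by construction.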
Results: Seven of the top 10 most common diagnoses in the COPD dataset reached >90% completeness by 6 years of lookback. Atrial fibrillation and diabetes were >90% coded with 2-3 years of lookback, but hypertension and asthma completeness continued to rise all the way out to 10 years of lookback. For stroke, 4 of the top 10 reached 90% completeness by 5 years of lookback; angina pectoris was >90% coded at 7 years and previous transient ischaemic attack completeness continued to rise out to 10 years of lookback.
Conclusion: A 7-year lookback captures most, but not all, common diagnoses. Lookback duration should be tailored to the conditions being studied.
Keywords: Electronic Health Records; Hospitals; Information Extraction.
© Author(s) (or their employer(s)) 2024. Re-use permitted under CC BY. Published by BMJ.