Objective: To create a natural language processing (NLP) algorithm to identify transgender patients in electronic health records.
Design: We developed an NLP algorithm that identifies transgender patients using keywords and billing codes. The records of identified patients were manually reviewed, and their health care services were categorized by billing code.
Setting: Vanderbilt University Medical Center.
Participants: 234 adult and pediatric transgender patients.
Main outcome measures: Number of transgender patients correctly identified and categorization of health services utilized.
Results: We identified 234 transgender patients, of whom 50% had a diagnosed mental health condition, 14% were living with HIV, and 7% had diabetes. Nearly half of patients attended the Endocrinology/Diabetes/Metabolism clinic, largely because of hormone use. Many patients also attended the Psychiatry, HIV, and/or Obstetrics/Gynecology clinics. The false positive rate of our algorithm was 3%.
Conclusions: Our novel algorithm identified transgender patients with a 3% false positive rate and provided important insights into health care utilization among this marginalized population.
Keywords: Electronic Health Records; Natural Language Processing; Transgender; Utilization.
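
Illustrative sketch: The keyword-plus-billing-code approach named under Design could look roughly like the Python below. This is a minimal sketch under stated assumptions, not the study's implementation. The keyword list, the code prefixes, the `PatientRecord` structure, and both function names are hypothetical, and the abstract does not say how the two signals were combined, so the sketch flags a record when either one is present. The abstract also does not define its false positive rate; the helper assumes it means the share of flagged patients not confirmed on manual review.

```python
import re
from dataclasses import dataclass, field
from typing import Iterable, List, Set

# Hypothetical keywords and billing-code prefixes, for illustration only;
# the study's actual terms and codes are not given in the abstract.
KEYWORD_PATTERN = re.compile(
    r"\b(transgender|gender dysphoria|gender identity disorder)\b",
    re.IGNORECASE,
)
CODE_PREFIXES = ("F64", "302.85", "302.6")


@dataclass
class PatientRecord:
    """Hypothetical minimal view of one patient's record."""
    patient_id: str
    notes: List[str] = field(default_factory=list)
    billing_codes: List[str] = field(default_factory=list)


def flag_for_review(record: PatientRecord) -> bool:
    """Flag a record if any note matches a keyword or any code matches a prefix.

    Assumption: either signal alone is enough to queue the record for
    manual review.
    """
    keyword_hit = any(KEYWORD_PATTERN.search(note) for note in record.notes)
    code_hit = any(code.startswith(CODE_PREFIXES) for code in record.billing_codes)
    return keyword_hit or code_hit


def false_positive_share(flagged_ids: Iterable[str], confirmed_ids: Set[str]) -> float:
    """Share of flagged patients that manual review did not confirm."""
    flagged = list(flagged_ids)
    if not flagged:
        return 0.0
    unconfirmed = sum(1 for pid in flagged if pid not in confirmed_ids)
    return unconfirmed / len(flagged)


# Toy records, invented for this example.
records = [
    PatientRecord("p1", notes=["Patient identifies as transgender."]),
    PatientRecord("p2", billing_codes=["F64.0"]),
    PatientRecord("p3", notes=["Routine follow-up."], billing_codes=["E11.9"]),
]
flagged = [r.patient_id for r in records if flag_for_review(r)]
```

In this sketch, every flagged record would then go to manual review, and `false_positive_share` would be computed from the reviewers' confirmations. Requiring both a keyword hit and a code hit (replacing `or` with `and`) would trade missed patients for fewer false positives.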