Housing is an environmental social determinant of health that is linked to mortality and clinical outcomes. We developed a lexicon of housing-related concepts and rule-based natural language processing (NLP) methods for identifying these concepts within clinical text. We piloted our methods on three test cohorts: a synthetic cohort generated by ChatGPT for initial infrastructure testing, a cohort with substance use disorders (SUD), and a cohort diagnosed with problems related to housing and economic circumstances (HEC). Our methods identified housing concepts in the synthetic ChatGPT notes (recall = 1.0, precision = 1.0), the SUD cohort (recall = 0.9798, precision = 0.9898), and the HEC cohort (recall = N/A, precision = 0.9160).
Keywords: Housing instability; natural language processing; social determinants of health.
© The Author(s) 2024.
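
As a rough illustration of the lexicon-plus-rules approach and of the recall and precision figures reported in the abstract, the following is a minimal sketch and not the authors' implementation. The lexicon entries, concept labels, example notes, and the function names `find_concepts` and `precision_recall` are all hypothetical, and the sketch ignores negation and context handling that a real clinical pipeline would likely need.

```python
import re

# Hypothetical mini-lexicon (illustrative only): concept label -> surface patterns.
LEXICON = {
    "homelessness": [r"homeless(?:ness)?", r"living on the streets?"],
    "shelter": [r"shelters?"],
    "eviction": [r"evict(?:ed|ion)"],
}

# One case-insensitive, word-bounded regex per concept.
COMPILED = {
    concept: re.compile(r"\b(?:" + "|".join(patterns) + r")\b", re.IGNORECASE)
    for concept, patterns in LEXICON.items()
}


def find_concepts(note):
    """Return the set of lexicon concepts whose patterns occur in the note."""
    return {concept for concept, rx in COMPILED.items() if rx.search(note)}


def precision_recall(predicted, gold):
    """Precision and recall over sets of (note_id, concept) pairs.

    Returns None for a metric whose denominator is zero.
    """
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else None
    recall = true_positives / len(gold) if gold else None
    return precision, recall


# Made-up notes and made-up reference annotations, for demonstration only.
notes = {
    1: "Patient is currently homeless and staying at a shelter.",
    2: "Currently couch surfing with friends.",
    3: "Was evicted last month.",
}
gold = {
    (1, "homelessness"),
    (1, "shelter"),
    (2, "doubled_up"),  # concept the toy lexicon does not cover
    (3, "eviction"),
}

predicted = {
    (note_id, concept)
    for note_id, text in notes.items()
    for concept in find_concepts(text)
}
precision, recall = precision_recall(predicted, gold)
print(f"precision = {precision}, recall = {recall}")
```

In this toy setup every predicted pair is correct, but the lexicon misses one annotated concept, so precision is higher than recall. Computing recall requires a reference annotation of all true mentions; precision needs only a review of the mentions the system flagged.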