Associations Between Natural Language Processing-Enriched Social Determinants of Health and Suicide Death Among US Veterans

JAMA Netw Open. 2023 Mar 1;6(3):e233079. doi: 10.1001/jamanetworkopen.2023.3079.

Abstract

Importance: Social determinants of health (SDOHs) are known to be associated with increased risk of suicidal behaviors, but few studies use SDOHs from unstructured electronic health record notes.

Objective: To investigate associations between veterans' death by suicide and recent SDOHs, identified using structured and unstructured data.

Design, setting, and participants: This nested case-control study included veterans who received care under the US Veterans Health Administration from October 1, 2010, to September 30, 2015. A natural language processing (NLP) system was developed to extract SDOHs from unstructured clinical notes. Structured data yielded 6 SDOHs (ie, social or familial problems, employment or financial problems, housing instability, legal problems, violence, and nonspecific psychosocial needs), NLP on unstructured data yielded 8 SDOHs (social isolation, job or financial insecurity, housing instability, legal problems, barriers to care, violence, transition of care, and food insecurity), and combining them yielded 9 SDOHs. Data were analyzed in May 2022.

Exposures: Occurrence of SDOHs over a maximum span of 2 years compared with no occurrence of SDOH.

Main outcomes and measures: Cases of suicide death were matched with 4 controls on birth year, cohort entry date, sex, and duration of follow-up. Suicide was ascertained by National Death Index, and patients were followed up for up to 2 years after cohort entry with a study end date of September 30, 2015. Adjusted odds ratios (aORs) and 95% CIs were estimated using conditional logistic regression.

Results: Of 6 122 785 veterans, 8821 committed suicide during 23 725 382 person-years of follow-up (incidence rate 37.18 per 100 000 person-years). These 8821 veterans were matched with 35 284 control participants. The cohort was mostly male (42 540 [96.45%]) and White (34 930 [79.20%]), with 6227 (14.12%) Black veterans. The mean (SD) age was 58.64 (17.41) years. Across the 5 common SDOHs, NLP-extracted SDOH, on average, retained 49.92% of structured SDOHs and covered 80.03% of all SDOH occurrences. SDOHs, obtained by structured data and/or NLP, were significantly associated with increased risk of suicide. The 3 SDOHs with the largest effect sizes were legal problems (aOR, 2.66; 95% CI, 2.46-2.89), violence (aOR, 2.12; 95% CI, 1.98-2.27), and nonspecific psychosocial needs (aOR, 2.07; 95% CI, 1.92-2.23), when obtained by combining structured data and NLP.

Conclusions and relevance: In this study, NLP-extracted SDOHs, with and without structured SDOHs, were associated with increased risk of suicide among veterans, suggesting the potential utility of NLP in public health studies.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Case-Control Studies
  • Female
  • Humans
  • Male
  • Middle Aged
  • Natural Language Processing
  • Social Determinants of Health
  • Suicide* / psychology
  • Veterans* / psychology