Assessing the validity of a data driven segmentation approach: A 4 year longitudinal study of healthcare utilization and mortality

PLoS One. 2018 Apr 5;13(4):e0195243. doi: 10.1371/journal.pone.0195243. eCollection 2018.

Abstract

Background: Segmentation of heterogeneous patient populations into parsimonious and relatively homogeneous groups with similar healthcare needs can facilitate healthcare resource planning and development of effective integrated healthcare interventions for each segment. We aimed to apply a data-driven, healthcare utilization-based clustering analysis to segment a regional health system patient population and validate its discriminative ability on 4-year longitudinal healthcare utilization and mortality data.

Methods: We extracted data from the Singapore Health Services Electronic Health Intelligence System, an electronic medical record database that included healthcare utilization (inpatient admissions, specialist outpatient clinic visits, emergency department visits, and primary care clinic visits), mortality, diseases, and demographics for all adult Singapore residents who resided in and had a healthcare encounter with our regional health system in 2012. Hierarchical clustering analysis (Ward's linkage) and K-means cluster analysis using age and healthcare utilization data in 2012 were applied to segment the selected population. These segments were compared using their demographics (other than age) and morbidities in 2012, and their longitudinal healthcare utilization and mortality from 2013 to 2016.
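
As an illustration only, the K-means step described in the Methods can be sketched in plain Python. This is not the authors' code: the six patient records are invented, the variables are standardized before clustering (an assumption; the abstract does not say how variables were scaled), and a deterministic farthest-first initialization stands in for whatever initialization the study used (the abstract mentions Ward's linkage hierarchical clustering alongside K-means, which is not reproduced here). The function names are hypothetical.

```python
def standardize(rows):
    """Rescale each column to zero mean and unit variance (z-scores)."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    # Fall back to 1.0 when a column is constant, to avoid dividing by zero.
    sds = [(sum((x - m) ** 2 for x in c) / n) ** 0.5 or 1.0
           for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first(points, k):
    """Deterministic initialization (an assumption of this sketch):
    start from the first point, then repeatedly add the point farthest
    from the centroids chosen so far."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids


def kmeans(points, k, iters=100):
    """Lloyd's algorithm: alternate assignment and centroid update
    until the labels stop changing or the iteration cap is reached."""
    centroids = farthest_first(points, k)
    labels = [None] * len(points)
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels, centroids


# Hypothetical records: (age, inpatient admissions, specialist outpatient
# visits, emergency department visits, primary care visits) in one year.
patients = [
    (25, 0, 0, 1, 1),
    (30, 0, 1, 0, 2),
    (28, 0, 0, 0, 1),
    (72, 6, 20, 5, 4),
    (80, 8, 25, 7, 3),
    (76, 7, 22, 6, 5),
]

labels, centroids = kmeans(standardize(patients), k=2)
print(labels)
```

With these made-up records the three younger, low-utilization patients fall into one cluster and the three older, high-utilization patients into the other. The study itself reports five segments on 146,999 subjects; the choice of k=2 here is purely for readability.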

Results: Among 146,999 subjects, five distinct patient segments were identified: "Young, healthy"; "Middle age, healthy"; "Stable, chronic disease"; "Complicated chronic disease"; and "Frequent admitters". Healthcare utilization patterns in 2012, morbidity patterns, and demographics differed significantly across all segments. The "Frequent admitters" segment had the smallest number of patients (1.79% of the population) but consumed 69% of inpatient admissions, 77% of specialist outpatient visits, 54% of emergency department visits, and 23% of primary care clinic visits in 2012. In this segment, 11.5% had end-stage renal failure and 31.2% had malignancy. The validity of the cluster analysis-derived segments is supported by their discriminative ability for longitudinal healthcare utilization and mortality from 2013 to 2016. Incidence rate ratios for healthcare utilization and Cox hazard ratios for mortality increased as patient segments increased in complexity. Patients in the "Frequent admitters" segment accounted for a disproportionate share of healthcare utilization and had an 8.16 times higher mortality rate.
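
For readers unfamiliar with incidence rate ratios, the following is a minimal sketch of a crude (unadjusted) ratio with a log-scale Wald confidence interval. The abstract does not state which regression model produced its ratios, so this is not the study's method; the function name and all event counts and person-years below are hypothetical.

```python
import math


def incidence_rate_ratio(events_a, time_a, events_b, time_b, z=1.96):
    """Crude incidence rate ratio of group A versus group B.

    events_*: number of events (e.g. admissions) observed in each group
    time_*:   person-time at risk in each group (e.g. person-years)
    z:        normal quantile for the interval (1.96 gives about 95%)
    """
    rate_a = events_a / time_a
    rate_b = events_b / time_b
    irr = rate_a / rate_b
    # Standard error of log(IRR) under a Poisson assumption for the counts.
    se_log = math.sqrt(1 / events_a + 1 / events_b)
    lower = math.exp(math.log(irr) - z * se_log)
    upper = math.exp(math.log(irr) + z * se_log)
    return irr, lower, upper


# Hypothetical example: 120 admissions over 400 person-years in a
# high-need segment versus 50 admissions over 1000 person-years in a
# reference segment.
irr, lower, upper = incidence_rate_ratio(120, 400.0, 50, 1000.0)
print(f"IRR = {irr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

A ratio above 1 means the first group accrues events at a higher rate per unit of person-time than the reference group. Adjusted ratios from a regression model would generally differ from this crude figure.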

Conclusion: Our data-driven clustering analysis on a general patient population in Singapore identified five patient segments with distinct longitudinal healthcare utilization patterns and mortality risk, providing an evidence-based segmentation of a regional health system's healthcare needs.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Ambulatory Care
  • Cluster Analysis
  • Delivery of Health Care / statistics & numerical data*
  • Delivery of Health Care / trends*
  • Electronic Health Records
  • Emergency Service, Hospital
  • Female
  • Health Care Costs
  • Health Resources / statistics & numerical data
  • Hospitalization
  • Humans
  • Inpatients
  • Longitudinal Studies
  • Male
  • Outpatients
  • Patient Acceptance of Health Care
  • Reproducibility of Results
  • Singapore / epidemiology

Grants and funding

This research received grant funding from SingHealth Foundation Health Services Research (Aging) Startup Grant SHF/HSRAg004/2015 and SingHealth Nurturing Clinician Scientist Award Academic Clinical Programme Funding FY 2016 Cycle 2. URL: https://research.singhealth.com.sg/Pages/ResearchGrants.aspx. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.