Using a data-driven approach for the development and evaluation of phenotype algorithms for systemic lupus erythematosus

PLoS One. 2023 Feb 16;18(2):e0281929. doi: 10.1371/journal.pone.0281929. eCollection 2023.

Abstract

Background: Systemic lupus erythematosus (SLE) is a chronic autoimmune disease of unknown origin. The objective of this research was to develop phenotype algorithms for SLE suitable for use in epidemiological studies using empirical evidence from observational databases.

Methods: We used a process for empirically determining and evaluating phenotype algorithms for health conditions to be analyzed in observational research. The process started with a literature search to discover prior algorithms used for SLE. We then used a set of Observational Health Data Sciences and Informatics (OHDSI) open-source tools to refine and validate the algorithms. These included tools to discover codes for SLE that may have been missed in prior studies and to detect possible low specificity and index date misclassification in the algorithms so that both could be corrected.

Results: We developed four algorithms using our process: two algorithms for prevalent SLE and two for incident SLE. For both incident and prevalent cases, there is a more specific version and a more sensitive version of the algorithm. Each of the algorithms corrects for possible index date misclassification. After validation, we found the highest positive predictive value estimate for the prevalent, specific algorithm (89%). The highest sensitivity estimate was found for the prevalent, sensitive algorithm (77%).
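For readers less familiar with the two validation metrics reported above, the following is a minimal sketch of how positive predictive value and sensitivity are defined from confusion counts. The function names and the counts in the usage lines are hypothetical illustrations; they are not the study's data or the OHDSI tooling.

```python
def positive_predictive_value(true_pos: int, false_pos: int) -> float:
    """Share of algorithm-selected subjects who truly have the condition."""
    selected = true_pos + false_pos
    if selected == 0:
        raise ValueError("algorithm selected no subjects")
    return true_pos / selected


def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of true cases that the algorithm manages to select."""
    cases = true_pos + false_neg
    if cases == 0:
        raise ValueError("no true cases in the evaluation set")
    return true_pos / cases


# Hypothetical counts, chosen only to illustrate the arithmetic.
example_ppv = positive_predictive_value(true_pos=80, false_pos=20)
example_sens = sensitivity(true_pos=80, false_neg=120)
print(f"PPV = {example_ppv:.2f}, sensitivity = {example_sens:.2f}")
```

A more specific algorithm tends to reduce false positives (raising PPV) at the cost of missing more true cases (lowering sensitivity), which is why the abstract reports a specific and a sensitive version for each case type.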

Conclusion: We developed phenotype algorithms for SLE using a data-driven approach. The four final algorithms may be used directly in observational studies. The validation of these algorithms provides researchers an added measure of confidence that the algorithms are selecting subjects correctly and allows for the application of quantitative bias analysis.
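As an illustration of what quantitative bias analysis can look like once validation metrics are available, the sketch below adjusts an observed case count for outcome misclassification. It assumes one simple, commonly used correction (true cases ≈ observed cases × PPV ÷ sensitivity); the function name and the input values are hypothetical and are not taken from the study.

```python
def corrected_case_count(observed_cases: float, ppv: float, sens: float) -> float:
    """Adjust an observed case count for misclassification.

    observed_cases * ppv estimates the true positives among selected subjects;
    dividing by sensitivity scales up to include true cases the algorithm missed.
    """
    if not 0.0 < ppv <= 1.0:
        raise ValueError("ppv must be in (0, 1]")
    if not 0.0 < sens <= 1.0:
        raise ValueError("sens must be in (0, 1]")
    return observed_cases * ppv / sens


# Hypothetical inputs: 1,000 subjects selected by an algorithm whose
# validation gave PPV = 0.90 and sensitivity = 0.75.
estimate = corrected_case_count(observed_cases=1000, ppv=0.90, sens=0.75)
print(f"Bias-adjusted case count: {estimate:.0f}")
```

Both metrics should come from the validation of the same algorithm in a comparable population; mixing the PPV of one algorithm with the sensitivity of another would not give a meaningful correction.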

MeSH terms

  • Algorithms
  • Databases, Factual
  • Humans
  • Lupus Erythematosus, Systemic* / diagnosis
  • Lupus Erythematosus, Systemic* / epidemiology
  • Predictive Value of Tests

Grants and funding

No sources of funding were used to conduct this study or prepare this manuscript. Johnson and Johnson will be the sponsor of Open Access, if applicable. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.