The Challenges of Creating a Gold Standard for De-identification Research

Allen C Browne; Mehmet Kayaalp; Zeyno A Dodd; Pamela Sagan; Clement J McDonald

AMIA Annu Symp Proc. 2014 Nov 14;2014:353-8. eCollection 2014.

Authors

Allen C Browne¹, Mehmet Kayaalp¹, Zeyno A Dodd¹, Pamela Sagan¹, Clement J McDonald¹

Affiliation

¹ Lister Hill National Center for Biomedical Communications, U.S. National Library of Medicine, National Institutes of Health, Bethesda, MD.

PMID: 25954338
PMCID: PMC4420002

Abstract

We created a Gold Standard corpus comprising over 20,000 records of annotated narrative clinical reports for use in the training and evaluation of NLM Scrubber, a de-identification software system for medical records. Our experience in designing the corpus demonstrated the conceptual complexity of the task.

Publication types

Research Support, N.I.H., Intramural

MeSH terms

Confidentiality*
Electronic Health Records*
Health Insurance Portability and Accountability Act
Humans
Software*
United States

Grants and funding

Intramural NIH HHS/United States