Objective: To assess the quality of automated diagnoses extracted from medical care databases by the Vaccine Safety Datalink (VSD) study.
Methods: Two methods were used to assess the quality of VSD diagnosis data. The first method compares common automated and abstracted diagnostic categories ("outcomes") in 1-2% simple random samples of the study populations. The second method estimates the positive predictive values of automated diagnosis codes used to identify potential cases of rare conditions (e.g., acute ataxia) for inclusion in nested case-control medical record abstraction studies.
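The positive predictive value estimated in the second method is simply the proportion of code-flagged cases that are confirmed on chart abstraction. A minimal sketch (the function name and counts below are illustrative, not taken from the study):

```python
def positive_predictive_value(confirmed: int, flagged: int) -> float:
    """PPV = abstraction-confirmed cases / all cases flagged by the automated code."""
    return confirmed / flagged

# Illustration: if 56 of 100 charts flagged by an automated diagnosis code
# meet the objective clinical criteria on review, the PPV is 0.56.
print(positive_predictive_value(56, 100))  # 0.56
```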
Results: There was good agreement (64-68%) between automated and abstracted outcomes in the 1-2% simple random samples at 3 of the 4 VSD sites and poor agreement (44%) at 1 site. Overall at 3 sites, 56% of children with automated cerebellar ataxia codes (ICD-9 = 334) and 22% with "lack of coordination" codes (ICD-9 = 781.3) met objective clinical criteria for acute ataxia.
Conclusions: The misclassification error rates for automated screening outcomes substantially reduce the power of screening analyses and limit their usefulness to the detection of moderate-to-strong vaccine-outcome associations. Medical record verification of outcomes is needed for definitive assessments.
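The attenuation described in the conclusion can be illustrated with the standard result for nondifferential outcome misclassification: with sensitivity Se and specificity Sp, the observed risk in a group with true risk p is Se·p + (1 − Sp)·(1 − p), which pulls the observed relative risk toward 1. The risks, sensitivity, and specificity below are hypothetical values chosen for illustration, not figures from the study:

```python
def observed_risk(true_risk: float, sensitivity: float, specificity: float) -> float:
    """Risk measured with an imperfect outcome classifier (nondifferential)."""
    return sensitivity * true_risk + (1 - specificity) * (1 - true_risk)

def observed_rr(p_exposed: float, p_unexposed: float,
                sensitivity: float, specificity: float) -> float:
    """Relative risk after nondifferential outcome misclassification."""
    return (observed_risk(p_exposed, sensitivity, specificity)
            / observed_risk(p_unexposed, sensitivity, specificity))

# Hypothetical example: a true RR of 2.0 (risks 0.02 vs 0.01) observed with
# Se = 0.6 and Sp = 0.99 is attenuated toward the null, reducing study power.
print(round(observed_rr(0.02, 0.01, 0.6, 0.99), 2))  # 1.37
```

Because the observed association shrinks toward 1, a screening analysis needs a markedly larger true effect to reach the same statistical power, which is why weak vaccine-outcome associations are difficult to detect without medical record verification.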