Purpose: The Vanderbilt DNA Databank (BioVU) is a biorepository that currently contains >80,000 DNA samples linked to electronic medical records. Although BioVU is a valuable source of samples and phenotypes for genetic association studies, it is unclear whether the administratively assigned race/ethnicity in BioVU accurately describes genetic ancestry and can serve as a proxy for it.
Methods: In 1910 BioVU samples with observer-reported ancestry and 384 samples from the Multiple Sclerosis Genetics Group with self-reported ancestry, we genotyped 360 single nucleotide polymorphisms on the Illumina DNA Test Panel containing ancestry informative markers. Genetic ancestry was inferred for all individuals using Structure 2.2.
Results: More than 98% of observer-reported European Americans were genetically inferred to have at least 60% European ancestry. Ninety-three percent of observer-reported African Americans were genetically inferred to be predominantly of African ancestry. The concordance between observer-reported race/ethnicity and inferred genetic ancestry did not differ significantly from that of self-reported race/ethnicity in either population (P = 0.09 and 0.94 in European Americans and African Americans, respectively).
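The concordance comparison above can be illustrated with a minimal sketch. The abstract does not state which statistical test produced the reported P values, so the two-sided Fisher's exact test used here is an assumption, as are the function names and the ancestry proportions, which are hypothetical placeholders and not study data. Only the 60% threshold comes from the text.

```python
from math import comb


def concordance(ancestry_fractions, threshold=0.60):
    """Count individuals whose inferred ancestry proportion meets the threshold.

    Returns (number concordant, total number).
    """
    hits = sum(1 for q in ancestry_fractions if q >= threshold)
    return hits, len(ancestry_fractions)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same margins
    that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# Hypothetical inferred ancestry proportions (NOT data from the study).
observer_q = [0.97, 0.88, 0.45, 0.99, 0.72, 0.91]  # observer-reported group
self_q = [0.95, 0.99, 0.81, 0.66]                  # self-reported group

obs_hits, obs_n = concordance(observer_q)
self_hits, self_n = concordance(self_q)

# Rows: observer-reported vs self-reported; columns: concordant vs discordant.
p_value = fisher_exact_two_sided(
    obs_hits, obs_n - obs_hits, self_hits, self_n - self_hits
)
print(f"observer: {obs_hits}/{obs_n}, self: {self_hits}/{self_n}, P = {p_value:.2f}")
```

A large P value in such a comparison would indicate no detectable difference in concordance between the two reporting methods, which is the pattern the results describe.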
Conclusions: Observer-reported race/ethnicity for European Americans and African Americans approximates genetic ancestry as well as self-reported race/ethnicity, making biorepositories linked to electronic medical records such as BioVU a viable source of DNA samples for future large-scale genetic association studies.