Achieving diverse representation in biomedical data is critical for healthcare equity. Failure to do so perpetuates health disparities and exacerbates biases that may harm patients with underrepresented ancestral backgrounds. We present a quantitative assessment of representation in datasets used across human genomics, including genome-wide association studies (GWASs), pharmacogenomics, clinical trials, and direct-to-consumer (DTC) genetic testing. We suggest that the relative proportions of ancestries in these datasets, when compared with the global census population, insufficiently represent global ancestral genetic diversity. Some populations have greater proportional representation in data relative to their population size and the genomic diversity present in their ancestral haplotypes. As insights from genomics become increasingly integrated into evidence-based medicine, strategic inclusion and effective mechanisms to ensure representation of global genomic diversity in datasets are imperative.
Keywords: GWAS; ancestry; bias; direct-to-consumer; diversity; equity; genomics; global datasets; healthcare; inclusion; pharmacogenetics; representation.
Copyright © 2024 The Author(s). Published by Elsevier Inc. All rights reserved.