Non-random error in genotype calling procedures: implications for family-based and case-control genome-wide association studies

Richard J L Anney; Elaine Kenny; Colm T O'Dushlaine; Jessica Lasky-Su; Barbara Franke; Derek W Morris; Benjamin M Neale; Philip Asherson; Stephen V Faraone; Michael Gill

doi:10.1002/ajmg.b.30836

Am J Med Genet B Neuropsychiatr Genet. 2008 Dec 5;147B(8):1379-86. doi: 10.1002/ajmg.b.30836.

Authors

Richard J L Anney¹, Elaine Kenny, Colm T O'Dushlaine, Jessica Lasky-Su, Barbara Franke, Derek W Morris, Benjamin M Neale, Philip Asherson, Stephen V Faraone, Michael Gill

Affiliation

¹ Neuropsychiatric Genetics Research Group, Department of Psychiatry, Trinity College Dublin, Dublin, Ireland.

Abstract

The considerable data-handling requirements of genome-wide association studies (GWAS) prohibit individual calling of genotypes and create a reliance on sophisticated "genotype-calling algorithms." Despite their obvious utility, current genotyping platforms and calling algorithms are not without limitations. Specifically, some genotypes are not called because the data are ambiguous, and any bias in the missing data could create spurious results. Using data from the Genetic Analysis Information Network (GAIN), we observed that missing genotypes are not randomly distributed across the homozygous and heterozygous groups. Using simulation, we examined whether the level and type of missingness observed might influence deviation from the null hypothesis under common case-control and family-based statistical approaches. Under a case-control model in which missingness is present in the case group but not in the controls, we observed bias giving rise to genome-wide significant type-I error for missingness as low as 3%. The family-based association simulations show close to nominal type-I error at 4% genotype missingness. These findings have important implications for study design, quality-control procedures and reporting of findings in GWAS.
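The case-control scenario described above can be illustrated with a small simulation. The sketch below is not the authors' procedure: the function names, sample sizes, minor allele frequency, drop rates, the choice to drop only heterozygotes, and the use of a 1-df allelic chi-square test are all assumptions made for illustration. It draws cases and controls from the same population (so the null hypothesis is true), removes genotypes from the cases only, and counts how often the test rejects at a nominal threshold.

```python
import math
import random


def simulate_genotypes(n, maf, rng):
    """Draw n genotypes (minor-allele counts 0, 1 or 2) under Hardy-Weinberg equilibrium."""
    return [(rng.random() < maf) + (rng.random() < maf) for _ in range(n)]


def drop_genotypes(genotypes, p_drop, rng):
    """Remove each genotype with a probability that depends on its class (0, 1 or 2)."""
    return [g for g in genotypes if rng.random() >= p_drop[g]]


def allelic_chi2_pvalue(cases, controls):
    """P-value of a 1-df chi-square test on the 2x2 table of allele counts."""
    a = sum(cases)                 # minor alleles in cases
    b = 2 * len(cases) - a         # major alleles in cases
    c = sum(controls)              # minor alleles in controls
    d = 2 * len(controls) - c      # major alleles in controls
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 1.0
    chi2 = (a + b + c + d) * (a * d - b * c) ** 2 / denom
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))


def type1_error_rate(p_drop, n_per_group=5000, maf=0.3, n_reps=200,
                     alpha=0.05, seed=1):
    """Fraction of null replicates rejected when only cases lose genotypes."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_reps):
        cases = drop_genotypes(simulate_genotypes(n_per_group, maf, rng), p_drop, rng)
        controls = simulate_genotypes(n_per_group, maf, rng)
        if allelic_chi2_pvalue(cases, controls) < alpha:
            rejections += 1
    return rejections / n_reps


if __name__ == "__main__":
    # Assumed scenarios: about 4% missingness in cases, either spread evenly
    # across genotype classes or concentrated in heterozygotes.
    random_missing = {0: 0.04, 1: 0.04, 2: 0.04}
    het_missing = {0: 0.0, 1: 0.10, 2: 0.0}
    print("random missingness:      ", type1_error_rate(random_missing))
    print("heterozygote missingness:", type1_error_rate(het_missing))
```

With these assumed settings, missingness that is independent of genotype leaves the allele frequency in cases unchanged on average, whereas missingness concentrated in one genotype class shifts it and so inflates the rejection rate even though no true association exists. The size of the inflation depends on the sample size, allele frequency and drop rates chosen here, so the printed values should not be read as reproducing the figures reported in the abstract.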

Publication types

Research Support, N.I.H., Extramural

MeSH terms

Algorithms
Alleles
Bias
Case-Control Studies
Chi-Square Distribution
Child
Cluster Analysis
Computer Simulation
Family*
Gene Dosage
Genetic Markers
Genetics, Population
Genome, Human*
Genome-Wide Association Study*
Genotype*
Haplotypes
Heterozygote
Homozygote
Humans
Linkage Disequilibrium
Oligonucleotide Array Sequence Analysis
Parents
Polymorphism, Single Nucleotide

Substances

Genetic Markers
