Single-nucleotide polymorphism discovery and validation in high-density SNP array for genetic analysis in European white oaks

Mol Ecol Resour. 2015 Nov;15(6):1446-59. doi: 10.1111/1755-0998.12407. Epub 2015 Apr 6.

Abstract

An Illumina Infinium SNP genotyping array was constructed for European white oaks. Six individuals of Quercus petraea and Q. robur were considered for SNP discovery using both previously obtained Sanger sequences across 676 gene regions (1371 in vitro SNPs) and Roche 454 technology sequences from 5112 contigs (6542 putative in silico SNPs). The 7913 SNPs were genotyped across the six parental individuals, full-sib progenies (one within each species and two interspecific crosses between Q. petraea and Q. robur) and three natural populations from south-western France that included two additional interfertile white oak species (Q. pubescens and Q. pyrenaica). The genotyping success rate in mapping populations was 80.4% overall and 72.4% for polymorphic SNPs. In natural populations, these figures were lower (54.8% and 51.9%, respectively). Illumina genotype clusters with compression (shift of clusters on the normalized x-axis) were detected in ~25% of the successfully genotyped SNPs and may be due to the presence of paralogues. Compressed clusters were significantly more frequent for SNPs showing a priori incorrect Illumina genotypes, suggesting that they should be considered with caution or discarded. Altogether, these results show a high experimental error rate for the Infinium array (between 15% and 20% of SNPs potentially unreliable and 10% when excluding all compressed clusters), and recommendations are proposed when applying this type of high-throughput technique. Finally, results on diversity levels and shared polymorphisms across targeted white oaks and more distant species of the Quercus genus are discussed, and perspectives for future comparative studies are proposed.
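The recommendation to treat compressed clusters with caution can be illustrated with a minimal sketch. It is not the authors' procedure. It assumes that each SNP comes with the mean position of its AA, AB and BB genotype clusters on the normalized x-axis, that those clusters are expected near 0, 0.5 and 1, and that a cluster counts as compressed when it lies more than an arbitrary tolerance from its expected position. The names `EXPECTED`, `MAX_SHIFT`, `is_compressed` and `filter_snps` are hypothetical, and the tolerance value is illustrative only.

```python
# Assumed expected positions of the genotype clusters on the normalized x-axis.
EXPECTED = {"AA": 0.0, "AB": 0.5, "BB": 1.0}

# Illustrative tolerance; this value is an assumption, not taken from the paper.
MAX_SHIFT = 0.15


def is_compressed(cluster_means, max_shift=MAX_SHIFT):
    """Return True if any observed cluster mean is shifted more than
    max_shift away from its expected position."""
    return any(
        abs(mean - EXPECTED[genotype]) > max_shift
        for genotype, mean in cluster_means.items()
    )


def filter_snps(snps):
    """Split SNP identifiers into (kept, flagged) lists, preserving input order."""
    kept, flagged = [], []
    for snp_id, cluster_means in snps.items():
        if is_compressed(cluster_means):
            flagged.append(snp_id)
        else:
            kept.append(snp_id)
    return kept, flagged


# Made-up cluster positions for two hypothetical SNPs.
example = {
    "snp1": {"AA": 0.02, "AB": 0.50, "BB": 0.97},
    "snp2": {"AA": 0.25, "AB": 0.55, "BB": 0.98},
}
kept, flagged = filter_snps(example)
```

With these made-up positions, `snp2` is flagged because its AA cluster is shifted towards the centre of the axis. In practice the tolerance would need to be calibrated against SNPs with known genotypes, such as the Sanger-validated ones.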

Keywords: SNP detection; cluster compression; genotyping; genotyping error rates; Infinium; oaks.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Cluster Analysis
  • France
  • Genetic Markers*
  • Genetic Variation*
  • Genotype
  • Genotyping Techniques / methods*
  • Polymorphism, Single Nucleotide*
  • Quercus / classification*
  • Quercus / genetics*

Substances

  • Genetic Markers