Background: High-density single-nucleotide polymorphism (SNP) genotyping arrays are a powerful tool for genome-wide association studies and can give valuable insight into patterns of population structure and linkage disequilibrium (LD). In this study we used the Brassica 60kSNP Illumina consortium genotyping array to assess the influence of selection and breeding for important agronomic traits on LD and haplotype structure in a diverse panel of 203 Chinese semi-winter rapeseed (Brassica napus) breeding lines.
Results: Population structure and principal coordinate analysis, using a subset of the SNPs, revealed diversification into three subpopulations and one mixed population, reflecting targeted introgressions from external gene pools during breeding. Pairwise LD analysis within the A- and C-subgenomes of allopolyploid B. napus revealed that LD, at a threshold of r² = 0.1, decayed on average around ten times more rapidly in the A-subgenome (0.25-0.30 Mb) than in the C-subgenome (2.00-2.50 Mb). A total of 3,097 conserved haplotype blocks were detected, spanning a total length of 182.49 Mb (15.17% of the genome). Haplotype blocks were on average considerably longer in the C-subgenome (102.85 kb) than in the A-subgenome (33.51 kb), and extremely large conserved haplotype blocks were found on a number of C-subgenome chromosomes. Comparative sequence analysis revealed conserved blocks containing homoeologous quantitative trait loci (QTL) for seed erucic acid and glucosinolate content, two key seed quality traits under strong agronomic selection. Interestingly, C-subgenome QTL were associated with considerably greater conservation of LD than their corresponding A-subgenome homoeologues.
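To make the LD decay measure concrete, the following is a minimal sketch of how a decay distance at a threshold of r² = 0.1 can be estimated from genotype data. It is an illustration only and not necessarily the procedure or software used in this study. It assumes biallelic SNPs coded as allele counts (0, 1, 2) on a single chromosome and uses the squared genotype correlation as the r² estimate; the function names, the distance binning and the bin size are hypothetical choices made for the example.

```python
from itertools import combinations
from statistics import mean


def r_squared(g1, g2):
    """Squared Pearson correlation between two genotype vectors (0/1/2 allele counts)."""
    n = len(g1)
    m1, m2 = sum(g1) / n, sum(g2) / n
    cov = sum((a - m1) * (b - m2) for a, b in zip(g1, g2))
    v1 = sum((a - m1) ** 2 for a in g1)
    v2 = sum((b - m2) ** 2 for b in g2)
    if v1 == 0 or v2 == 0:
        return 0.0  # monomorphic marker: correlation undefined, treated as no LD
    return cov * cov / (v1 * v2)


def ld_decay_distance(positions, genotypes, bin_size, threshold=0.1):
    """Lower edge of the first distance bin whose mean r2 is below the threshold.

    positions: SNP coordinates in bp (integers) on one chromosome.
    genotypes: one genotype vector per SNP, in the same order as positions.
    Returns None if mean r2 never drops below the threshold.
    """
    bins = {}
    for i, j in combinations(range(len(positions)), 2):
        distance = abs(positions[j] - positions[i])
        bins.setdefault(distance // bin_size, []).append(
            r_squared(genotypes[i], genotypes[j])
        )
    for b in sorted(bins):
        if mean(bins[b]) < threshold:
            return b * bin_size
    return None
```

Applied separately to the markers of each subgenome, a function of this kind would return one decay distance per subgenome, which is the type of quantity compared above (0.25-0.30 Mb versus 2.00-2.50 Mb). In practice a smoothed or fitted decay curve is often preferred over raw bins, since sparse bins make the first crossing of the threshold noisy.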
Conclusions: The data we present in this paper provide evidence for strong selection of large chromosome regions associated with important rapeseed seed quality traits conferred by C-subgenome QTL. This implies that an increase in genetic diversity and recombination within the C-subgenome is particularly important for breeding. The resolution of genome-wide association studies is also expected to vary greatly across different genome regions.