Background: Genomic selection and estimation of genomic breeding values (GBV) are widely used in cattle and plant breeding. Several studies have attempted to detect population subdivision by investigating the structure of the genomic relationship matrix G. However, the question of how such subdivision influences GBV estimation using genomic best linear unbiased prediction (GBLUP) has received little attention.
Methods: We propose a simple method to decompose G into two independent covariance matrices: G_A*, which describes the covariance that results from systematic differences in allele frequencies between groups at the pedigree base, and G_S, which describes genomic relationships corrected for these differences. Using this decomposition and Fst statistics, we examined whether observed genetic distances between genotyped subgroups within populations resulted from the heterogeneous genetic structure present at the base of the pedigree and/or from breed divergence. We then tested three models in a forward prediction validation scenario on six traits using Brown Swiss and dual-purpose Fleckvieh cattle data. Model 0 (M0) used both components and is equivalent to the model using the standard G-matrix. Model 1 (M1) used G_S only, and model 2 (M2), an extension of M1, included a fixed genetic group effect. Moreover, we analyzed the matrix of contributions of each base group (Q) and estimated the effects and prediction errors of each base group using M0 and M1.
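As a rough illustration of the idea of separating a base-group component from a segregation component, the following sketch re-centres simulated genotypes on base-group allele frequencies. It is not the construction used in the study, whose details are not given here. Everything in it is an assumption made for illustration: the VanRaden-style scaling, the simulated inputs, the use of known rather than estimated base-group frequencies, and the names `decompose_g`, `Z`, `P`, `G_A`, `G_S` and `cross`. The sketch also returns the cross-product term, so that the identity it relies on is exact.

```python
import numpy as np

def decompose_g(Z, Q, P):
    """Split a VanRaden-style G into base-group and segregation parts.

    Z: (n, m) genotypes coded 0/1/2
    Q: (n, g) base-group contributions per animal (rows sum to 1)
    P: (g, m) allele frequencies in each base group (assumed known here)
    """
    p_bar = Z.mean(axis=0) / 2.0             # overall allele frequencies
    k = 2.0 * np.sum(p_bar * (1.0 - p_bar))  # common scaling factor
    expected = Q @ P                         # animal-specific expected frequencies
    W = Z - 2.0 * p_bar                      # centred on overall frequencies
    B = 2.0 * (expected - p_bar)             # part explained by base groups
    S = Z - 2.0 * expected                   # part left after correction
    G = W @ W.T / k
    G_A = B @ B.T / k                        # base-group component
    G_S = S @ S.T / k                        # segregation component
    cross = (S @ B.T + B @ S.T) / k          # cross-products between the parts
    return G, G_A, G_S, cross

# Toy data: 40 animals, 3 base groups, 200 markers (all simulated).
rng = np.random.default_rng(0)
n, g, m = 40, 3, 200
Q = rng.dirichlet(np.ones(g), size=n)
P = rng.uniform(0.1, 0.9, size=(g, m))
Z = rng.binomial(2, Q @ P).astype(float)

G, G_A, G_S, cross = decompose_g(Z, Q, P)
```

Because W = S + B by construction, G equals G_S + G_A + cross exactly in this sketch. How the cross term is handled, and how base-group frequencies are estimated, are modelling choices the sketch does not settle.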
Results: The proposed decomposition of G helped to examine the relative importance of the effects of base groups and segregation in a given population. We found significant differences between the effects of base groups for each breed. In forward prediction, differences between models in terms of validation reliability of estimated direct genomic values were small, but predictive power was consistently lowest for M1. The relative advantage of M0 or M2 in prediction depended on breed, trait and genetic composition of the validation group. Our approach presents a general analogy with the use of genetic groups in conventional animal models and provides proof that standard GBLUP using G yields solutions equivalent to M0, where base groups are considered as correlated random effects within the additive genetic variance assigned to the genetic base.
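The stated equivalence between standard GBLUP and a model that uses both components can be checked numerically in a toy setting. The sketch below assumes that the two components sum exactly to G and share a single additive variance. It uses random positive semi-definite matrices as stand-ins for the two components, an intercept-only fixed effect, and arbitrary variance values. All names and numbers are hypothetical and none come from the study.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 30

def random_psd(size, rank, rng):
    """Random positive semi-definite matrix of the given rank (stand-in only)."""
    A = rng.normal(size=(size, rank))
    return A @ A.T / rank

G_S = random_psd(n, 50, rng)   # stand-in for the segregation component
G_A = random_psd(n, 3, rng)    # low-rank stand-in for the base-group component
G = G_S + G_A                  # assumed: the components sum to G

X = np.ones((n, 1))            # intercept-only fixed effects
y = rng.normal(size=n)         # simulated phenotypes
sigma2_u, sigma2_e = 1.0, 2.0  # arbitrary variance components

# Phenotypic covariance is the same in both formulations.
V = sigma2_u * G + sigma2_e * np.eye(n)
Vinv_X = np.linalg.solve(V, X)
Vinv_y = np.linalg.solve(V, y)
b_hat = np.linalg.solve(X.T @ Vinv_X, X.T @ Vinv_y)  # GLS estimate
r = np.linalg.solve(V, y - X @ b_hat)                # V^{-1}(y - Xb)

u_single = sigma2_u * G @ r    # BLUP from the single-G model
u_S = sigma2_u * G_S @ r       # BLUP of the segregation effect
u_A = sigma2_u * G_A @ r       # BLUP of the base-group effect
```

Under these assumptions `u_single` equals `u_S + u_A`. This follows from the linearity of BLUP in the covariance between the random effect and the data, so the check illustrates the equivalence and does not test the method independently.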