Many important traits in plants, animals, and microbes are polygenic and challenging to improve through traditional marker-assisted selection. Genomic prediction addresses this by incorporating all genetic data in a mixed model framework. The primary method for predicting breeding values is genomic best linear unbiased prediction, which uses the realized genomic relationship or kinship matrix (K) to connect genotype to phenotype. Genomic relationship matrices share information among entries to estimate the observed entries' genetic values and predict unobserved entries' genetic values. One of the main parameters of such models is genomic variance (σg2), or the variance of a trait associated with a genome-wide sample of DNA polymorphisms, and genomic heritability (hg2); however, the seminal papers introducing different forms of K often do not discuss their effects on the model estimated variance components despite their importance in genetic research and breeding. Here, we discuss the effect of several standard methods for calculating the genomic relationship matrix on estimates of σg2 and hg2. With current approaches, we found that the genomic variance tends to be either overestimated or underestimated depending on the scaling and centering applied to the marker matrix (Z), the value of the average diagonal element of K, and the assortment of alleles and heterozygosity (H) in the observed population. Using the average semivariance, we propose a new matrix, KASV, that directly yields accurate estimates of σg2 and hg2 in the observed population and produces best linear unbiased predictors equivalent to routine methods in plants and animals.
Keywords: average semivariance; genomic best linear unbiased predictor; genomic heritability; genomic relatedness; genomic variance; linear mixed model.
© The Author(s) 2022. Published by Oxford University Press on behalf of Genetics Society of America.