How to evaluate the accuracy of quantitative trait prediction is crucial to choose the best model among several possible choices in plant breeding. Pearson's correlation coefficient (PCC), serving as a metric for quantifying the strength of the linear association between two variables, is widely used to evaluate the accuracy of the quantitative trait prediction models, and generally performs well in most circumstances. However, PCC may not always offer a comprehensive view of predictive accuracy, especially in cases involving nonlinear relationships or complex dependencies in machine learning-based methods. It has been found that many papers on quantitative trait prediction solely use PCC as a single metric to evaluate the accuracy of their models, which is insufficient and limited from a formal perspective. This study addresses this crucial issue by presenting a typical example and conducting a comparative analysis of PCC and nine other evaluation metrics using four traditional methods and four machine learning-based methods, thereby contributing to the improvement of practical applicability and reliability of plant quantitative trait prediction models. It is recommended to employ PCC in conjunction with other evaluation metrics in a targeted manner based on specific application scenarios to reduce the likelihood of drawing misleading conclusions.
Keywords: Pearson’s correlation coefficient; evaluation metric; genomic selection; quantitative trait prediction; regression prediction.
Copyright © 2024 Pan, Liu, Han, Zhang, Zhao, Li and Wang.