Phenotyping cotton leaf chlorophyll via in situ hyperspectral reflectance sensing, spectral vegetation indices, and machine learning

Front Plant Sci. 2024 Nov 21:15:1495593. doi: 10.3389/fpls.2024.1495593. eCollection 2024.

Abstract

Cotton (Gossypium hirsutum L.) leaf chlorophyll (Chl) has been targeted as a phenotype for breeding selection to improve cotton tolerance to environmental stress. However, high-throughput phenotyping methods based on hyperspectral reflectance sensing are needed to rapidly screen cultivars for chlorophyll in the field. The objectives of this study were to deploy a cart-based field spectroradiometer to measure cotton leaf reflectance in two field experiments over four growing seasons at Maricopa, Arizona, and to evaluate 148 spectral vegetation indices (SVIs) and 14 machine learning methods (MLMs) for estimating leaf chlorophyll from spectral information. Leaf tissue was sampled concurrently with reflectance measurements, and laboratory processing provided leaf Chl a, Chl b, and Chl a+b as both area-basis (µg cm-2) and mass-basis (mg g-1) measurements. Leaf reflectance, along with several data transformations involving spectral derivatives, log-inverse reflectance, and SVIs, was evaluated as MLM input. Models trained with 2019-2020 data performed poorly in tests with 2021-2022 data (e.g., RMSE=23.7% and r2 = 0.46 for area-basis Chl a+b), indicating difficulty transferring models between experiments. Performance was more satisfactory when training and testing data were based on a random split of all data from both experiments (e.g., RMSE=10.5% and r2 = 0.88 for area-basis Chl a+b), but performance beyond the conditions of the present study cannot be guaranteed. SVI performance fell between these two cases (e.g., RMSE=16.2% and r2 = 0.69 for area-basis Chl a+b), and SVIs provided more consistent error metrics than MLMs. Ensemble MLMs that combined estimates from several base estimators (e.g., random forest, gradient boosting, and AdaBoost regressors) and a multi-layer perceptron neural network method performed best among the MLMs. Input features based on spectral derivatives or SVIs improved MLM performance compared with inputting reflectance data directly. Spectral reflectance data and SVIs involving red edge radiation were the most important inputs to MLMs for estimation of cotton leaf chlorophyll. Because MLMs struggled to perform beyond the constraints of their training data, SVIs should not be overlooked as practical plant trait estimators for high-throughput phenotyping, whereas MLMs offer great opportunity for data mining to develop more robust indices.
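
The input transformations and error metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code, and several details are assumptions made only for illustration: the 705 nm and 750 nm bands of the normalized-difference index (a commonly used red edge pairing, not confirmed here to be among the 148 SVIs evaluated), the normalization of RMSE by the mean observed value to obtain a percentage, the use of the squared Pearson correlation for r2, and the synthetic spectrum in the demonstration. All function names are hypothetical.

```python
import numpy as np


def log_inverse(reflectance):
    """Log-inverse reflectance, log10(1/R), for reflectance values in (0, 1]."""
    return np.log10(1.0 / np.asarray(reflectance, dtype=float))


def first_derivative(wavelengths, reflectance):
    """First spectral derivative dR/d(lambda) by finite differences."""
    wl = np.asarray(wavelengths, dtype=float)
    r = np.asarray(reflectance, dtype=float)
    return np.gradient(r, wl)


def normalized_difference(wavelengths, reflectance, band_a, band_b):
    """(R_a - R_b) / (R_a + R_b), using the nearest measured wavelengths."""
    wl = np.asarray(wavelengths, dtype=float)
    r = np.asarray(reflectance, dtype=float)
    r_a = r[np.argmin(np.abs(wl - band_a))]
    r_b = r[np.argmin(np.abs(wl - band_b))]
    return (r_a - r_b) / (r_a + r_b)


def relative_rmse_percent(observed, predicted):
    """RMSE as a percentage of the mean observed value (assumed normalization)."""
    obs = np.asarray(observed, dtype=float)
    pred = np.asarray(predicted, dtype=float)
    rmse = np.sqrt(np.mean((pred - obs) ** 2))
    return 100.0 * rmse / np.mean(obs)


def r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    r = np.corrcoef(observed, predicted)[0, 1]
    return r ** 2


if __name__ == "__main__":
    # Synthetic leaf-like spectrum with a smooth red edge near 715 nm
    # (made-up values, used only to exercise the functions).
    wavelengths = np.arange(400.0, 1001.0, 1.0)
    reflectance = 0.05 + 0.45 / (1.0 + np.exp(-(wavelengths - 715.0) / 15.0))

    # Assumed red edge band pair: 750 nm and 705 nm.
    red_edge_index = normalized_difference(wavelengths, reflectance, 750.0, 705.0)
    derivative = first_derivative(wavelengths, reflectance)
    pseudo_absorbance = log_inverse(reflectance)

    print("red edge normalized difference:", red_edge_index)
    print("wavelength of steepest slope (nm):", wavelengths[np.argmax(derivative)])
    print("log-inverse reflectance at 400 nm:", pseudo_absorbance[0])
```

An index of this kind yields one number per spectrum that can be regressed directly against measured chlorophyll, whereas the derivative and log-inverse spectra keep the full wavelength dimension and would serve as multi-feature inputs to a machine learning method.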

Keywords: chlorophyll; cotton; high-throughput; machine learning; mapping population; phenomics; spectral index; spectroradiometer.

Grants and funding

The author(s) declare that financial support was received for the research, authorship, and/or publication of this article. Cotton Incorporated (Project Nos. 17-642 and 13-738) contributed partial funding for this research, while most of the funding was provided by the USDA Agricultural Research Service (Project Nos. 2020-13660-008-000-D, 2020-2100-013-000-D, and 3098-21600-001-000-D). This research also used resources provided by the SCINet project of the USDA Agricultural Research Service (Project Nos. 0201-88888-003-000-D and 0201-88888-002-000-D). USDA is an equal opportunity employer and provider.