This research aimed to develop a predictive model that discriminates milk produced from a grass-based cattle diet from milk produced from a non-grass diet, using milk mid-infrared spectrometry and the month of testing (an indirect indicator of the feeding ration). The dataset contained 3,377,715 spectra collected between 2011 and 2021 from 2449 farms, together with three grazing traits defined from the month of testing. Records from a randomly selected 30% of the farms formed the calibration set, and the remaining records were used to validate the models. Around 90% of the records were correctly discriminated. This accuracy is very good given that some records may have been assigned to the wrong reference class, since the classes were defined from the month of testing alone. The predicted probability of belonging to the GRASS modality confirmed the model's ability to detect the transition period, even though the model was not trained on data from that period: the probability increased from spring to summer and then decreased. The discrimination was mainly explained by changes in the fat, mineral, and protein composition of the milk. Hierarchical clustering of the probability averaged per farm and year highlighted 12 groups illustrating different management practices. The probability of belonging to the GRASS class could be used in a tool that counts the number of grazing days.
Keywords: composition; grass; grazing; mid-infrared; milk; spectrometry; spectrum.
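The farm-level split described in the abstract can be illustrated with a minimal sketch. It assumes only that each record carries a farm identifier; the function name `split_by_farm`, the seed, and the toy data are hypothetical and are not taken from the study. The point it shows is that whole farms, rather than individual records, are drawn at random, so no farm contributes to both the calibration and the validation set.

```python
import random


def split_by_farm(farm_ids, calibration_fraction=0.3, seed=0):
    """Assign whole farms to calibration or validation (illustrative only).

    farm_ids: one farm identifier per record.
    Returns (calibration record indices, validation record indices,
    set of calibration farms).
    """
    farms = sorted(set(farm_ids))
    n_cal = round(calibration_fraction * len(farms))
    # Draw farms, not records, so a farm never appears in both sets.
    cal_farms = set(random.Random(seed).sample(farms, n_cal))
    cal_idx = [i for i, f in enumerate(farm_ids) if f in cal_farms]
    val_idx = [i for i, f in enumerate(farm_ids) if f not in cal_farms]
    return cal_idx, val_idx, cal_farms


# Toy data (hypothetical): 20 farms with 25 records each.
farm_ids = [farm for farm in range(20) for _ in range(25)]
cal_idx, val_idx, cal_farms = split_by_farm(farm_ids)
print(len(cal_farms), len(cal_idx), len(val_idx))
```

Splitting by farm rather than by record keeps the validation farms entirely unseen during calibration, which is presumably why the split was made at that level.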