Machine Learning-Based Tomato Fruit Shape Classification System

Plants (Basel). 2024 Aug 23;13(17):2357. doi: 10.3390/plants13172357.

Abstract

Fruit shape significantly impacts the quality and commercial value of tomatoes (Solanum lycopersicum L.). Precise grading is essential to elucidate the genetic basis of fruit shape in breeding programs, cultivar descriptions, and variety registration. Despite this, fruit shape classification is still primarily based on subjective visual inspection, leading to time-consuming and labor-intensive processes prone to human error. This study presents a novel approach incorporating machine learning techniques to establish a robust fruit shape classification system. We trained and evaluated seven supervised machine learning algorithms by leveraging a public dataset derived from the Tomato Analyzer tool and using the four current classification systems as label variables. Subsequently, based on class-specific metrics, we derived a novel classification framework comprising seven discernible shape classes. The results demonstrate the superior accuracy of the Support Vector Machine model, which surpassed human classifiers across all classification systems. The new classification system achieved the highest accuracy, averaging 88%, and maintained a similar performance when validated with an independent dataset. Positioned as a common standard, this system contributes to standardizing tomato fruit shape classification, enhancing accuracy, and promoting consensus among researchers. Its implementation will serve as a valuable tool for overcoming bias in visual classification, thereby fostering a deeper understanding of consumer preferences and facilitating genetic studies on fruit shape morphometry.
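
The abstract refers to overall accuracy and to class-specific metrics without naming them. As a minimal sketch, assuming the class-specific metrics are per-class precision and recall (an assumption, not something the abstract states), the following shows how such figures could be computed from true and predicted shape labels. The function name `class_metrics` and the three shape labels are hypothetical and purely illustrative; they are not the seven classes or the data from the study.

```python
from collections import Counter


def class_metrics(y_true, y_pred):
    """Return overall accuracy and per-class precision/recall.

    Hypothetical helper for illustration only; the study's actual
    evaluation code and metric choices are not given in the abstract.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    true_count = Counter(y_true)   # samples that really belong to each class
    pred_count = Counter(y_pred)   # samples assigned to each class
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    accuracy = sum(hits.values()) / len(y_true)

    per_class = {}
    for label in sorted(set(y_true) | set(y_pred)):
        # Precision: of the fruits assigned to this class, how many belong to it.
        precision = hits[label] / pred_count[label] if pred_count[label] else 0.0
        # Recall: of the fruits in this class, how many were assigned to it.
        recall = hits[label] / true_count[label] if true_count[label] else 0.0
        per_class[label] = {"precision": precision, "recall": recall}
    return accuracy, per_class


# Made-up labels, not data from the study.
y_true = ["round", "round", "flat", "flat", "ellipsoid", "ellipsoid", "round", "flat"]
y_pred = ["round", "flat", "flat", "flat", "ellipsoid", "round", "round", "flat"]

accuracy, per_class = class_metrics(y_true, y_pred)
print(f"accuracy: {accuracy:.2f}")
for label, scores in per_class.items():
    print(f"{label}: precision={scores['precision']:.2f}, recall={scores['recall']:.2f}")
```

Per-class figures of this kind show which shape categories are confused with one another, which is the sort of evidence that could motivate merging or redefining classes.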

Keywords: feature extraction; morphology recognition; support vector machine.

Grants and funding

This research was funded by Agencia Nacional de Promoción Científica y Tecnológica, Argentina (grant numbers FONCyT PICT 2018-00824 and PICT-2021-GRF-TI-00481); Consejo Nacional de Investigaciones Científicas y Técnicas, Argentina (grant numbers PUE0043 and PIP-3189); and Universidad Nacional de Rosario, Argentina (grant number 80020190300004UR).