Fruit shape significantly impacts the quality and commercial value of tomatoes (Solanum lycopersicum L.). Precise shape grading is essential in breeding programs, cultivar descriptions, and variety registration, and for elucidating the genetic basis of fruit shape. Nevertheless, fruit shape classification is still based primarily on subjective visual inspection, which is time-consuming, labor-intensive, and prone to human error. This study presents a novel machine learning approach to establish a robust fruit shape classification system. We trained and evaluated seven supervised machine learning algorithms on a public dataset derived from the Tomato Analyzer tool, using the four current classification systems as label variables. Subsequently, based on class-specific metrics, we derived a novel classification framework comprising seven discernible shape classes. The Support Vector Machine model achieved the highest accuracy of the seven algorithms, surpassing human classifiers across all classification systems. The new classification system achieved the highest accuracy, averaging 88%, and maintained similar performance when validated on an independent dataset. Positioned as a common standard, this system contributes to standardizing tomato fruit shape classification, enhancing accuracy, and promoting consensus among researchers. Its implementation will serve as a valuable tool for overcoming bias in visual classification, thereby fostering a deeper understanding of consumer preferences and facilitating genetic studies on fruit shape morphometry.
Keywords: feature extraction; morphology recognition; support vector machine.
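The class-specific evaluation mentioned above can be illustrated with a minimal sketch. The abstract does not state which class-specific metrics were used, so per-class precision and recall are assumed here. The function name `per_class_metrics` and the toy shape labels are hypothetical and are not taken from the study's dataset or code.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred):
    """Return overall accuracy and per-class precision and recall.

    Hypothetical helper for illustration; precision and recall are an
    assumed choice of class-specific metrics.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    true_count = Counter(y_true)   # samples per actual class
    pred_count = Counter(y_pred)   # samples per predicted class
    correct = Counter()            # correctly predicted samples per class
    for actual, predicted in zip(y_true, y_pred):
        if actual == predicted:
            correct[actual] += 1

    metrics = {}
    for label in sorted(set(y_true) | set(y_pred)):
        precision = correct[label] / pred_count[label] if pred_count[label] else 0.0
        recall = correct[label] / true_count[label] if true_count[label] else 0.0
        metrics[label] = {"precision": precision, "recall": recall}

    accuracy = sum(correct.values()) / len(y_true) if y_true else 0.0
    return accuracy, metrics


# Toy labels for illustration only (not data from the study).
y_true = ["round", "round", "flat", "flat", "ellipsoid", "ellipsoid"]
y_pred = ["round", "flat", "flat", "flat", "ellipsoid", "round"]

accuracy, metrics = per_class_metrics(y_true, y_pred)
print(f"accuracy = {accuracy:.2f}")
for label, scores in metrics.items():
    print(f"{label}: precision = {scores['precision']:.2f}, recall = {scores['recall']:.2f}")
```

Classes with persistently low precision or recall under an existing classification system are the ones a classifier confuses most often, which is the kind of signal that could motivate merging or redefining shape classes.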