This work leverages computer vision and artificial intelligence to quantify key components of food distribution services, focusing on dish counting, content identification, and portion size estimation in a dining hall setting. An RGB camera captures the tray delivery process in a self-service restaurant, providing test images for comparing plate counting and content identification algorithms under standard evaluation metrics. The approach is built on the YOLO architecture, a widely used deep learning model for object detection. The model is trained on labeled image data, and its performance is assessed with a precision-recall curve at a confidence threshold of 0.5, achieving a mean average precision (mAP) of 0.873, which indicates robust overall performance. The weight estimation procedure combines RGB and depth cameras to measure food volume; density models specific to each food type are then applied to estimate the weight of the detected food. The model's parameters are calibrated through experiments that generate volume-to-weight conversion tables for different food items. Validation with rice and chicken yielded error margins of 5.07% and 3.75%, respectively, demonstrating the feasibility and accuracy of the proposed method.
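The density-based weight estimation described above can be sketched as follows. This is a minimal illustration of the volume-to-weight conversion idea only; the food names, density values, and function names are assumptions for the sketch, not the paper's calibrated tables:

```python
# Sketch of density-based weight estimation: weight = measured volume x
# per-food density, where densities would be calibrated from experimental
# volume-to-weight conversion tables. All densities and food names below
# are illustrative placeholders, not the paper's calibrated values.

# Hypothetical calibrated densities (g/cm^3) per food type.
DENSITY_G_PER_CM3 = {
    "rice": 0.80,
    "chicken": 1.05,
}

def estimate_weight_g(food_type: str, volume_cm3: float) -> float:
    """Convert a measured food volume to an estimated weight in grams."""
    return volume_cm3 * DENSITY_G_PER_CM3[food_type]

def percent_error(estimated_g: float, true_g: float) -> float:
    """Relative error of an estimate against a ground-truth scale reading."""
    return abs(estimated_g - true_g) / true_g * 100.0

# Example: a 250 cm^3 rice portion at an assumed density of 0.80 g/cm^3.
est = estimate_weight_g("rice", 250.0)      # 200.0 g
print(round(percent_error(est, 195.0), 2))  # error vs. a hypothetical 195 g reading
```

In the full system, the volume input would come from the RGB-D measurement stage, and the percent-error metric corresponds to the validation figures reported for rice and chicken.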
Keywords: artificial intelligence; computer vision; deep learning; food weight estimation.