Explainable machine learning-assisted origin identification: Chemical profiling of five lotus (Nelumbo nucifera Gaertn.) parts

Food Chem. 2023 Mar 15;404(Pt A):134517. doi: 10.1016/j.foodchem.2022.134517. Epub 2022 Oct 6.

Abstract

Five homologous lotus parts, namely, the leaf, stamen, plumule, flower and leaf base, are all ancient nutrient sources, but their chemical differences are poorly understood. Identifying the part of origin could help determine reasonable edible and/or medicinal applications and reduce the risk of misuse or waste. The present work aimed to investigate the feasibility of using metabolic profiles coupled with explainable machine learning (ML) for tracing lotus parts of origin. Assisted by molecular networking, 151 compounds were systematically annotated through an untargeted metabolomics approach. Twenty-eight representative constituents were subsequently quantified for the construction of the ML algorithm. Because most ML algorithms are data-driven black boxes with opaque inner workings, the SHapley Additive exPlanations (SHAP) technique was used to understand model outputs. By offering an integral analytical platform for phytochemical characterization and information interpretation, these results could serve as a basis for an explainable tool for identification of the specific lotus part of origin.
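To illustrate the idea behind Shapley-based explanation mentioned in the abstract, the following is a minimal sketch that computes exact Shapley values for one sample of a toy scoring function. It is not the authors' pipeline: the abstract does not name the model or the features, so `toy_score`, the three input values, and the all-zero baseline are hypothetical placeholders, and practical SHAP implementations typically approximate these values rather than enumerate every feature subset.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample.

    Features outside a coalition are replaced by their baseline value.
    """
    n = len(x)

    def value(subset):
        # Evaluate the model with only the features in `subset` "present".
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude feature i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_score(z):
    # Hypothetical score: one additive term plus one interaction term.
    a, b, c = z
    return 2.0 * a + b * c


x = [1.0, 2.0, 3.0]          # hypothetical quantified constituents
baseline = [0.0, 0.0, 0.0]   # hypothetical reference sample
phi = shapley_values(toy_score, x, baseline)
print(phi)
```

The contributions sum to the difference between the score for the sample and the score for the baseline, which is the additivity property that makes the per-feature values readable as an explanation of a single prediction.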

Keywords: Explainable machine learning; Lotus; Molecular networking; Origin identification; UHPLC-Q-Orbitrap HRMS.

MeSH terms

  • Flowers
  • Lotus* / chemistry
  • Machine Learning
  • Nelumbo* / chemistry
  • Phytochemicals

Substances

  • Phytochemicals