Anti-nutritional factors, which impede the absorption of crucial vitamins and minerals upon human consumption, are inherently present in almost all major crops. The anti-nutrients commonly found in food crops include saponins, tannins, lectins, and phytates. Currently, no computational server is available for identifying anti-nutritional factor proteins in plants. This study therefore presents a computational approach for distinguishing anti-nutritional factor proteins from proteins that provide essential nutrients. Machine learning algorithms were employed to identify plant-specific anti-nutritional factor proteins from protein sequences using compositional features. Extreme gradient boosting achieved a five-fold cross-validation training performance of 94.34% AUC-ROC and 94.13% AUC-PR, surpassing the other methods evaluated: support vector machine, random forest, and adaptive boosting. These results suggest that the proposed approach is highly reliable in predicting plant-specific anti-nutritional factor proteins. The resulting prediction models have been deployed as an online server named ANPS, freely available at https://nipb-bi.icar.gov.in.
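As an illustration of what a compositional feature can look like, the following minimal Python sketch computes amino acid composition, i.e. the percentage of each of the 20 standard residues in a protein sequence. The abstract does not state which compositional features ANPS uses, so the choice of amino acid composition, the function name `amino_acid_composition`, and the example sequence are assumptions made here for illustration only.

```python
from collections import Counter

# The 20 standard amino acids in a fixed order, so every sequence
# maps to a feature vector with the same layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence: str) -> list[float]:
    """Return the percentage of each standard amino acid in `sequence`.

    Illustrative assumption: non-standard residues (e.g. X, B, Z) are
    ignored rather than counted, which may differ from the authors' choice.
    """
    seq = sequence.upper()
    counts = Counter(residue for residue in seq if residue in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [100.0 * counts[aa] / total for aa in AMINO_ACIDS]


# Hypothetical toy sequence, not taken from the study's dataset.
features = amino_acid_composition("AACD")
print(dict(zip(AMINO_ACIDS, features)))
```

A fixed-length vector of this kind could then be passed to a classifier such as extreme gradient boosting, support vector machine, random forest, or adaptive boosting.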
Keywords: ANPS; Anti-nutrients; Machine learning; Prediction; Web server.
© 2024. The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature.