Ensemble learning based on bi-directional gated recurrent unit and convolutional neural network with word embedding module for bioactive peptide prediction

Food Chem. 2024 Dec 12:468:142464. doi: 10.1016/j.foodchem.2024.142464. Online ahead of print.

Abstract

Bioactive peptides are small protein fragments that mediate diverse physiological activities, including antimicrobial, anti-inflammatory, anticancer, antioxidant, and immunomodulatory functions. Despite their substantial potential in pharmaceuticals and the food industry, conventional methods for peptide classification and activity prediction are limited by high costs, time-intensive procedures, and extensive data processing requirements. Here, we present BioPepPred-DLEmb, a novel computational model that integrates Convolutional Neural Networks (CNNs) and Bidirectional Gated Recurrent Units (BiGRUs) with a natural language processing (word embedding) module that encodes amino acids into information-dense vectors. Evaluated across nine bioactive peptide datasets, BioPepPred-DLEmb achieves higher predictive accuracy (0.909) and sensitivity (0.911) than traditional methods. UMAP visualization and kpLogo analysis show that the model effectively differentiates peptide activity states and identifies key biomarkers. The predicted antimicrobial peptides (Pred-AMPs) exhibit potent efficacy in vitro, with low micromolar inhibitory concentrations (2-16 μmol/L) against pathogens such as Escherichia coli and Acinetobacter baumannii. These findings establish a robust foundation for bioactive peptide development, with implications for precision medicine, personalized therapies, and functional food innovation.

Keywords: BiGRU; Bioactive peptide; Bioinformatics visualization; CNN; Word embedding.
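
The abstract describes encoding amino acids into dense vectors before they reach the CNN and BiGRU layers. A minimal sketch of that encoding step is given below, in plain Python. It is illustrative only and not the authors' implementation: the vocabulary order, the padding index, the embedding dimension, the random initial values, the helper names (`tokenize`, `make_embedding_table`, `embed`), and the example sequence are all assumptions. In a trained model, the lookup table would be learned rather than sampled at random.

```python
import random

# The 20 standard amino acids. Reserving index 0 for padding is an assumption.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAD_INDEX = 0
VOCAB = {aa: i + 1 for i, aa in enumerate(AMINO_ACIDS)}


def tokenize(peptide, max_len):
    """Map a peptide string to a fixed-length list of integer tokens.

    Sequences longer than max_len are truncated; shorter ones are padded.
    Non-standard residues raise KeyError in this sketch.
    """
    tokens = [VOCAB[aa] for aa in peptide.upper()[:max_len]]
    return tokens + [PAD_INDEX] * (max_len - len(tokens))


def make_embedding_table(dim, seed=0):
    """Build a lookup table with one vector per token.

    Values are random placeholders; a trained model would learn them.
    """
    rng = random.Random(seed)
    table = [[0.0] * dim]  # the padding row stays at zero
    for _ in AMINO_ACIDS:
        table.append([rng.uniform(-1.0, 1.0) for _ in range(dim)])
    return table


def embed(tokens, table):
    """Replace each token with its vector, giving a (length x dim) matrix."""
    return [table[t] for t in tokens]


# Hypothetical example sequence, chosen only to exercise the functions.
table = make_embedding_table(dim=8)
tokens = tokenize("GLFDIVKKVV", max_len=12)
matrix = embed(tokens, table)
print(tokens)
print(len(matrix), len(matrix[0]))
```

The resulting matrix, one row per residue position, is the kind of input that convolutional and recurrent layers typically consume. The padding row is kept at zero so that padded positions carry no signal in this sketch.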