Efficient prediction of anticancer peptides through deep learning

PeerJ Comput Sci. 2024 Jul 19:10:e2171. doi: 10.7717/peerj-cs.2171. eCollection 2024.

Abstract

Background: Cancer remains one of the leading causes of mortality globally, with conventional chemotherapy often resulting in severe side effects and limited effectiveness. Recent advancements in bioinformatics and machine learning, particularly deep learning, offer promising new avenues for cancer treatment through the prediction and identification of anticancer peptides.

Objective: This study aimed to develop and evaluate a deep learning model utilizing a two-dimensional convolutional neural network (2D CNN) to enhance the prediction accuracy of anticancer peptides, addressing the complexities and limitations of current prediction methods.

Methods: A diverse dataset of peptide sequences with annotated anticancer activity labels was compiled from various public databases and experimental studies. The sequences were preprocessed and encoded using one-hot encoding and additional physicochemical properties. The 2D CNN model was trained and optimized using this dataset, with performance evaluated through metrics such as accuracy, precision, recall, F1-score, and area under the receiver operating characteristic curve (AUC-ROC).

Results: The proposed 2D CNN model achieved superior performance compared to existing methods, with an accuracy of 0.87, precision of 0.85, recall of 0.89, F1-score of 0.87, and an AUC-ROC value of 0.91. These results indicate the model's effectiveness in accurately predicting anticancer peptides and capturing intricate spatial patterns within peptide sequences.

Conclusion: The findings demonstrate the potential of deep learning, specifically 2D CNNs, in advancing the prediction of anticancer peptides. The proposed model significantly improves prediction accuracy, offering a valuable tool for identifying effective peptide candidates for cancer treatment.

Future work: Further research should focus on expanding the dataset, exploring alternative deep learning architectures, and validating the model's predictions through experimental studies. Efforts should also aim at optimizing computational efficiency and translating these predictions into clinical applications.

Keywords: Anticancer peptides; Artificial intelli-gence; Biological sequence analysis; Disease diagnosis; Image classification; Machine learning; Natural language processing; Neural networks; Protein identification.

Grants and funding

This research is funded by the European University of Atlantic. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.