LPI-CSFFR: Combining serial fusion with feature reuse for predicting LncRNA-protein interactions

Comput Biol Chem. 2022 Aug:99:107718. doi: 10.1016/j.compbiolchem.2022.107718. Epub 2022 Jun 27.

Abstract

Long non-coding RNAs (lncRNAs) play important roles in many biological processes, and they function primarily by interacting with proteins. Wet-lab experimental methods for studying lncRNA-protein interactions (lncRPIs) are time-consuming and expensive. In this study, we propose LPI-CSFFR, a novel feature fusion method that trains a Convolutional Neural Network (CNN) with feature reuse and serial fusion on the sequences, secondary structures, and physicochemical properties of proteins and lncRNAs to predict lncRPIs. The experimental results indicate that LPI-CSFFR achieves accuracies of 83.7% and 98.1% on the RPI1460 and RPI1807 datasets, respectively. We further compare LPI-CSFFR with existing state-of-the-art methods on the same benchmark datasets. In addition, to test the generalization performance of the model, we independently test sample pairs from five model organisms, among which Mus musculus yields the highest prediction accuracy, 99.5%; after constructing an interaction network from these predictions, we identify multiple hotspot proteins. Finally, we test the predictive power of LPI-CSFFR on sample pairs with unknown interactions. The results indicate that LPI-CSFFR is promising for predicting potential lncRPIs. The relevant source code and the data used in this study are available at https://github.com/JianjunTan-Beijing/LPI-CSFFR.
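As a rough illustration of what serial fusion means in general, the sketch below concatenates feature vectors from different views of one molecule end to end into a single input vector. It is not taken from the LPI-CSFFR code: the k-mer frequency encoding, the toy sequence and dot-bracket strings, and the function names `kmer_frequencies` and `serial_fusion` are all assumptions made for this example.

```python
from itertools import product


def kmer_frequencies(seq, alphabet, k):
    """Normalized k-mer frequencies of seq, in a fixed lexicographic order.

    Assumption: k-mer frequencies stand in for whatever encoding the
    authors actually use; the abstract does not specify it.
    """
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows containing unknown symbols
            counts[window] += 1
            total += 1
    return [counts[m] / total if total else 0.0 for m in kmers]


def serial_fusion(*feature_vectors):
    """Serial fusion: concatenate feature vectors end to end."""
    fused = []
    for vector in feature_vectors:
        fused.extend(vector)
    return fused


# Hypothetical toy inputs: an RNA sequence and a dot-bracket secondary structure.
rna_seq_features = kmer_frequencies("AUGGCUAUGC", "ACGU", 2)    # 4^2 = 16 values
rna_struct_features = kmer_frequencies("((...))...", "().", 2)  # 3^2 = 9 values

fused = serial_fusion(rna_seq_features, rna_struct_features)
print(len(fused))
```

The fused vector keeps each view in its own contiguous slice, so a downstream model such as a CNN sees all views in one input while the views stay separable by position.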

Keywords: Convolution neural network; Feature reuse; LncRNA-protein interactions; Serial fusion.

MeSH terms

  • Animals
  • Computational Biology / methods
  • Mice
  • Neural Networks, Computer
  • Proteins / metabolism
  • RNA, Long Noncoding* / metabolism
  • Software

Substances

  • Proteins
  • RNA, Long Noncoding