iNR-2L: A two-level sequence-based predictor developed via Chou's 5-steps rule and general PseAAC for identifying nuclear receptors and their families

Genomics. 2020 Jan;112(1):276-285. doi: 10.1016/j.ygeno.2019.02.006. Epub 2019 Feb 16.

Abstract

Nuclear receptor proteins (NRPs) play a vital role in regulating gene expression. With the rapid growth in the number of known NRPs in the post-genomic era, it is highly desirable to identify NRPs and their sub-families accurately from their primary sequences. Several conventional methods have been used to discriminate NRPs and their sub-families, but they did not achieve satisfactory results. To address this, a new two-level computational model, "iNR-2L", was developed. Two discrete methods, Dipeptide Composition and Tripeptide Composition, were used to formulate NRP sequences. The two descriptor spaces were then merged to construct a hybrid space. Furthermore, the minimum redundancy and maximum relevance (mRMR) feature selection technique was employed to select salient features and to reduce noise and redundancy. The experimental outcomes showed that the proposed model iNR-2L achieved outstanding results. It is anticipated that the proposed computational model might be a practical and effective tool for academia and the research community.
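
As an illustration of the sequence formulation described above, the following is a minimal Python sketch of dipeptide composition, tripeptide composition, and their concatenation into a hybrid space. It is not the authors' implementation: the function names `kmer_composition` and `hybrid_features`, the example sequence, and the normalization (dividing by the number of windows made up only of the 20 standard amino acids) are assumptions made for this example. The mRMR selection step and the SVM classifier are not shown.

```python
from itertools import product

# The 20 standard amino acids, in a fixed order so feature positions are stable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def kmer_composition(sequence, k):
    """Return the normalized frequency of every possible k-peptide.

    k=2 gives dipeptide composition (400 values),
    k=3 gives tripeptide composition (8000 values).
    Windows containing non-standard residues are skipped (an assumption).
    """
    kmers = ["".join(p) for p in product(AMINO_ACIDS, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    seq = sequence.upper()
    total = 0
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:
            counts[kmer] += 1
            total += 1
    if total == 0:
        # Sequence too short or no valid windows: return an all-zero vector.
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]


def hybrid_features(sequence):
    """Concatenate dipeptide and tripeptide composition (400 + 8000 values)."""
    return kmer_composition(sequence, 2) + kmer_composition(sequence, 3)


if __name__ == "__main__":
    # Hypothetical example sequence, not taken from the paper's dataset.
    features = hybrid_features("MKTAYIAKQRQISFVKSHFSRQ")
    print(len(features))
```

A hybrid vector built this way is high-dimensional and sparse for typical protein lengths, which is the kind of setting in which a feature selection step such as mRMR is commonly applied before training a classifier.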

Keywords: Dipeptide composition; Nuclear receptor proteins; SVM; Tripeptide composition; mRMR.

Publication types

  • Evaluation Study

MeSH terms

  • Computational Biology / methods
  • Dipeptides / chemistry
  • Neural Networks, Computer
  • Oligopeptides / chemistry
  • Receptors, Cytoplasmic and Nuclear / chemistry*
  • Receptors, Cytoplasmic and Nuclear / classification*
  • Sequence Analysis, Protein / methods*
  • Support Vector Machine

Substances

  • Dipeptides
  • Oligopeptides
  • Receptors, Cytoplasmic and Nuclear