Bacterial cytochrome P450s: a bioinformatics odyssey of substrate discovery

Front Microbiol. 2024 Feb 7;15:1343029. doi: 10.3389/fmicb.2024.1343029. eCollection 2024.

Abstract

Bacterial cytochrome P450s (BacCYPs) are versatile heme-containing proteins that catalyze oxidation reactions on a wide range of substrates and contribute to the production of valuable natural products with considerable biotechnological potential. Although microbial genome sequencing has provided a wealth of BacCYP sequences, functional characterization lags behind, hindering our understanding of their roles. To bridge the gap between sequence and function, this study predicts BacCYP substrate specificity with an integrated approach that combines sequence and functional data analysis, genomic context exploration, 3D structural modeling with molecular docking, and phylogenetic clustering. We begin with an in-depth analysis of BacCYP sequence diversity and structural characteristics, which reveals conserved motifs and recurrent residues in the active site. Phylogenetic analysis identifies distinct groups within the BacCYP family based on sequence similarity. However, sequence alone does not consistently predict substrate specificity, so additional perspectives are needed. We therefore examine the genomic context of BacCYPs, using neighboring gene information to infer potential substrates, a method that proved effective in many cases. Molecular docking is then used to assess BacCYP-substrate interactions, confirming potential substrates and providing insights into selectivity. Finally, we propose a comprehensive strategy for predicting BacCYP substrates that combines all the evaluated approaches, and we demonstrate its effectiveness with two case studies that highlight its potential for substrate discovery.
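As a rough illustration of what a conserved-motif scan can look like, the sketch below searches a protein sequence for two signatures widely reported for cytochrome P450s: the heme-binding pattern F-x-x-G-x-R-x-C-x-G and the K-helix E-x-x-R motif. The abstract does not name the motifs the authors analyzed, so these patterns, the function name `find_p450_motifs`, and the toy sequence are assumptions made for illustration only, not the study's method.

```python
import re

# Assumed patterns (not taken from the abstract): two signatures commonly
# described for P450s. Lookaheads allow overlapping matches to be reported.
HEME_BINDING = re.compile(r"(?=(F..G.R.C.G))")  # F-x-x-G-x-R-x-C-x-G
EXXR = re.compile(r"(?=(E..R))")                # K-helix E-x-x-R


def find_p450_motifs(sequence):
    """Return 0-based start positions and matched text for each motif."""
    seq = sequence.upper()
    return {
        "heme_binding": [(m.start(), m.group(1)) for m in HEME_BINDING.finditer(seq)],
        "exxr": [(m.start(), m.group(1)) for m in EXXR.finditer(seq)],
    }


# Hypothetical toy fragment, not a real BacCYP sequence.
TOY_SEQUENCE = "MTAEELRKLFGAGSRQCIGQ"

if __name__ == "__main__":
    for name, hits in find_p450_motifs(TOY_SEQUENCE).items():
        print(name, hits)
```

A scan like this only flags candidate P450-like sequences. As the abstract notes, sequence features alone do not consistently predict substrate specificity, so hits would still need genomic context and docking evidence.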

Keywords: data-driven analysis; functional characterization; genomic analysis; molecular docking; structural features; substrate discovery.

Grants and funding

The author(s) declare that financial support was received for the research, authorship, and/or publication of this article. CC and CS were supported by fellowships from CONICET. GS and JP were supported by fellowships from Agencia Nacional de Promoción Científica y Tecnológica (ANPCyT). MM and DP are members of CONICET. This work was supported by ANPCyT (PICT-2018-04663 to DP).