Background: Cytochrome P450 monooxygenases (termed CYPs or P450s) are hemoproteins ubiquitously found across all kingdoms, playing a central role in intracellular metabolism, especially in metabolism of drugs and xenobiotics. The explosive growth of genome sequencing brings a new set of challenges and issues for researchers, such as a systematic investigation of CYPs across all kingdoms in terms of identification, classification, and pan-CYPome analyses. Such investigation requires an automated tool that can handle an enormous amount of sequencing data in a timely manner.
Results: CYPminer was developed in the Python language to facilitate rapid, comprehensive analysis of CYPs from genomes of all kingdoms. CYPminer consists of two procedures i) to generate the Genome-CYP Matrix (GCM) that lists all occurrences of CYPs across the genomes, and ii) to perform analyses and visualization of the GCM, including pan-CYPomes (pan- and core-CYPome), CYP co-occurrence networks, CYP clouds, and genome clustering data. The performance of CYPminer was evaluated with three datasets from fungal and bacterial genome sequences.
Conclusions: CYPminer completes CYP analyses for large-scale genomes from all kingdoms, which allows systematic genome annotation and comparative insights for CYPs. CYPminer also can be extended and adapted easily for broader usage.
Keywords: CYP classification; CYP co-occurrence network; CYP identification; Cytochrome P450; Data analysis; Pan-CYPome; Python; Software.