Interpretation of machine learning-based prediction models and functional metagenomic approach to identify critical genes in HBCD degradation

J Hazard Mater. 2024 Dec 25:486:136976. doi: 10.1016/j.jhazmat.2024.136976. Online ahead of print.

Abstract

Hexabromocyclododecane (HBCD) poses significant environmental risks, but identifying HBCD-degrading microbes and their enzymatic mechanisms is challenging because microbial interactions and metabolic pathways are complex. This study aimed to identify critical genes involved in HBCD biodegradation through two approaches: functional annotation of metagenomes and interpretation of machine learning-based prediction models. Our functional analysis revealed a rich metabolic potential in Chiang Chun soil (CCS) metagenomes, particularly in carbohydrate metabolism. Among the machine learning algorithms tested, random forest models outperformed the others, especially when trained on datasets reflecting the degradation patterns of species such as Dehalococcoides mccartyi and Pseudomonas aeruginosa. These models highlighted enzymes such as EC 1.8.3.2 (thiol oxidase) and EC 4.1.1.43 (phenylpyruvate decarboxylase) as inhibitors of degradation, while EC 2.7.1.83 (pseudouridine kinase) was linked to enhanced degradation. This dual-methodology approach not only deepens our understanding of microbial functions in HBCD degradation but also provides an unbiased view of the microbial and enzymatic interactions involved, supporting more targeted and effective bioremediation strategies.
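
The model-interpretation step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the random forest models were built or interpreted, so everything here is an assumption: the data are synthetic, the effect sizes are invented only to mimic the reported directions, the "control" column is hypothetical, and the forest is a deliberately tiny one made of depth-1 regression trees, read out with permutation importance and a simple low-versus-high comparison. It is not the authors' pipeline and reproduces none of their results.

```python
import random
from statistics import fmean

rng = random.Random(0)

# Column labels reuse the EC numbers named in the abstract; the last column is
# a hypothetical control with no effect on the synthetic target.
FEATURES = ["EC 1.8.3.2", "EC 4.1.1.43", "EC 2.7.1.83", "control (no effect)"]
# ASSUMED effect sizes, invented only to mimic the directions in the abstract.
ASSUMED_EFFECTS = [-20.0, -15.0, 25.0, 0.0]


def make_data(n=200):
    """Synthetic enzyme abundances in [0, 1) and a synthetic degradation score."""
    X = [[rng.random() for _ in FEATURES] for _ in range(n)]
    y = [50.0 + sum(w * v for w, v in zip(ASSUMED_EFFECTS, row)) + rng.gauss(0, 1)
         for row in X]
    return X, y


def fit_stump(X, y, candidate_features):
    """Best single split (feature, threshold, left mean, right mean) by squared error."""
    best = None
    for j in candidate_features:
        values = sorted(row[j] for row in X)
        for q in range(1, 10):  # thresholds at the 10%..90% quantiles
            t = values[len(values) * q // 10]
            left = [yi for row, yi in zip(X, y) if row[j] <= t]
            right = [yi for row, yi in zip(X, y) if row[j] > t]
            if not left or not right:
                continue
            ml, mr = fmean(left), fmean(right)
            sse = sum((v - ml) ** 2 for v in left) + sum((v - mr) ** 2 for v in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, ml, mr)
    return best[1:]


def fit_forest(X, y, n_trees=100, max_features=2):
    """Bagged depth-1 trees, each restricted to a random subset of features."""
    forest, n = [], len(X)
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]              # bootstrap sample
        feats = rng.sample(range(len(FEATURES)), max_features)  # random feature subset
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return forest


def predict(forest, row):
    return fmean(ml if row[j] <= t else mr for j, t, ml, mr in forest)


def mse(forest, X, y):
    return fmean((predict(forest, row) - yi) ** 2 for row, yi in zip(X, y))


def permutation_importance(forest, X, y, j):
    """Increase in mean squared error after shuffling column j."""
    column = [row[j] for row in X]
    rng.shuffle(column)
    X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
    return mse(forest, X_perm, y) - mse(forest, X, y)


def direction(forest, X, j, low=0.1, high=0.9):
    """Mean change in prediction when feature j is moved from low to high."""
    def at(value):
        return fmean(predict(forest, row[:j] + [value] + row[j + 1:]) for row in X)
    return at(high) - at(low)


X, y = make_data()
forest = fit_forest(X, y)
importance = {name: permutation_importance(forest, X, y, j)
              for j, name in enumerate(FEATURES)}
effect = {name: direction(forest, X, j) for j, name in enumerate(FEATURES)}
for name in FEATURES:
    print(f"{name:20s} importance={importance[name]:8.2f} direction={effect[name]:+7.2f}")
```

In this toy setting, a negative direction corresponds to a feature the model associates with lower degradation and a positive direction to one associated with higher degradation. The importance value only indicates how much the model relies on the feature, not a causal role.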

Keywords: Biodegradation; Hexabromocyclododecane (HBCD); Machine Learning; Metagenomics.