Evaluation of Machine Learning Based QSAR Models for the Classification of Lung Surfactant Inhibitors

Environ Health (Wash). 2024 Sep 20;2(12):912-917. doi: 10.1021/envhealth.4c00118. eCollection 2024 Dec 20.

Abstract

Inhaled chemicals can cause dysfunction in the lung surfactant, a protein-lipid complex with critical biophysical and biochemical functions. This inhibition has many structure-related and dose-dependent mechanisms, making hazard identification challenging. We developed quantitative structure-activity relationship (QSAR) models for predicting lung surfactant inhibition using machine learning. We evaluated logistic regression, support vector machines, random forest, gradient-boosted trees, prior-data-fitted networks, and multilayer perceptron models. The multilayer perceptron performed best, with 96% accuracy and an F1 score of 0.97. Support vector machines and logistic regression also performed well at lower computational cost. This work serves as a proof of concept for efficient hazard screening in the emerging area of lung surfactant inhibition.
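
For readers unfamiliar with the two reported metrics, the following minimal sketch shows how accuracy and the F1 score are computed for a binary classifier (inhibitor vs. non-inhibitor) from confusion-matrix counts. The function name and the counts are hypothetical and chosen only for illustration; they are not taken from the study and do not reproduce its results.

```python
def accuracy_and_f1(tp: int, fp: int, fn: int, tn: int) -> tuple[float, float]:
    """Return (accuracy, F1) for a binary classifier from confusion-matrix counts.

    tp: inhibitors predicted as inhibitors
    fp: non-inhibitors predicted as inhibitors
    fn: inhibitors predicted as non-inhibitors
    tn: non-inhibitors predicted as non-inhibitors
    """
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("at least one prediction is required")
    # Accuracy: fraction of all predictions that are correct.
    accuracy = (tp + tn) / total
    # F1: harmonic mean of precision and recall, written in its count form.
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Hypothetical counts for illustration only (not data from the study).
acc, f1 = accuracy_and_f1(tp=8, fp=1, fn=1, tn=10)
print(f"accuracy={acc:.2f}, F1={f1:.2f}")  # accuracy=0.90, F1=0.89
```

Unlike accuracy, F1 ignores true negatives, so the two metrics can diverge when the classes are imbalanced. Reporting both gives a fuller picture of classifier performance.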