Machine learning-enhanced molecular network reveals global exposure to hundreds of unknown PFAS

Sci Adv. 2024 May 24;10(21):eadn1039. doi: 10.1126/sciadv.adn1039. Epub 2024 May 23.

Abstract

Unknown forever chemicals like per- and polyfluoroalkyl substances (PFAS) are difficult to identify. Current platforms designed for metabolites and natural products cannot capture the diverse structural characteristics of PFAS. Here, we report an automatic PFAS identification platform (APP-ID) that screens for PFAS in environmental samples using an enhanced molecular network and identifies unknown PFAS structures using machine learning. Our networking algorithm, which enhances characteristic fragment matches, has a lower false-positive rate (0.7%) than current algorithms (2.4 to 46%). Our support vector machine model identified unknown PFAS in a test set with 58.3% accuracy, surpassing current software. Further, APP-ID detected 733 PFAS in real fluorochemical wastewater, 39 of which were previously unreported in environmental media. Retrospective screening of 126 PFAS against public repository data from 20 countries shows that PFAS substitutes are prevalent worldwide.
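
The abstract does not say how the networking algorithm "enhances characteristic fragment matches". As a rough illustration of the general idea only, and not the APP-ID algorithm, the sketch below computes a cosine similarity between two MS/MS spectra in which peaks matching a short list of perfluoroalkyl fragment ions are up-weighted. The function names, the fragment list (CF3-, C2F5-, C3F7- in negative mode), the tolerance, and the `boost` factor are all assumptions made for this example.

```python
import math

# Assumed, illustrative list of characteristic fragment m/z values
# (negative mode): CF3-, C2F5-, C3F7-. Not taken from the paper.
CHARACTERISTIC_MZ = (68.9958, 118.9926, 168.9894)


def is_characteristic(mz, tol=0.005):
    """Return True if mz lies within tol of any listed fragment."""
    return any(abs(mz - ref) <= tol for ref in CHARACTERISTIC_MZ)


def weighted_cosine(spec_a, spec_b, tol=0.005, boost=3.0):
    """Cosine similarity between two spectra given as [(mz, intensity), ...].

    Peak weights are sqrt(intensity), multiplied by `boost` when the peak
    matches a characteristic fragment. Peaks are paired greedily: each peak
    in spec_a takes the closest unused peak in spec_b within tol.
    """
    def weigh(spec):
        return [
            (mz, math.sqrt(inten) * (boost if is_characteristic(mz, tol) else 1.0))
            for mz, inten in spec
        ]

    wa, wb = weigh(spec_a), weigh(spec_b)
    used = set()
    dot = 0.0
    for mz_a, w_a in wa:
        best = None
        for j, (mz_b, _) in enumerate(wb):
            if j in used or abs(mz_a - mz_b) > tol:
                continue
            if best is None or abs(mz_a - mz_b) < abs(mz_a - wb[best][0]):
                best = j
        if best is not None:
            used.add(best)
            dot += w_a * wb[best][1]

    norm_a = math.sqrt(sum(w * w for _, w in wa))
    norm_b = math.sqrt(sum(w * w for _, w in wb))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Two hypothetical spectra sharing only the CF3- fragment.
spectrum_a = [(68.9958, 100.0), (200.0, 100.0)]
spectrum_b = [(68.9958, 100.0), (300.0, 100.0)]
plain_score = weighted_cosine(spectrum_a, spectrum_b, boost=1.0)
boosted_score = weighted_cosine(spectrum_a, spectrum_b, boost=3.0)
```

With `boost=1.0` the function reduces to an ordinary square-root-weighted cosine score. Raising `boost` makes a shared characteristic fragment count for more, so two spectra that share it are more likely to be linked in a network built by thresholding this score.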

MeSH terms

  • Algorithms
  • Environmental Exposure
  • Environmental Monitoring / methods
  • Fluorocarbons* / chemistry
  • Humans
  • Machine Learning*
  • Support Vector Machine
  • Wastewater / chemistry
  • Water Pollutants, Chemical / analysis

Substances

  • Fluorocarbons
  • Water Pollutants, Chemical
  • Wastewater