We report the application of machine learning techniques to expedite the classification and analysis of protein unfolding trajectories from force spectroscopy data. Using kernel methods, logistic regression, and triplet loss, we developed a workflow called Forced Unfolding and Supervised Iterative Online (FUSION) learning, in which a user classifies a small number of repeatable unfolding patterns encoded as images and a machine is tasked with identifying similar images to classify the remaining data. We tested the workflow in two case studies on a multidomain XMod-Dockerin/Cohesin complex, first validating the approach on synthetic data generated with a Monte Carlo algorithm and then deploying it on experimental atomic force spectroscopy data. FUSION efficiently separated traces that passed quality filters from unusable ones, classified curves with high accuracy, and identified unfolding pathways that the user had not detected. This study demonstrates the potential of machine learning to accelerate data analysis and generate new insights in protein biophysics.
Keywords: atomic force microscopy; data analysis; iterative screening; machine learning; single-molecule biophysics.