CiRCus: A Framework to Enable Classification of Complex High-Throughput Experiments

J Proteome Res. 2019 Apr 5;18(4):1486-1493. doi: 10.1021/acs.jproteome.8b00724. Epub 2019 Mar 22.

Abstract

Despite the increasing use of high-throughput experiments in molecular biology, methods for evaluating and classifying the acquired results have not kept pace, so these tasks still require significant manual effort. Here, we present CiRCus, a framework to generate custom machine learning models to classify results from high-throughput proteomics binding experiments. We show the experimental procedure that guided us to the layout of this framework as well as the usage of the framework on an example data set consisting of 557 166 protein/drug binding curves, achieving an AUC of 0.9987. After applying our classifier, only 6% of the data might require manual investigation. CiRCus bundles two applications: a minimal interface to label a training data set (CindeR) and an interface for the generation of random forest classifiers with optional optimization of pretrained models (CurveClassification). CiRCus is available at https://github.com/kusterlab, accompanied by an in-depth user manual and video tutorial.
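
The two figures quoted in the abstract, an AUC and a fraction of curves left for manual investigation, can be illustrated with a minimal sketch. This is not the CiRCus implementation: the function names, the toy labels and scores, and the uncertainty band of 0.2 to 0.8 are all assumptions chosen for illustration. The scores stand in for the class probabilities a classifier such as a random forest might emit for each binding curve.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the share of
    (positive, negative) pairs that the scores rank correctly.
    Ties count as half. Pairwise O(n^2), fine for a toy example only."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def manual_review_fraction(scores: Sequence[float],
                           low: float = 0.2,
                           high: float = 0.8) -> float:
    """Fraction of scores inside the uncertain band [low, high].
    The band limits are hypothetical, not values from the paper."""
    if not scores:
        raise ValueError("scores must not be empty")
    uncertain = sum(1 for s in scores if low <= s <= high)
    return uncertain / len(scores)


# Toy data (invented): 1 = curve labelled as a binding event, 0 = not.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.95, 0.90, 0.40, 0.10, 0.05, 0.45, 0.85, 0.15]

auc = roc_auc(labels, scores)            # 15 of 16 pairs ranked correctly
review = manual_review_fraction(scores)  # 2 of 8 scores fall in the band
```

In this sketch, curves scoring near 0 or 1 would be accepted automatically and only those in the middle band would be passed to a human curator; how CiRCus itself decides which curves need manual investigation is described in its user manual, not here.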

Keywords: classification; competition binding; kinobeads; labeling; machine learning; proteomics.

MeSH terms

  • Algorithms
  • Binding, Competitive / physiology
  • Databases, Protein
  • High-Throughput Screening Assays / methods*
  • Machine Learning*
  • Protein Binding
  • Proteins / chemistry
  • Proteins / metabolism
  • Proteomics / methods*
  • Software*

Substances

  • Proteins