The era of big data: Genome-scale modelling meets machine learning

Comput Struct Biotechnol J. 2020 Oct 16:18:3287-3300. doi: 10.1016/j.csbj.2020.10.011. eCollection 2020.

Abstract

With omics data being generated at an unprecedented rate, genome-scale modelling has become pivotal to their organisation and analysis. However, machine learning methods have been gaining ground, whether in cases where knowledge is insufficient to represent the mechanisms underlying such data or as a means of data curation prior to mechanistic modelling. We discuss the latest advances in genome-scale modelling and the development of optimisation algorithms for network and error reduction, intracellular constraining and applications to strain design. We further review applications of supervised and unsupervised machine learning methods to omics datasets from microbial and mammalian cell systems, and present efforts to harness the potential of both modelling approaches through hybrid modelling.

Keywords: Cell metabolism; Chinese hamster ovary cells; Flux balance analysis; Hybrid modelling; Principal component analysis; Recombinant protein production; Strain optimisation.

Publication types

  • Review