Current pathway synthesis tools explore abstract metabolic and reaction network space to identify pathways that can be added to a host to produce a desired target molecule. However, few of these tools explore the gene-level information required to physically realize the identified synthesis pathways, and none explore enzyme-host compatibility. Tools that address this disconnect between the abstract reaction/metabolic design space and the physical genetic sequence design space will expedite experimental efforts by steering them away from unprofitable synthesis pathways. This work describes a workflow, termed Probabilistic Pathway Assembly with Solubility Confidence Scores (ProPASS), which links synthesis pathway construction with exploration of the physical design space as imposed by the availability of enzymes with predicted characterized activities within the host. Predicted protein solubility propensity scores serve as confidence levels that quantify the compatibility of each pathway enzyme with the host Escherichia coli (E. coli). This study also presents a database, termed Protein Solubility Database (ProSol DB), which provides solubility confidence scores in E. coli for 240,016 characterized enzymes obtained from UniProtKB/Swiss-Prot. The utility of ProPASS is demonstrated by generating genetic implementations of heterologous synthesis pathways in E. coli that target several commercially useful biomolecules.
Keywords: metabolic engineering; pathway design; pathway implementation; solubility; synthesis pathway; synthetic biology.
© 2019 Wiley Periodicals, Inc.