The genetic blueprint for the essential functions of life is encoded in DNA, which is transcribed and translated into proteins, the engines driving most of our metabolic processes. Recent advances in genome sequencing have unveiled a vast diversity of protein families, but compared with the massive search space of all possible amino acid sequences, the set of known functional families is tiny. One could say nature has a limited protein "vocabulary." A major question for computational biologists, therefore, is whether this vocabulary can be expanded to include useful proteins that went extinct long ago or have not (yet) evolved. By merging evolutionary algorithms, machine learning, and bioinformatics, we can develop highly customized "designer proteins." We dub this new subfield of computational evolution Evolutionary Algorithms Simulating Molecular Evolution: it employs evolutionary algorithms with DNA string representations, biologically accurate molecular evolution, and bioinformatics-informed fitness functions.
Keywords: artificial intelligence; biotechnology; computational biology; computational evolution; evolutionary algorithms; genetic programming; molecular evolution; proteomics.
© The Author(s) 2024. Published by Oxford University Press.