The Historical Evolution and Significance of Multiple Sequence Alignment in Molecular Structure and Function Prediction

Biomolecules. 2024 Nov 29;14(12):1531. doi: 10.3390/biom14121531.

Abstract

Multiple sequence alignment (MSA) has evolved into a fundamental tool in the biological sciences, playing a pivotal role in predicting molecular structures and functions. With broad applications in protein and nucleic acid modeling, MSAs continue to underpin advances across a range of disciplines. MSAs are foundational for traditional sequence comparison techniques and increasingly important for methods driven by artificial intelligence (AI). Recent AI breakthroughs, particularly in protein and nucleic acid structure prediction, rely heavily on the accuracy and efficiency of MSAs to enhance remote homology detection and guide spatial restraints. This review traces the historical evolution of MSA, highlighting its significance in molecular structure and function prediction. We cover the methodologies used for protein monomers, protein complexes, and RNA, and explore emerging AI-based approaches, such as protein language models, that complement or replace traditional MSAs in downstream tasks. By discussing the strengths, limitations, and applications of these methods, this review aims to give researchers insight into MSA's evolving role and help them make informed decisions in structure prediction research.

Keywords: RNA; deep learning; function prediction; multiple sequence alignment; pairwise sequence alignment; protein complex; protein language model; protein monomer; protein structure prediction.

Publication types

  • Review

MeSH terms

  • Artificial Intelligence
  • Computational Biology / methods
  • Humans
  • Models, Molecular
  • Proteins* / chemistry
  • RNA / chemistry
  • Sequence Alignment* / methods

Substances

  • Proteins
  • RNA