Genomics 2 Proteins portal: a resource and discovery tool for linking genetic screening outputs to protein sequences and structures

Nat Methods. 2024 Oct;21(10):1947-1957. doi: 10.1038/s41592-024-02409-0. Epub 2024 Sep 18.

Abstract

Recent advances in AI-based methods have revolutionized the field of structural biology. Concomitantly, high-throughput sequencing and functional genomics have generated genetic variants at an unprecedented scale. However, efficient tools and resources are needed to link these disparate data types: to 'map' variants onto protein structures, to better understand how the variation causes disease, and thereby to design therapeutics. Here we present the Genomics 2 Proteins portal ( https://g2p.broadinstitute.org/ ): a human proteome-wide resource that maps 20,076,998 genetic variants onto 42,413 protein sequences and 77,923 structures, with a comprehensive set of structural and functional features. Additionally, the Genomics 2 Proteins portal allows users to interactively upload protein residue-wise annotations (for example, variants and scores), as well as protein structures not available in databases, to establish the connection between genomics and proteins. The portal serves as an easy-to-use discovery tool for researchers and scientists to hypothesize the structure-function relationship between natural or synthetic variations and their molecular phenotypes.
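As an illustration of what mapping residue-wise variants onto a protein sequence and structure can involve, the following is a minimal Python sketch. It is not the portal's implementation or API. The function `map_variants`, the `MappedVariant` record, the one-letter variant notation (for example, `K2E`), the toy sequence, and the sequence-to-structure numbering table are all hypothetical and chosen only for the example.

```python
import re
from typing import Dict, List, NamedTuple, Optional


class MappedVariant(NamedTuple):
    """One variant placed on a sequence and, if possible, on a structure."""
    variant: str                      # original label, e.g. "K2E"
    position: int                     # 1-based position in the protein sequence
    ref: str                          # reference amino acid (one-letter code)
    alt: str                          # alternate amino acid (one-letter code)
    ref_matches: bool                 # True if the sequence has `ref` at `position`
    structure_residue: Optional[int]  # residue number in the structure, if resolved


# Hypothetical simplified notation: <ref><position><alt>, e.g. "K2E".
_VARIANT_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def map_variants(
    sequence: str,
    variants: List[str],
    seq_to_struct: Dict[int, int],
) -> List[MappedVariant]:
    """Map missense-style variant labels onto a sequence and a structure.

    `seq_to_struct` is an assumed lookup from 1-based sequence position to
    the residue number used in a structure file; positions missing from it
    are treated as unresolved in the structure.
    """
    mapped = []
    for label in variants:
        match = _VARIANT_RE.match(label)
        if match is None:
            raise ValueError(f"Unrecognized variant label: {label!r}")
        ref, pos, alt = match.group(1), int(match.group(2)), match.group(3)
        # Check that the reference residue agrees with the sequence.
        in_range = 1 <= pos <= len(sequence)
        ref_matches = in_range and sequence[pos - 1] == ref
        mapped.append(
            MappedVariant(label, pos, ref, alt, ref_matches, seq_to_struct.get(pos))
        )
    return mapped


if __name__ == "__main__":
    toy_sequence = "MKTAYIAK"       # made-up sequence, not a real protein
    toy_numbering = {2: 12, 4: 14}  # made-up structure residue numbering
    for row in map_variants(toy_sequence, ["K2E", "A4V", "G5D"], toy_numbering):
        print(row)
```

The reference-residue check is included because a mismatch between the variant label and the sequence usually signals a different isoform or numbering scheme, which is worth flagging before any structural interpretation.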

MeSH terms

  • Amino Acid Sequence
  • Databases, Protein*
  • Genetic Testing / methods
  • Genetic Variation
  • Genomics* / methods
  • Humans
  • Protein Conformation
  • Proteins / chemistry
  • Proteins / genetics
  • Proteome / genetics
  • Software

Substances

  • Proteins
  • Proteome