Incorporation of a unified protein abundance dataset into the Saccharomyces genome database

Robert S Nash; Shuai Weng; Kalpana Karra; Edith D Wong; Stacia R Engel; J Michael Cherry; SGD Project

doi:10.1093/database/baaa008

Database (Oxford). 2020 Jan 1:2020:baaa008. doi: 10.1093/database/baaa008.

Authors

Robert S Nash¹, Shuai Weng¹, Kalpana Karra¹, Edith D Wong¹, Stacia R Engel¹, J Michael Cherry¹; SGD Project

Affiliation

¹ Department of Genetics, Stanford University, 3165 Porter Drive, Palo Alto, CA 94304, USA.

Abstract

The identification and accurate quantitation of protein abundance have been major objectives of proteomics research. Abundance studies can deepen our understanding of protein function and regulation and can help identify cellular pathways and modules that operate under various environmental stress conditions. One of the central missions of the Saccharomyces Genome Database (SGD; https://www.yeastgenome.org) is to work with researchers to identify and incorporate datasets of interest to the wider scientific community, thereby enabling hypothesis-driven research. Many studies have generated proteome-wide abundance data, but deeper analyses of these data have been hampered by the inability to compare results between studies. Recently, a unified protein abundance dataset was generated through the evaluation of more than 20 abundance datasets, which were normalized and converted to a common measurement unit, molecules per cell. We have incorporated these normalized protein abundance data and associated metadata into the SGD database, as well as the SGD YeastMine data warehouse, resulting in the addition of 56 487 values for untreated cells grown in either rich or defined media and 28 335 values for cells treated with environmental stressors. Abundance data for protein-coding genes are displayed in a sortable, filterable table on Protein pages, available through Locus Summary pages. For each protein-coding gene, a median abundance value was incorporated, and a median absolute deviation was calculated and added to SGD. These values are displayed in the Protein section of the Locus Summary page. The inclusion of these data has enhanced the quality and quantity of protein experimental information presented at SGD and gives researchers the opportunity to access and use the data in their own work.
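
As an illustration of the summary statistics mentioned above, the following minimal sketch computes a median and a median absolute deviation (MAD) for one protein's abundance values. It assumes the unscaled definition of MAD (the median of absolute deviations from the median); the abstract does not state which variant SGD uses, and the function name and sample values are hypothetical.

```python
from statistics import median


def median_and_mad(values):
    """Return (median, MAD) for a non-empty list of abundance values.

    Assumption: MAD is the unscaled median of |x - median(x)|.
    """
    if not values:
        raise ValueError("values must be non-empty")
    center = median(values)
    mad = median(abs(v - center) for v in values)
    return center, mad


# Hypothetical abundance measurements (molecules per cell) for one protein,
# as might be reported by five different studies; not real SGD data.
example_values = [1200, 1500, 1350, 9000, 1400]
center, mad = median_and_mad(example_values)
print(center, mad)
```

Because both statistics are based on medians, a single outlying measurement (9000 in this made-up example) has little effect on either value, which is one common reason to prefer them over the mean and standard deviation when combining results from heterogeneous studies.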

Publication types

Research Support, N.I.H., Extramural

MeSH terms

Databases, Genetic
Genome, Fungal / genetics*
Genomics / methods
Internet
Proteome / genetics
Proteome / metabolism
Proteomics / methods
Saccharomyces cerevisiae / genetics*
Saccharomyces cerevisiae / metabolism
Saccharomyces cerevisiae Proteins / genetics*
Saccharomyces cerevisiae Proteins / metabolism
User-Computer Interface

Substances

Proteome
Saccharomyces cerevisiae Proteins
