Genomic and proteomic data were integrated in a proteogenomic workflow to identify coding genomic variants of the Human Embryonic Kidney 293 (HEK-293) cell line at the proteome level. Shotgun proteome data for HEK-293, both published previously by Geiger et al. (2012) and Chick et al. (2015) and obtained in this work, were searched against a customized genomic database generated from the exome data published by Lin et al. (2014). Overall, 112 unique variants were identified at the proteome level out of ∼1200 coding variants annotated in the exome. Seven of the identified variants were shared among all three proteomic datasets, and 27 variants were found in any two datasets. Some of the identified variants corresponded to widely known genomic polymorphisms of germline origin, while the others more likely resulted from somatic mutations. At least eight of the proteins bearing amino acid variants were annotated as cancer-related, including the p53 tumor suppressor. In all the shotgun datasets considered, variant peptides were 2.5 times less likely (a ratio of 1:2.5) to be identified than wild-type peptides when compared with the corresponding theoretical peptides. This can be explained by the presence of so-called "passenger" mutations in genes that were never expressed in HEK-293 cells. All MS data have been deposited in ProteomeXchange with the dataset identifier PXD002613 (http://proteomecentral.proteomexchange.org/dataset/PXD002613).
Keywords: Bioinformatics; Exome; HEK-293 cell line; Proteogenomics; Shotgun proteomics; Single nucleotide variant.
© 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
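The idea behind searching against a customized database can be illustrated with a minimal sketch, which is not the authors' pipeline. It assumes simple tryptic cleavage (after K or R, not before P) with no missed cleavages, and a single amino acid substitution given by a 1-based position. The function names, the `min_len` cutoff, and the toy sequence are hypothetical and do not come from the HEK-293 data. The sketch lists the peptides that exist only in the variant protein, i.e. the peptides whose identification would confirm the variant at the proteome level.

```python
import re


def tryptic_peptides(sequence, min_len=6):
    """In silico tryptic digest: cleave after K or R unless followed by P.

    Assumption: no missed cleavages; peptides shorter than min_len are dropped.
    """
    pieces = re.split(r"(?<=[KR])(?!P)", sequence)
    return [p for p in pieces if len(p) >= min_len]


def apply_substitution(sequence, position, ref, alt):
    """Return the sequence with ref replaced by alt at a 1-based position."""
    if sequence[position - 1] != ref:
        raise ValueError("reference residue does not match the sequence")
    return sequence[: position - 1] + alt + sequence[position:]


def variant_specific_peptides(sequence, position, ref, alt, min_len=6):
    """Peptides produced by the variant protein but not by the wild type."""
    wild_type = set(tryptic_peptides(sequence, min_len))
    mutant = tryptic_peptides(
        apply_substitution(sequence, position, ref, alt), min_len
    )
    return [p for p in mutant if p not in wild_type]


# Toy protein sequence (hypothetical, for illustration only).
toy_protein = "MASTEKLVGADGSRPLLNQEWK"

# A10T changes one residue inside an existing tryptic peptide.
print(variant_specific_peptides(toy_protein, 10, "A", "T"))

# G9R introduces a new cleavage site and therefore a new, shorter peptide.
print(variant_specific_peptides(toy_protein, 9, "G", "R"))
```

In a real workflow the variant sequences would be appended to the reference protein database before the search, and a variant with no variant-specific peptide of detectable length could not be confirmed by this approach.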