Significance analysis of proteomic data generated by LC-MS/MS is challenging owing to the large data variability that originates from biological, operational, and instrumental variation. Protein quantification by LC-MS/MS, on either an absolute or a relative scale, is often highly skewed, which limits model-based statistical inference. To address this, we developed an alternative nonparametric statistical algorithm (named the IQR algorithm) for significance analysis of temporal proteomic data and successfully applied it to find gefitinib-targeted transcription factors and coregulators in epidermal growth factor (EGF)-stimulated HeLa cells. Our strategy relies on a reference group composed of more than a dozen datasets collected at different experimental times and thus accurately captures biological variation measured on a quartile scale. The algorithm considers six categories and calculates signal strength when performing significance analysis of proteins of different abundances. This stratified strategy allows confident identification of well-characterized EGF responders (e.g., EGR1, JUN, FOSB, BHLHE40, NR4A1, and NR4A2) and unexplored gefitinib-induced transcription factors and coregulators in HeLa cells. Gene set enrichment analysis validated the ErbB signaling pathway as the major inhibitory target of gefitinib. The identification of several gefitinib-inducible transcription factors implicates alternative signaling pathways as potentially druggable pathways in gefitinib-resistant or gefitinib-insensitive patients.
Keywords: EGF signaling; HeLa; gefitinib; mass spectrometry; transcription factors.
© 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.