Background: Aberrant protein localization is a prominent feature of many human diseases and can impair the function of specific tissues and organs. High-throughput technologies, which continue to advance through successive generations of automated equipment and developments in bioinformatics, enable the acquisition of large-scale, pattern-rich data. This in turn allows a wider range of methods to be used to extract useful patterns and knowledge from the data.
Methods: The proposed Sc2promap (Spatial and Channel for SubCellular Protein Localization Mapping) model is designed to extract meaningful features from a large repository of single-channel grayscale protein images for protein localization analysis and clustering. Sc2promap incorporates a prediction head enriched with supplementary protein annotations and integrates a spatial-channel attention mechanism within the encoder, enabling the generation of high-resolution protein localization maps that capture fundamental characteristics of cells, including basic cellular localizations such as nuclear and non-nuclear domains.
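To illustrate the general idea of spatial-channel attention mentioned above, the following is a minimal sketch in plain Python. It is not the Sc2promap encoder: the function names (`channel_attention`, `spatial_attention`, `spatial_channel_attention`), the sigmoid gating, the use of mean pooling, and the absence of learned weights are all assumptions made for illustration. A feature map is assumed to be a nested list indexed as `[channel][row][column]`.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(fmap):
    """Scale each channel by a gate derived from its global average.

    Assumption: a real model would pass the pooled values through
    learned layers; here the gate is just sigmoid(mean).
    """
    out = []
    for channel in fmap:
        n = sum(len(row) for row in channel)
        mean = sum(sum(row) for row in channel) / n
        gate = sigmoid(mean)
        out.append([[v * gate for v in row] for row in channel])
    return out


def spatial_attention(fmap):
    """Scale each pixel by a gate derived from its mean across channels.

    Assumption: a real model would typically apply a learned
    convolution to the pooled map before the sigmoid.
    """
    c, h, w = len(fmap), len(fmap[0]), len(fmap[0][0])
    gates = [
        [sigmoid(sum(fmap[k][i][j] for k in range(c)) / c) for j in range(w)]
        for i in range(h)
    ]
    return [
        [[fmap[k][i][j] * gates[i][j] for j in range(w)] for i in range(h)]
        for k in range(c)
    ]


def spatial_channel_attention(fmap):
    """Apply channel gating first, then spatial gating (hypothetical order)."""
    return spatial_attention(channel_attention(fmap))


# Toy feature map: 2 channels, each 2x2 (values are illustrative only).
toy = [[[0.0, 0.0], [0.0, 0.0]], [[2.0, 2.0], [2.0, 2.0]]]
attended = spatial_channel_attention(toy)
```

In this sketch both gates lie strictly between 0 and 1, so the output keeps the input shape while reweighting which channels and which positions contribute most. The ordering of the two gates and the pooling choice are design decisions that the abstract does not specify.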
Results: Qualitative and quantitative comparisons were conducted using internal and external clustering evaluation metrics, as well as various facets of the clustering results. The study also examined the individual components of the model. The results indicate that Sc2promap outperforms previous methods.
Conclusions: The combination of the attention mechanism and the prediction head enables the model to excel in protein localization clustering and analysis tasks.
General significance: The model effectively enhances the capability to extract features and knowledge from protein fluorescence images.
Keywords: Clustering; Endogenous labeled proteins; Large-scale protein image analysis; Spatial and channel attention; Subcellular protein localization; Vector quantization.
Copyright © 2024 Elsevier B.V. All rights reserved.