Joint coordinate attention mechanism and instance normalization for COVID online comments text classification

PeerJ Comput Sci. 2024 Aug 19:10:e2240. doi: 10.7717/peerj-cs.2240. eCollection 2024.

Abstract

Background: Most existing text classification methods focus on extracting highly discriminative feature representations from texts, a process that can be computationally inefficient. To address this limitation, this study proposes a novel approach that uses label information directly to construct text representations, so that label data are exploited alongside the textual content.

Methods: The method began with separate pre-processing of texts and labels, which were then encoded through a projection layer. A conventional self-attention model, enhanced with instance normalization (IN) and Gaussian Error Linear Unit (GELU) activation functions, was then used to assess emotional valence in review texts. An advanced self-attention mechanism was further developed to integrate text and label information efficiently. In the final stage, an adaptive label encoder extracted the relevant label information from the combined text-label data.
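The self-attention step with IN and GELU described in the Methods can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the abstract does not state how the components are ordered, so the attention, then IN, then GELU ordering is an assumption. The identity query/key/value projections and the function names (`gelu`, `instance_norm`, `self_attention`, `attention_block`) are likewise illustrative choices.

```python
import math

def gelu(x):
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

def instance_norm(seq, eps=1e-5):
    """Normalize each feature channel over the token axis of ONE sample.

    seq is a list of tokens, each token a list of feature values.
    """
    n_tokens, n_feat = len(seq), len(seq[0])
    out = [[0.0] * n_feat for _ in range(n_tokens)]
    for j in range(n_feat):
        col = [tok[j] for tok in seq]
        mean = sum(col) / n_tokens
        var = sum((v - mean) ** 2 for v in col) / n_tokens
        for i in range(n_tokens):
            out[i][j] = (seq[i][j] - mean) / math.sqrt(var + eps)
    return out

def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(seq):
    """Scaled dot-product self-attention.

    Assumption: identity projections (Q = K = V = seq) to keep the sketch short.
    """
    d = len(seq[0])
    out = []
    for q in seq:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in seq]
        weights = softmax(scores)
        out.append([sum(weights[t] * seq[t][j] for t in range(len(seq)))
                    for j in range(d)])
    return out

def attention_block(seq):
    """Assumed ordering: self-attention, then instance norm, then GELU."""
    normed = instance_norm(self_attention(seq))
    return [[gelu(v) for v in tok] for tok in normed]

# Toy input: 3 tokens with 2 features each (values are made up).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
block_out = attention_block(tokens)
```

In a real model the projections would be learned and the block would run on batched tensors; the sketch only shows how IN rescales each feature channel within one sample before the GELU nonlinearity is applied.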

Results: Empirical evaluations demonstrate that the proposed model significantly improves classification performance and outperforms existing methods, as evidenced by its higher micro-F1 score. This indicates that integrating label information into text classification is effective, and suggests that the model both addresses computational inefficiencies and improves classification accuracy.
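For readers unfamiliar with the metric reported above, micro-F1 pools true positives, false positives, and false negatives over all documents and labels before computing a single F1 value. The sketch below illustrates that definition. The function name `micro_f1` and the toy labels are hypothetical and are not taken from the paper's data.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1 over documents.

    y_true and y_pred are lists of label sets, one set per document.
    """
    tp = fp = fn = 0
    for true_labels, pred_labels in zip(y_true, y_pred):
        tp += len(true_labels & pred_labels)   # labels predicted and correct
        fp += len(pred_labels - true_labels)   # labels predicted but wrong
        fn += len(true_labels - pred_labels)   # labels missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# Hypothetical toy labels, for illustration only.
gold = [{"positive"}, {"negative"}, {"positive", "neutral"}]
pred = [{"positive"}, {"positive"}, {"positive"}]
score = micro_f1(gold, pred)  # TP=2, FP=1, FN=2, so F1 = 4/7
```

Because the counts are pooled before averaging, frequent labels weigh more heavily in micro-F1 than in macro-F1, which averages per-label scores.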

Keywords: COVID online reviews; Coordinate attention mechanism; Gaussian Error Linear Unit; Instance normalization; Text classification.

Grants and funding

This research is supported by the Shandong Social Science Planning Fund Program (Grant No. 21BTQJ02). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.