Multi-feature fusion-based consumer perceived risk prediction and its interpretability study

PLoS One. 2025 Jan 3;20(1):e0316277. doi: 10.1371/journal.pone.0316277. eCollection 2025.

Abstract

E-commerce faces challenges such as content homogenization and high perceived risk among users. This paper aims to predict perceived risk in different contexts by analyzing review content and website information. Using a dataset of 262,752 online reviews, we apply the KeyBERT-TextCNN model to extract thematic features from the review content. We then combine these thematic features with product and merchant characteristics and build a predictive model for perceived risk with the PCA-K-medoids-XGBoost algorithm. In the feature extraction phase, we identify 11 key features that influence perceived risk in online shopping. In the prediction phase, the model performs well across the different sample types in the test set, achieving a precision (P) of 84%, a recall (R) of 86%, and an F1 score of 85%. Interpretability analysis of the model shows that quality, functionality, and price are the key features affecting perceived risk for electronic products, whereas for skincare products skin safety is the most critical feature. In addition, feature characteristics differ significantly between high-risk samples and normal samples.
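The three test-set metrics quoted in the abstract are linked: F1 is the harmonic mean of precision and recall. The short sketch below checks that the reported values agree with one another. The function name `f1_score` and the variable names are illustrative choices and do not come from the paper; only the two input percentages are taken from the abstract.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        # Convention: F1 is 0 when both precision and recall are 0.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract for the test set.
reported_precision = 0.84
reported_recall = 0.86

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 = {f1:.2f}")  # prints "F1 = 0.85"
```

The result rounds to the 85% stated in the abstract, so the three reported figures are mutually consistent.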

MeSH terms

  • Algorithms*
  • Consumer Behavior
  • Humans
  • Internet
  • Perception
  • Risk
  • Risk Assessment / methods

Grants and funding

This research was funded by the Project of Cultivation for Young Top-notch Talents of Beijing Municipal Institutions (BPHR202203235) and the National Key Research and Development Program Project (2023YFC3304905). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.