DyGraphformer: Transformer combining dynamic spatio-temporal graph network for multivariate time series forecasting

Neural Netw. 2025 Jan:181:106776. doi: 10.1016/j.neunet.2024.106776. Epub 2024 Oct 17.

Abstract

Transformer-based models demonstrate tremendous potential for Multivariate Time Series (MTS) forecasting due to their ability to capture long-term temporal dependencies with the self-attention mechanism. However, effectively modeling the spatial correlation across series remains a challenge for the Transformer. Graph Neural Networks (GNNs) are well suited to modeling spatial dependencies across series, but existing methods assume static relationships between variables, which does not align with the time-varying spatial dependencies of real-world series. Therefore, we propose DyGraphformer, which integrates graph convolution into the Transformer to help it model spatial dependencies effectively, while dynamically inferring time-varying spatial dependencies from historical spatial information. DyGraphformer discards the decoder module, which involves complex recursion, to accelerate model execution. First, the input is embedded with DSW (Dimension Segment Wise) embedding, which integrates position and node-level embeddings to preserve temporal and spatial information. Then, a temporal self-attention layer and a dynamic graph convolutional layer are constructed to capture the temporal and spatial dependencies of the multivariate time series, respectively. The dynamic graph convolutional layer uses a Gated Recurrent Unit (GRU) to retain historical spatial dependencies and integrates the series features of the current time step to infer the graph structure in multiple subspaces. Finally, to fully exploit spatio-temporal information at different scales, DyGraphformer performs hierarchical encoder learning for the final forecast. Extensive experiments on seven real-world datasets demonstrate that DyGraphformer outperforms state-of-the-art baselines, including both Transformer-based and GNN-based methods.
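
The abstract's central mechanism is the dynamic graph convolutional layer: a GRU carries historical spatial information forward, the current series features are fused with that state, and an adjacency structure is inferred in several subspaces before graph convolution. The sketch below illustrates that idea in PyTorch-style Python; all names, shapes, and the single-layer structure are assumptions for illustration, not the authors' implementation.

    # Minimal sketch (assumed structure, not the paper's code): GRU keeps a running
    # spatial state, current features + state infer a per-head adjacency, then a
    # simple graph convolution aggregates neighbour features.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class DynamicGraphConv(nn.Module):
        def __init__(self, d_model: int, n_heads: int = 4):
            super().__init__()
            self.n_heads = n_heads
            self.d_head = d_model // n_heads
            # GRU cell that accumulates historical node-level (spatial) information.
            self.gru = nn.GRUCell(d_model, d_model)
            # Projections used to infer the graph structure in each subspace (head).
            self.q_proj = nn.Linear(d_model, d_model)
            self.k_proj = nn.Linear(d_model, d_model)
            # Feature transform applied after message passing.
            self.out_proj = nn.Linear(d_model, d_model)

        def forward(self, x: torch.Tensor, h: torch.Tensor):
            # x: (batch, n_nodes, d_model) current-step node features
            # h: (batch, n_nodes, d_model) hidden state with historical spatial info
            b, n, d = x.shape
            # Update the per-node hidden state with the current features.
            h_new = self.gru(x.reshape(b * n, d), h.reshape(b * n, d)).reshape(b, n, d)
            # Fuse current features with historical state for structure inference.
            z = x + h_new
            q = self.q_proj(z).reshape(b, n, self.n_heads, self.d_head).transpose(1, 2)
            k = self.k_proj(z).reshape(b, n, self.n_heads, self.d_head).transpose(1, 2)
            # Per-head inferred adjacency: (batch, heads, n_nodes, n_nodes).
            adj = F.softmax(q @ k.transpose(-1, -2) / self.d_head ** 0.5, dim=-1)
            # Graph convolution: aggregate neighbour features under each inferred graph.
            v = x.reshape(b, n, self.n_heads, self.d_head).transpose(1, 2)
            out = (adj @ v).transpose(1, 2).reshape(b, n, d)
            return self.out_proj(out), h_new

    # Usage: one step over a batch of 8 windows with 7 variables and 64-d features.
    layer = DynamicGraphConv(d_model=64)
    x, h = torch.randn(8, 7, 64), torch.zeros(8, 7, 64)
    y, h = layer(x, h)
    print(y.shape)  # torch.Size([8, 7, 64])

In the full model this layer would sit alongside the temporal self-attention layer inside each encoder block, with the hidden state h propagated across time steps so the inferred graph can drift as spatial dependencies change.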

Keywords: Attention mechanism; Dynamic spatio-temporal graph; Graph neural networks; Multivariate time series; Transformer.

MeSH terms

  • Algorithms
  • Forecasting*
  • Multivariate Analysis
  • Neural Networks, Computer*
  • Spatio-Temporal Analysis
  • Time Factors