A skipping spectrum sensing scheme based on deep reinforcement learning for transform domain communication systems

Sci Rep. 2024 Dec 28;14(1):31463. doi: 10.1038/s41598-024-83140-w.

Abstract

Spectrum sensing is a key technology and prerequisite for Transform Domain Communication Systems (TDCS). The traditional approach selects a working sub-band, keeps it fixed, and performs spectrum sensing periodically. This approach has two main drawbacks. First, if the selected working band has few idle channels, TDCS devices cannot switch sub-bands flexibly, which reduces performance. Second, periodic sensing consumes time and energy, which limits TDCS transmission efficiency. In contrast to previous studies, which unrealistically modeled the problem as a Markov Decision Process (MDP), this study accounts for the fact that TDCS devices cannot fully observe the entire spectrum state and must rely on historical observations, along with the current state of sub-bands, to make informed decisions. We therefore model the problem as a Partially Observable Markov Decision Process (POMDP). Moreover, we consider both the number of skipped time slots and the selection of idle sub-bands, and we establish distinct termination conditions for each action. Assigning different weights to balance sensing overhead against spectrum utilization, while reducing conflicts, improves the algorithm's adaptability and performance. To address the Q-value overestimation that arises in the traditional Deep Recurrent Q-Network (DRQN) from its use of a single network, we propose a DDRQN-BandShift strategy that combines Double Deep Q-Network (DDQN) and DRQN. Simulation results show that the proposed scheme significantly improves TDCS transmission efficiency while effectively reducing sensing costs.
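The overestimation issue mentioned above can be illustrated with a minimal sketch of the generic Double Q-learning target, compared with a single-network target. This is not the paper's DDRQN-BandShift implementation: the function names, the toy Q-value arrays, and the discount factor are assumptions chosen for illustration, and in a recurrent setting the Q-values would come from a network fed with the observation history rather than from fixed arrays.

```python
import numpy as np


def single_network_target(reward, done, q_next, gamma=0.99):
    """Single-network target: one set of estimates both selects and evaluates
    the next action, so the max operator tends to pick up positive noise."""
    return reward + gamma * (1.0 - done) * q_next.max(axis=1)


def double_q_target(reward, done, q_next_online, q_next_target, gamma=0.99):
    """Double-Q target: the online network selects the next action and the
    target network evaluates it, decoupling selection from evaluation."""
    best_action = q_next_online.argmax(axis=1)            # selection (online net)
    rows = np.arange(best_action.shape[0])
    evaluated = q_next_target[rows, best_action]          # evaluation (target net)
    return reward + gamma * (1.0 - done) * evaluated


# Hypothetical toy batch: 3 transitions, 3 actions each.
reward = np.array([1.0, 0.0, 0.5])
done = np.array([0.0, 0.0, 1.0])                          # 1.0 marks a terminal step
q_next_online = np.array([[0.2, 0.9, 0.1],
                          [0.5, 0.4, 0.3],
                          [0.0, 0.1, 0.2]])
q_next_target = np.array([[0.8, 0.3, 0.1],
                          [0.6, 0.7, 0.2],
                          [0.3, 0.2, 0.1]])

y_single = single_network_target(reward, done, q_next_target, gamma=0.9)
y_double = double_q_target(reward, done, q_next_online, q_next_target, gamma=0.9)
```

When both targets are evaluated with the same target-network estimates, the double-Q target can never exceed the single-network target, because the value of the action chosen by the online network is at most the maximum over all actions. That property is what curbs the upward bias.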

Keywords: Double Deep Recurrent Q-Network; Dynamic spectrum access; Partially observable Markov decision process; Spectrum sensing; Transform domain communication system.