Siamese-based trackers have driven much of the recent progress in object tracking. Most work, however, emphasizes tracking precision and neglects model complexity, which can limit real-time performance and make these trackers unsuitable for some applications. This study presents SiamGCN, a novel lightweight Siamese network tracker that combines global feature fusion with a lightweight network architecture to improve tracking performance on resource-constrained devices. MobileNet-V3 serves as the backbone network for feature extraction, with the stride of its final layer modified to improve extraction efficiency. A global correlation module, based on the Transformer architecture, uses a multi-head cross-attention mechanism to fuse template and search-region features, enabling more precise and robust tracking. The model was evaluated on four widely used tracking benchmarks: VOT2018, VOT2019, LaSOT, and TrackingNet. The results indicate that SiamGCN achieves high tracking performance while reducing both the number of parameters and the computational cost, yielding significant benefits in processing speed and resource utilization.
Keywords: Siamese network; cross-attention; lightweight; object tracking.