Semantic Segmentation of Satellite Images for Landslide Detection Using Foreground-Aware and Multi-Scale Convolutional Attention Mechanism

Chih-Chang Yu; Yuan-Di Chen; Hsu-Yung Cheng; Chi-Lun Jiang

doi:10.3390/s24206539

Sensors (Basel). 2024 Oct 10;24(20):6539. doi: 10.3390/s24206539.

Authors

Chih-Chang Yu¹, Yuan-Di Chen², Hsu-Yung Cheng², Chi-Lun Jiang²

Affiliations

¹ Department of Information and Computer Engineering, Chung Yuan Christian University, Taoyuan City 320, Taiwan.
² Department of Computer Science and Information Engineering, National Central University, Taoyuan City 320, Taiwan.

Abstract

Advances in satellite and aerial imaging have made high-resolution remote sensing images easier to obtain, leading to widespread research and applications in various fields. Semantic segmentation of remote sensing images is a crucial task that provides semantic and localization information for target objects. In addition to the large scale variations common to most semantic segmentation datasets, aerial images present unique challenges, including high background complexity and imbalanced foreground-background ratios. However, general semantic segmentation methods primarily address scale variations in natural scenes and often neglect the challenges specific to remote sensing images, which leaves the foreground inadequately modeled. In this paper, we present a foreground-aware remote sensing semantic segmentation model. The model introduces a multi-scale convolutional attention mechanism and uses a feature pyramid network architecture to extract multi-scale features, addressing the multi-scale problem. Additionally, we introduce a Foreground-Scene Relation Module to mitigate false alarms; this module enhances the foreground features by modeling the relationship between the foreground and the scene. In the loss function, a Soft Focal Loss is employed to focus on foreground samples during training, alleviating the foreground-background imbalance issue. Experimental results indicate that our proposed method outperforms current state-of-the-art general semantic segmentation methods and transformer-based methods on the LS dataset benchmark.
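
The abstract does not give the formulation of the Soft Focal Loss, so the sketch below does not reproduce it. As a stand-in, it shows the standard binary focal loss, which illustrates the general idea of down-weighting easy pixels so that training concentrates on hard samples such as rare foreground pixels. The function name `binary_focal_loss` and the defaults `gamma=2.0` and `alpha=0.25` are assumptions chosen for illustration, not values taken from the paper.

```python
import math


def binary_focal_loss(probs, labels, gamma=2.0, alpha=0.25, eps=1e-7):
    """Mean standard binary focal loss over a flat list of pixels.

    probs  -- predicted foreground probabilities, each in [0, 1]
    labels -- ground-truth labels, 1 for foreground and 0 for background
    gamma  -- focusing parameter; larger values down-weight easy pixels more
    alpha  -- weight for foreground pixels (background gets 1 - alpha)

    Illustrative stand-in only: this is NOT the paper's Soft Focal Loss.
    """
    if len(probs) != len(labels):
        raise ValueError("probs and labels must have the same length")
    if not probs:
        raise ValueError("at least one pixel is required")

    total = 0.0
    for p, y in zip(probs, labels):
        # Clamp to avoid log(0).
        p = min(max(p, eps), 1.0 - eps)
        # p_t is the probability assigned to the true class.
        p_t = p if y == 1 else 1.0 - p
        alpha_t = alpha if y == 1 else 1.0 - alpha
        # (1 - p_t) ** gamma shrinks the loss of well-classified pixels.
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(probs)


if __name__ == "__main__":
    easy_background = binary_focal_loss([0.05], [0])
    hard_foreground = binary_focal_loss([0.2], [1])
    print(easy_background, hard_foreground)
```

With these defaults, a confidently classified background pixel contributes far less to the loss than a poorly classified foreground pixel, which is the behavior a foreground-focused loss relies on when background pixels dominate the image.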

Keywords: convolutional attention mechanism; multi-scale features fusion; remote sensing; semantic segmentation.

Grants and funding

112-2221-E-008-069-MY3/National Science and Technology Council, Taiwan