An Iterative Pseudo Label Generation framework for semi-supervised hyperspectral image classification using the Segment Anything Model

Front Plant Sci. 2024 Dec 23:15:1515403. doi: 10.3389/fpls.2024.1515403. eCollection 2024.

Abstract

Hyperspectral image classification in remote sensing often encounters challenges due to limited annotated data. Semi-supervised learning methods present a promising solution, but their performance depends heavily on the quality of pseudo labels. This limitation is particularly pronounced during the early stages of training, when the model lacks adequate prior knowledge. In this paper, we propose an Iterative Pseudo Label Generation (IPG) framework based on the Segment Anything Model (SAM) to harness structural prior information for semi-supervised hyperspectral image classification. We begin by using a small number of annotated labels as SAM point prompts to generate initial segmentation masks. Next, we introduce a spectral voting strategy that aggregates segmentation masks from multiple spectral bands into a unified mask. To ensure the reliability of pseudo labels, we design a spatial-information-consistency-driven loss function that optimizes IPG to adaptively select the most dependable pseudo labels from the unified mask. These selected pseudo labels then serve as point prompts for SAM in the next iteration. After a suitable number of iterations, the resulting pseudo labels can be used to enrich the training data for the classification model. Experiments conducted on the Indian Pines and Pavia University datasets demonstrate that even a simple 2D-CNN-based classification model trained with our generated pseudo labels significantly outperforms eight state-of-the-art hyperspectral image classification methods.
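The spectral voting step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the voting rule, so the code below assumes a plain per-pixel majority vote over per-band label masks. The function name `spectral_vote`, the `min_agreement` threshold, and the use of `-1` for unlabeled pixels are all hypothetical choices for illustration, not the authors' implementation.

```python
from collections import Counter


def spectral_vote(band_masks, min_agreement=0.5):
    """Fuse per-band label masks into one mask by per-pixel majority vote.

    Assumed (hypothetical) conventions, not taken from the paper:
    - band_masks is a list of equally shaped 2D lists, one per spectral band;
    - each entry is a class id, or -1 where that band assigns no label;
    - a pixel keeps the winning class only if more than `min_agreement`
      of all bands voted for it; otherwise it stays unlabeled (-1).
    """
    n_bands = len(band_masks)
    rows, cols = len(band_masks[0]), len(band_masks[0][0])
    unified = [[-1] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Count votes from bands that actually labeled this pixel.
            votes = Counter(m[r][c] for m in band_masks if m[r][c] != -1)
            if not votes:
                continue
            label, count = votes.most_common(1)[0]
            if count / n_bands > min_agreement:
                unified[r][c] = label
    return unified


# Three toy 2x2 "bands": the first row is agreed on by a majority of bands,
# while the second row lacks majority support and stays unlabeled.
band_masks = [
    [[0, 1], [-1, 2]],
    [[0, 1], [-1, 1]],
    [[0, 2], [1, -1]],
]
unified_mask = spectral_vote(band_masks)
```

Under this assumed rule, pixels on which the bands disagree remain unlabeled, which loosely mirrors the abstract's emphasis on keeping only dependable pseudo labels. The actual selection in IPG is driven by a learned spatial-consistency loss and is not reproduced here.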

Keywords: Segment Anything Model; hyperspectral image classification; pseudo label generation; remote sensing; semi-supervised learning.

Grants and funding

The author(s) declare that financial support was received for the research, authorship, and/or publication of this article. This work was supported in part by the National Key Research and Development Program of China under Grant 2022ZD0117400, in part by the National Natural Science Foundation of China under Grant 62132002, and in part by the Fundamental Research Funds for the Central Universities under Grants 501QYJC2024115010 and 502GWXM2024115007.