Cloud detection sample generation algorithm for nighttime satellite imagery based on daytime data and machine learning application

Sci Rep. 2024 Nov 13;14(1):27917. doi: 10.1038/s41598-024-78889-z.

Abstract

Highly accurate nighttime cloud detection in satellite imagery is challenging due to the absence of visible to near-infrared (0.38-3 μm, VNI) data, which is critical for distinguishing clouds from other ground features. Machine learning (ML) techniques can exploit the limited wavelength information more effectively and achieve high-accuracy cloud detection when trained on large sample volumes. However, accurately distinguishing cloud pixels solely through thermal infrared bands (8-14 μm, TIR) is difficult, and acquiring numerous, high-quality and representative samples of nighttime images for ML proves unattainable in practice. Considering the thermal infrared radiative transfer process and the fact that TIR radiance has the same source in daytime and nighttime, we propose a sample generation idea that uses daytime images to provide samples for nighttime cloud detection. This differs from traditional sample construction methods (e.g., manual labeling, simulation, and transfer learning) and can obtain samples efficiently. Based on this idea, nighttime cloud detection experiments were carried out for the MODIS, GF-5 (02) and Himawari-8 satellites. The results were validated against the Lidar cloud product and manual labels, and show that our nighttime cloud detection result has higher accuracy than MYD35 (78.17%) and the ML model trained on nighttime manual labels (75.86%). The accuracies of the three sensors are 82.19%, 88.71%, and 79.34%, respectively. Moreover, we validated and discussed the performance of our algorithm over various surface types (vegetation, urban, barren and water). The results showed that accuracy varied with surface type and was lowest over barren surfaces for all three sensors, but remained high overall. Our study can provide a novel perspective on nighttime cloud detection in multi-spectral satellite imagery.

Keywords: GF-5 (02); Himawari-8; LightGBM (LGB); MODIS (Aqua); Nighttime cloud detection; Thermal infrared (TIR).
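
Illustrative sketch (not the authors' implementation): the sample generation idea in the abstract can be outlined as three steps: label daytime pixels using VNI information, train a classifier on the TIR features of those same pixels, then apply it to nighttime pixels where only TIR is available. The example below is a toy version under loud assumptions: all pixel values are synthetic, the VNI threshold of 0.35 is hypothetical, a single 11 μm brightness temperature stands in for the TIR bands, and a one-threshold decision stump replaces the LightGBM model named in the keywords so that the code needs only the Python standard library.

```python
import random


def make_pixels(n, seed):
    """Synthetic pixels as (vni_reflectance, bt11_kelvin, is_cloud).

    All distributions are invented for illustration: clouds are made
    brighter in VNI and colder at 11 um than clear-sky surfaces.
    """
    rng = random.Random(seed)
    pixels = []
    for _ in range(n):
        cloud = rng.random() < 0.5
        vni = rng.gauss(0.60, 0.10) if cloud else rng.gauss(0.15, 0.08)
        bt11 = rng.gauss(255.0, 10.0) if cloud else rng.gauss(288.0, 8.0)
        pixels.append((vni, bt11, cloud))
    return pixels


def label_from_daytime(vni, threshold=0.35):
    """Hypothetical daytime labelling rule: bright in VNI means cloud."""
    return vni > threshold


def fit_stump(bt_values, labels):
    """Pick the threshold t that best fits the rule 'cloud if bt < t'."""
    best_t, best_acc = None, -1.0
    for t in sorted(set(bt_values)):
        correct = sum((bt < t) == y for bt, y in zip(bt_values, labels))
        acc = correct / len(labels)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t


def run_demo():
    # Step 1: daytime scene, labels derived from VNI only.
    day = make_pixels(1000, seed=0)
    day_labels = [label_from_daytime(vni) for vni, _, _ in day]
    # Step 2: train on the TIR feature of the same daytime pixels.
    threshold = fit_stump([bt for _, bt, _ in day], day_labels)
    # Step 3: nighttime scene, VNI is ignored (unavailable at night).
    night = make_pixels(1000, seed=1)
    hits = sum((bt < threshold) == truth for _, bt, truth in night)
    # The synthetic truth flag stands in for an independent reference
    # such as a lidar cloud product.
    return threshold, hits / len(night)


if __name__ == "__main__":
    t, acc = run_demo()
    print(f"learned BT11 threshold: {t:.1f} K, nighttime accuracy: {acc:.3f}")
```

In this toy setting the daytime and nighttime pixels are drawn from the same TIR distribution, which is the simplifying assumption that makes daytime-derived samples transferable; real scenes differ in surface temperature between day and night, so the accuracy printed here says nothing about the figures reported in the abstract.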