Seatizen Atlas: a collaborative dataset of underwater and aerial marine imagery

Sci Data. 2025 Jan 14;12(1):67. doi: 10.1038/s41597-024-04267-z.

Abstract

Citizen science initiatives have a worldwide impact on environmental research by providing data at a global scale and high resolution. Mapping marine biodiversity remains a key challenge to which citizen initiatives can contribute. Here we describe a dataset of underwater and aerial imagery collected in shallow tropical coastal areas using various low-cost platforms operated by either citizens or researchers. This dataset is regularly updated and contains more than 1.6 million images from the Southwest Indian Ocean. Most images are geolocated, and some are annotated with 51 distinct classes (e.g. fauna and habitats) to train AI models. The quality of these photos, taken by action cameras along the trajectories of different platforms, is highly heterogeneous (due to varying speed, depth, turbidity, and perspective) and reflects well the challenges of underwater image recognition. Data discovery and access rely on DOI assignment, while data interoperability and reuse are ensured by complying with widely used community standards. The open-source data workflow is provided to ease contributions from anyone collecting pictures.