Global distribution and diversity of Chaetoceros (Bacillariophyta, Mediophyceae): integration of classical and novel strategies

PeerJ. 2019 Aug 19;7:e7410. doi: 10.7717/peerj.7410. eCollection 2019.

Abstract

Information on taxa distribution is a prerequisite for many research fields, and biological records are a major source of data contributing to biogeographic studies. The Global Biodiversity Information Facility (GBIF) and the Ocean Biogeographic Information System (OBIS) are important infrastructures facilitating free and open access to classical biological data from several sources across both temporal and spatial scales. Over the last ten years, high throughput sequencing (HTS) metabarcoding data have become available, which constitute a great source of detailed occurrence data. Among the global sampling projects that have contributed to such data are Tara Oceans and the Ocean Sampling Day (OSD). Integration of classical and metabarcoding data may aid a more comprehensive assessment of the geographic range of species, especially of microscopic ones such as protists. Rare, small and cryptic species are often ignored in surveys or mis-assigned with the classical approaches. Here we show how integration of data from various sources can contribute to insight into biogeography and diversity at the genus and species level, using Chaetoceros, one of the most diverse and abundant genera among marine planktonic diatoms, as a study system. Chaetoceros records were extracted from GBIF and OBIS, and literature data were collected by means of a Google Scholar search. Chaetoceros reference barcodes were mapped against the metabarcode datasets of Tara Oceans (210 sites) and OSD (144 sites). We compared the resolution of different data sources in determining the global distribution of the genus and provided examples, at the species level, of detection of cryptic species, endemism and cosmopolitan or restricted distributions. At the genus level, our results showed a comparable picture from the different sources, but a more complete assessment when data were integrated. Both the importance of integration and the challenges related to it were illustrated.
Chaetoceros data collected in this study are organised and available in the form of tables and maps, providing a powerful tool and a baseline for further research in e.g., ecology, conservation and evolutionary biology.
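
As an illustration of what integrating occurrence data from classical and metabarcoding sources can look like in practice, the following minimal Python sketch bins records into latitude/longitude grid cells and counts the cells covered by each type of source. It is not the authors' pipeline: the 5-degree cell size, the function names (`grid_cell`, `integrate`, `coverage_summary`) and the example coordinates are all hypothetical and chosen only for demonstration.

```python
import math
from collections import defaultdict

# Assumed grouping of sources into the two data types named in the abstract.
CLASSICAL = {"GBIF", "OBIS", "literature"}
METABARCODING = {"Tara Oceans", "OSD"}


def grid_cell(lat, lon, size=5.0):
    """Return the south-west corner of the size-degree cell containing a point."""
    return (math.floor(lat / size) * size, math.floor(lon / size) * size)


def integrate(records, size=5.0):
    """Map each grid cell to the set of sources reporting an occurrence in it."""
    cells = defaultdict(set)
    for source, lat, lon in records:
        cells[grid_cell(lat, lon, size)].add(source)
    return dict(cells)


def coverage_summary(cells):
    """Count cells seen only by classical data, only by metabarcoding, or by both."""
    summary = {"classical_only": 0, "metabarcoding_only": 0, "both": 0}
    for sources in cells.values():
        has_classical = bool(sources & CLASSICAL)
        has_meta = bool(sources & METABARCODING)
        if has_classical and has_meta:
            summary["both"] += 1
        elif has_classical:
            summary["classical_only"] += 1
        elif has_meta:
            summary["metabarcoding_only"] += 1
    return summary


# Hypothetical occurrence records: (source, latitude, longitude).
records = [
    ("GBIF", 40.8, 14.2),
    ("OBIS", 41.5, 13.0),
    ("OSD", 40.8, 14.25),
    ("Tara Oceans", -33.9, 18.4),
    ("literature", 62.0, -7.0),
]

cells = integrate(records)
summary = coverage_summary(cells)
print(summary)
```

Cells that appear under only one data type are where integration adds information that a single source would miss. A real analysis would also need to reconcile taxonomy between morphological names and barcode assignments, which this sketch leaves out.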

Keywords: 18S rDNA; Biodiversity; Biogeography; Biological records; Chaetoceros; Global distribution; Marine diatoms; Metabarcoding; OSD; TARA.

Grants and funding

Daniele De Luca and Chetan C. Gaonkar were supported by a PhD fellowship from the Stazione Zoologica Anton Dohrn (http://www.szn.it) via the Open University (www.open.ac.uk). Roberta Piredda was supported by the project FIRB Biodiversitalia (RBAP10A2T4) funded by the Italian Ministry of Education, University and Research. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.