Retrieval of long DNA reads from herbarium specimens

AoB Plants. 2023 Nov 8;15(6):plad074. doi: 10.1093/aobpla/plad074. eCollection 2023 Dec.

Abstract

High-throughput sequencing of herbarium specimens' DNA with short-read platforms has helped explore many biological questions. Here, for the first time, we investigate the potential of using herbarium specimens as a resource for long-read DNA sequencing technologies. We use target capture of 48 low-copy nuclear loci in 12 herbarium specimens of Silene as a basis for long-read sequencing using SMRT PacBio Sequel. The samples were collected between 1932 and 2019. A simple optimization of size selection protocol enabled the retrieval of both long DNA fragments (>1 kb) and long on-target reads for nine of them. The limited sampling size does not enable statistical evaluation of the influence of specimen age to the DNA fragmentation, but our results confirm that younger samples, that is, collected after 1990, are less fragmented and have better sequencing success than specimens collected before this date. Specimens collected between 1990 and 2019 yield between 167 and 3403 on-target reads > 1 kb. They enabled recovering between 34 loci and 48 (i.e. all loci recovered). Three samples from specimens collected before 1990 did not yield on-target reads > 1 kb. The four other samples collected before this date yielded up to 144 reads and recovered up to 25 loci. Young herbarium specimens seem promising for long-read sequencing. However, older ones have partly failed. Further exploration would be necessary to statistically test and understand the potential of older material in the quest for long reads. We would encourage greatly expanding the sampling size and comparing different taxonomic groups.

Keywords: Cross-contaminants; Silene; degraded DNA; herbarium specimens; long-read sequencing; target capture.

Publication types

  • Review