The Orchidaceae comprise over 25,000 species, yet the drivers of this diversity remain to be fully understood. Here, we outline a multitiered sequence capture strategy aimed at capturing hundreds of loci to enable phylogenetic resolution from subtribe to subspecific levels in orchids of the tribe Diurideae. For the probe design, we mined subsets of 18 transcriptomes to give five target sequence sets aimed at the tribe (Sets 1 & 2), subtribe (Set 3), and within-subtribe levels (Sets 4 & 5). Analysis included alternative de novo and reference-guided assembly, followed by target sequence extraction, annotation and alignment, as well as application of a homology-aware k-mer block phylogenomic approach, prior to maximum likelihood and coalescence-based phylogenetic inference. Our evaluation considered 87 taxa in two test data sets: 67 samples spanning the tribe, and 72 samples involving 24 closely related Caladenia species. The tiered design achieved high target loci recovery (>89%), with median numbers of recovered loci in Sets 1-5 of 212, 219, 816, 1024, and 1009, respectively. As a first test of the homologous k-mer approach for targeted sequence capture data, our study revealed its potential for enabling robust phylogenetic species tree inference. Specifically, we found matching, and in one case improved, phylogenetic resolution within species complexes compared to conventional phylogenetic analysis involving target gene extraction. Our findings indicate that a customized multitiered sequence capture strategy, in combination with promising yet underutilized phylogenomic approaches, will be effective for groups where interspecific divergence is recent but information on deeper phylogenetic relationships is also required.
Keywords: Orchidaceae; bioinformatics/phyloinformatics; phylogenetic theory and methods; phylogeography; transcriptomics.
© 2021 John Wiley & Sons Ltd.