Structural variation (SV)-based pan-genome and GWAS reveal the impacts of SVs on the speciation and diversification of allotetraploid cottons

Mol Plant. 2023 Apr 3;16(4):678-693. doi: 10.1016/j.molp.2023.02.004. Epub 2023 Feb 9.

Abstract

Structural variations (SVs) have long been described as being involved in the origin, adaptation, and domestication of species. However, the underlying genetic and genomic mechanisms are poorly understood. Here, we report a high-quality genome assembly of Gossypium barbadense acc. Tanguis, a landrace that is closely related to the formation of extra-long-staple (ELS) cultivated cotton. An SV-based pan-genome (Pan-SV) was then constructed using a total of 182 593 non-redundant SVs, including 2236 inversions, 97 398 insertions, and 82 959 deletions, from 11 assembled genomes of allopolyploid cotton. The utility of this Pan-SV was then demonstrated through population structure analysis and genome-wide association studies (GWASs). Using segregating mapping populations produced by crossing ELS cotton with the landrace, along with an SV-based GWAS, certain SVs responsible for speciation, domestication, and improvement in tetraploid cottons were identified. Importantly, some of the SVs identified here as associated with improved yield and fiber quality had not been identified in previous SNP-based GWASs. In particular, a 9-bp insertion or deletion was found to be associated with the elimination of interspecific reproductive isolation between Gossypium hirsutum and G. barbadense. Collectively, this study provides new insights into genome-wide, gene-scale SVs linked to important agronomic traits in a major crop species and highlights the importance of SVs during the speciation, domestication, and improvement of cultivated crop species.
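
The SV counts quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of the authors' pipeline: the names `sv_counts`, `reported_total`, and `summarize` are hypothetical, and the numbers are copied directly from the abstract text.

```python
# Illustrative check of the SV tallies quoted in the abstract.
# All names here are hypothetical; only the numbers come from the abstract.

# Counts of non-redundant SVs by type, as reported in the abstract.
sv_counts = {
    "inversion": 2236,
    "insertion": 97398,
    "deletion": 82959,
}

# Total number of non-redundant SVs reported in the abstract.
reported_total = 182593


def summarize(counts):
    """Return the summed total and each SV type's share of that total."""
    total = sum(counts.values())
    shares = {sv_type: n / total for sv_type, n in counts.items()}
    return total, shares


total, shares = summarize(sv_counts)

# Confirm that the per-type counts add up to the reported total.
print("sum of per-type counts:", total)
print("matches reported total:", total == reported_total)

# Show the relative contribution of each SV type.
for sv_type, share in shares.items():
    print(f"{sv_type}: {share:.1%}")
```

The per-type counts sum to the reported total, and insertions and deletions together make up nearly all of the Pan-SV set, with inversions contributing only a small fraction.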

Keywords: GWAS; QTL mapping; SV-based pan-genome; genome assembly; introgression; structural variations.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Genome, Plant / genetics
  • Genome-Wide Association Study*
  • Gossypium* / genetics
  • Phenotype
  • Tetraploidy