Background: Copy number variation (CNV), a complex genomic rearrangement, has been extensively studied in humans and other organisms. In plants, CNVs of several genes have been found to be responsible for various important traits; however, the causes and consequences of CNVs remain largely unknown. Recently released next-generation sequencing (NGS) data provide an opportunity for a genome-wide study of CNVs in rice.
Results: Here, using an NGS-based approach, we generated a CNV map comprising 9,196 deletions relative to the reference genome 'Nipponbare'. When Oryza glaberrima was used as the outgroup, 80% of the CNV events turned out to be insertions in Nipponbare. These CNV events affected 2,806 annotated genes. We experimentally validated 28 functional CNV genes, including OsMADS56, BPH14, OsDCL2b and OsMADS30, implying that CNVs might have contributed to phenotypic variation in rice. Most CNV genes were found to be located in non-co-linear positions in comparison with O. glaberrima. One origin of these non-co-linear genes was genomic duplication caused by transposon activity or double-strand break repair. Comprehensive analysis of mutation mechanisms suggested that many CNVs were formed by non-homologous end-joining and mobile element insertion.
Conclusions: This study showed the impact and origin of copy number variations in rice on a genomic scale.
Keywords: CNV genes; Copy number variation (CNV); Mutation mechanisms; NGS-based survey; Oryza species.