Genome-wide analysis of NBS-LRR-encoding genes in Arabidopsis

Plant Cell. 2003 Apr;15(4):809-34. doi: 10.1105/tpc.009308.

Abstract

The Arabidopsis genome contains approximately 200 genes that encode proteins with similarity to the nucleotide binding site and other domains characteristic of plant resistance proteins. Through a reiterative process of sequence analysis and reannotation, we identified 149 NBS-LRR-encoding genes in the Arabidopsis (ecotype Columbia) genomic sequence. Fifty-six of these genes were corrected from earlier annotations. At least 12 are predicted to be pseudogenes. As described previously, two distinct groups of sequences were identified: those that encoded an N-terminal domain with Toll/Interleukin-1 Receptor homology (TIR-NBS-LRR, or TNL), and those that encoded an N-terminal coiled-coil motif (CC-NBS-LRR, or CNL). The encoded proteins are distinct from the 58 predicted adapter proteins in the previously described TIR-X, TIR-NBS, and CC-NBS groups. Classification based on protein domains, intron positions, sequence conservation, and genome distribution defined four subgroups of CNL proteins, eight subgroups of TNL proteins, and a pair of divergent NL proteins that lack a defined N-terminal motif. CNL proteins generally were encoded in single exons, although two subclasses were identified that contained introns in unique positions. TNL proteins were encoded in modular exons, with conserved intron positions separating distinct protein domains. Conserved motifs were identified in the LRRs of both CNL and TNL proteins. In contrast to CNL proteins, TNL proteins contained large and variable C-terminal domains. The extant distribution and diversity of the NBS-LRR sequences have been generated by extensive duplication and ectopic rearrangements that involved segmental duplications as well as microscale events. The observed diversity of these NBS-LRR proteins indicates the variety of recognition molecules available in an individual genotype to detect diverse biotic challenges.
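The top-level grouping named in the abstract (TNL, CNL, NL, and the TIR-X, TIR-NBS, and CC-NBS adapter groups) can be illustrated with a minimal sketch that assigns a group from domain composition alone. This is not the authors' procedure, which also drew on intron positions, sequence conservation, and genome distribution. The function name `classify_by_domains`, the domain labels ("TIR", "CC", "NBS", "LRR"), the "unclassified" fallback, and the choice to let TIR take precedence when both TIR and CC are present are all assumptions made for illustration.

```python
def classify_by_domains(domains):
    """Assign a group name from a collection of domain labels.

    Hypothetical helper: labels and precedence rules are assumptions,
    not taken from the paper.
    """
    d = set(domains)
    has_nbs = "NBS" in d
    has_lrr = "LRR" in d

    if has_nbs and has_lrr:
        # Full NBS-LRR proteins, split by N-terminal domain.
        if "TIR" in d:
            return "TNL"
        if "CC" in d:
            return "CNL"
        return "NL"  # no defined N-terminal motif

    if has_nbs:
        # NBS present but no LRR: adapter-like groups.
        if "TIR" in d:
            return "TIR-NBS"
        if "CC" in d:
            return "CC-NBS"
        return "unclassified"

    if "TIR" in d:
        # TIR with neither NBS nor a complete NBS-LRR architecture.
        return "TIR-X"

    return "unclassified"


if __name__ == "__main__":
    examples = [
        ["TIR", "NBS", "LRR"],
        ["CC", "NBS", "LRR"],
        ["NBS", "LRR"],
        ["TIR", "NBS"],
        ["TIR"],
    ]
    for doms in examples:
        print("-".join(doms), "->", classify_by_domains(doms))
```

A rule table like this covers only the coarse groups; the four CNL and eight TNL subgroups described above would need the additional evidence the abstract lists.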

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Amino Acid Sequence
  • Arabidopsis / genetics*
  • Arabidopsis Proteins / genetics*
  • Conserved Sequence / genetics
  • Evolution, Molecular
  • GTP-Binding Proteins / genetics
  • Gene Expression Profiling / methods
  • Gene Expression Regulation, Plant
  • Genome, Plant*
  • Immunity, Innate / genetics
  • Internet
  • Leucine-Rich Repeat Proteins
  • Molecular Sequence Data
  • Multigene Family / genetics
  • Mutation
  • Phylogeny
  • Physical Chromosome Mapping
  • Proteins / genetics*
  • Pseudogenes / genetics
  • Repetitive Sequences, Amino Acid
  • Sequence Analysis, DNA
  • Sequence Homology, Amino Acid

Substances

  • Arabidopsis Proteins
  • Leucine-Rich Repeat Proteins
  • Proteins
  • GTP-Binding Proteins