Mutation identification by direct comparison of whole-genome sequencing data from mutant and wild-type individuals using k-mers

Nat Biotechnol. 2013 Apr;31(4):325-30. doi: 10.1038/nbt.2515. Epub 2013 Mar 10.

Abstract

Genes underlying mutant phenotypes can be isolated by combining marker discovery, genetic mapping and resequencing, but a more straightforward strategy for mapping mutations would be the direct comparison of mutant and wild-type genomes. Applying such an approach, however, is hampered by the need for reference sequences and by mutational loads that confound the unambiguous identification of causal mutations. Here we introduce NIKS (needle in the k-stack), a reference-free algorithm based on comparing k-mers in whole-genome sequencing data for precise discovery of homozygous mutations. We applied NIKS to eight mutants induced in nonreference rice cultivars and to two mutants of the nonmodel species Arabis alpina. In both species, comparing pooled F2 individuals selected for mutant phenotypes revealed small sets of mutations including the causal changes. Moreover, comparing M3 seedlings of two allelic mutants unambiguously identified the causal gene. Thus, for any species amenable to mutagenesis, NIKS enables forward genetics without requiring segregating populations, genetic maps and reference sequences.
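The core idea of comparing k-mers between a mutant and a wild-type read set can be illustrated with a minimal sketch. This is not the NIKS implementation, which the abstract does not describe in detail. The function names, the use of canonical (strand-independent) k-mers, the `min_count` threshold for discarding likely sequencing errors, and the toy sequences are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical helpers for illustration only; not taken from NIKS.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def count_kmers(reads, k):
    """Count canonical k-mers over all reads, skipping k-mers with non-ACGT bases."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if set(kmer) <= set("ACGT"):
                counts[canonical(kmer)] += 1
    return counts


def sample_specific_kmers(sample_reads, other_reads, k, min_count=2):
    """K-mers seen at least min_count times in one sample and never in the other.

    min_count is an assumed, simplistic filter against sequencing errors.
    """
    sample_counts = count_kmers(sample_reads, k)
    other_counts = count_kmers(other_reads, k)
    return {
        kmer
        for kmer, count in sample_counts.items()
        if count >= min_count and kmer not in other_counts
    }


# Toy data: the mutant differs from the wild type by one G-to-A change.
wild_type = "ACGTTGCAAGGCTTAC"
mutant = "ACGTTGCAAAGCTTAC"

mutant_only = sample_specific_kmers([mutant, mutant], [wild_type, wild_type], k=5)
wild_type_only = sample_specific_kmers([wild_type, wild_type], [mutant, mutant], k=5)
print(sorted(mutant_only))
print(sorted(wild_type_only))
```

In this toy case every sample-specific k-mer overlaps the single changed base, so the two sets together pinpoint the mutation without any reference sequence. Real data would need a much larger k, error-aware count thresholds, and a step that assembles neighboring sample-specific k-mers into longer sequences around each candidate mutation.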

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Alleles
  • Arabidopsis / metabolism
  • Arabis / genetics*
  • Base Pairing / genetics
  • Base Sequence
  • Chromosome Mapping
  • Crosses, Genetic
  • Ethyl Methanesulfonate
  • Flowers / genetics
  • Genes, Plant / genetics
  • Genome, Plant / genetics*
  • Molecular Sequence Data
  • Mutation / genetics*
  • Oryza / genetics*
  • Plant Proteins / genetics
  • Plant Proteins / metabolism
  • Reference Standards
  • Sequence Analysis, DNA / methods*
  • Sequence Deletion

Substances

  • Plant Proteins
  • Ethyl Methanesulfonate