Haplotype assembly of autotetraploid potato using integer linear programing

Bioinformatics. 2019 Sep 15;35(18):3279-3286. doi: 10.1093/bioinformatics/btz060.

Abstract

Summary: Haplotype assembly of polyploids is an open problem in plant genomics. Recent experimental studies on highly heterozygous autotetraploid potato have shown that available methods do not deliver satisfactory results in practice. We propose an optimal method to assemble haplotypes of highly heterozygous polyploids from short Illumina sequencing reads. Our method is based on a generalization of the existing minimum fragment removal model to the polyploid case and on new integer linear programs to reconstruct optimal haplotypes. We validate our methods by a combined evaluation on simulated and experimental data based on 83 previously sequenced autotetraploid potato cultivars. Results on simulated data show that our methods produce highly accurate haplotype assemblies, while results on experimental data confirm an appreciable improvement over the state of the art.
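
To make the minimum fragment removal idea concrete, the sketch below shows one plausible reading of its polyploid generalization: discard as few read fragments as possible so that the remaining ones can be split into at most `ploidy` groups with no internal allele conflicts. This is an illustration only, not the integer linear programs described in the abstract. It uses exhaustive search, which is feasible only for a handful of fragments, and the conflict definition, the function names and the toy fragments are all assumptions made for this example.

```python
from itertools import combinations


def conflicts(f, g):
    """Assumed conflict rule: two fragments disagree at a SNP both cover."""
    return any(f[pos] != g[pos] for pos in f.keys() & g.keys())


def partition(fragments, ploidy):
    """Backtracking search for at most `ploidy` conflict-free groups.

    Returns a list of groups (lists of indices into `fragments`),
    or None if no such partition exists.
    """
    groups = [[] for _ in range(ploidy)]

    def place(i):
        if i == len(fragments):
            return True
        for group in groups:
            if all(not conflicts(fragments[i], fragments[j]) for j in group):
                group.append(i)
                if place(i + 1):
                    return True
                group.pop()
            if not group:
                # All remaining empty groups are equivalent: stop here.
                break
        return False

    return groups if place(0) else None


def min_fragment_removal(fragments, ploidy):
    """Brute force: try removal sets of increasing size until one works."""
    n = len(fragments)
    for r in range(n + 1):
        for removed in combinations(range(n), r):
            kept = [i for i in range(n) if i not in removed]
            groups = partition([fragments[i] for i in kept], ploidy)
            if groups is not None:
                # Map group members back to original fragment indices.
                return set(removed), [[kept[j] for j in g] for g in groups]


# Hypothetical toy fragments: SNP position -> observed allele (0 or 1).
toy_fragments = [
    {0: 0, 1: 0},
    {0: 1, 1: 1},
    {1: 0, 2: 1},
    {0: 0, 1: 1},
]

removed_2, groups_2 = min_fragment_removal(toy_fragments, ploidy=2)
removed_3, groups_3 = min_fragment_removal(toy_fragments, ploidy=3)
print(len(removed_2), len(removed_3))
```

In this toy instance a higher ploidy tolerates more mutually conflicting fragments, so fewer of them need to be discarded. An exact method such as an integer linear program encodes the same objective with binary variables and passes it to a solver rather than enumerating subsets.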

Availability and implementation: Executables for Linux at http://github.com/Computational Genomics/HaplotypeAssembler.

Supplementary information: Supplementary data are available at Bioinformatics online.

MeSH terms

  • Algorithms
  • Haplotypes
  • Programming, Linear
  • Sequence Analysis, DNA
  • Software
  • Solanum tuberosum*