Purpose: Interest in developing methods appropriate for mapping increasing amounts of genome-wide molecular data are increasing rapidly. There is also an increasing need for methods that are able to efficiently simulate such data.
Patients and methods: In this article, we provide a graph-theory approach to find the necessary and sufficient conditions for the existence of a phylogeny matrix with k nonidentical haplotypes, n single nucleotide polymorphisms (SNPs), and a population size of m for which the minimum allele frequency of each SNP is between two specific numbers a and b.
Results: We introduce an O(max(n(2), nm)) algorithm for the random construction of such a phylogeny matrix. The running time of any algorithm for solving this problem would be Ω (nm).
Conclusion: We have developed software, RAPPER, based on this algorithm, which is available at http://bioinf.cs.ipm.ir/softwares/RAPPER.
Keywords: minimum allele frequency (MAF); perfect phylogeny; recursive algorithm; tree.