Background: We propose a new, continuous model of the fractionation process (duplicate gene deletion after polyploidization) on the real line. The aim is to infer how much DNA is deleted at a time, based on segment lengths for alternating deleted (invisible) and undeleted (visible) regions.
Results: After deriving a number of analytical results for "one-sided" fractionation, we undertake a series of simulations that help us identify the distribution of segment lengths as a gamma distribution whose shape and rate parameters evolve over time. This leads to an inference procedure based on the observed length distributions of visible and invisible segments.
Conclusions: We suggest extensions of this mathematical and simulation work to biologically realistic discrete models, including two-sided fractionation.
Keywords: Analysis of runs; Duplicate gene deletion; Genomics; Probability modeling; Whole genome duplication.
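Illustration: The gamma description of segment lengths mentioned in the Results can be made concrete with a minimal sketch. It is not the inference procedure of the paper, whose details are not given here; it only shows one standard way, the method of moments, to recover a gamma shape and rate from a list of segment lengths. The function name fit_gamma_moments, the synthetic sample, and the parameter values 2.0 and 0.5 are assumptions chosen for illustration.

```python
import random
import statistics


def fit_gamma_moments(lengths):
    """Method-of-moments estimates (shape, rate) for a gamma distribution.

    For a gamma with shape k and rate r: mean = k / r, variance = k / r**2,
    so k = mean**2 / variance and r = mean / variance.
    """
    mean = statistics.fmean(lengths)
    var = statistics.variance(lengths)  # sample variance, needs >= 2 values
    if var <= 0:
        raise ValueError("segment lengths must not all be identical")
    return mean * mean / var, mean / var


# Synthetic "segment lengths" (an assumption, not data from the paper).
# random.gammavariate takes shape and SCALE, so scale = 1 / rate.
rng = random.Random(0)
true_shape, true_rate = 2.0, 0.5
sample = [rng.gammavariate(true_shape, 1.0 / true_rate) for _ in range(20000)]

shape_hat, rate_hat = fit_gamma_moments(sample)
print(f"shape ~ {shape_hat:.3f}, rate ~ {rate_hat:.3f}")
```

Applying such a fit to visible-segment lengths at successive time points would give one rough way to track how the shape and rate evolve; a maximum-likelihood fit would be a natural alternative.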