A continuous analog of run length distributions reflecting accumulated fractionation events

BMC Bioinformatics. 2016 Nov 11;17(Suppl 14):412. doi: 10.1186/s12859-016-1265-5.

Abstract

Background: We propose a new, continuous model of the fractionation process (duplicate gene deletion after polyploidization) on the real line. The aim is to infer how much DNA is deleted in a single deletion event, based on the lengths of alternating deleted (invisible) and undeleted (visible) segments.

Results: After deriving a number of analytical results for "one-sided" fractionation, we undertake a series of simulations that help us identify the distribution of segment lengths as a gamma distribution whose shape and rate parameters evolve over time. This leads to an inference procedure based on the observed length distributions of visible and invisible segments.
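
The kind of simulation described in the Results can be illustrated with a minimal sketch. It is not the authors' model, whose details the abstract does not give. The sketch assumes that deletion events start at uniformly random points on a finite interval, that each event removes an exponentially distributed length extending to the right, and that overlapping deletions merge into one invisible segment. The function names (simulate_fractionation, segment_lengths, gamma_moments) and all parameter values are hypothetical, and the gamma parameters are estimated by the method of moments rather than by whatever procedure the paper uses.

```python
import random
from statistics import mean, variance


def simulate_fractionation(total_length, n_events, mean_deletion, rng):
    """Place n_events deletions on [0, total_length] and merge overlaps.

    Assumption (not from the paper): uniform start points and
    exponentially distributed deletion lengths.
    """
    intervals = []
    for _ in range(n_events):
        start = rng.uniform(0.0, total_length)
        length = rng.expovariate(1.0 / mean_deletion)
        intervals.append((start, min(start + length, total_length)))
    intervals.sort()

    merged = []
    for start, end in intervals:
        if merged and start <= merged[-1][1]:
            # Overlaps the previous deleted region: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def segment_lengths(merged, total_length):
    """Return (visible, invisible) segment lengths along the interval."""
    invisible = [end - start for start, end in merged]
    visible = []
    cursor = 0.0
    for start, end in merged:
        if start > cursor:
            visible.append(start - cursor)
        cursor = end
    if cursor < total_length:
        visible.append(total_length - cursor)
    return visible, invisible


def gamma_moments(lengths):
    """Method-of-moments gamma estimates: returns (shape, rate)."""
    m = mean(lengths)
    v = variance(lengths)
    return m * m / v, m / v


if __name__ == "__main__":
    rng = random.Random(0)
    merged = simulate_fractionation(1000.0, 200, 1.0, rng)
    visible, invisible = segment_lengths(merged, 1000.0)
    print("visible   (shape, rate):", gamma_moments(visible))
    print("invisible (shape, rate):", gamma_moments(invisible))
```

Re-running the simulation with increasing n_events gives a rough picture of how the fitted shape and rate change as deletion events accumulate, which is the kind of time evolution the Results refer to.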

Conclusions: We suggest extensions of this mathematical and simulation work to biologically realistic discrete models, including two-sided fractionation.

Keywords: Analysis of runs; Duplicate gene deletion; Genomics; Probability modeling; Whole genome duplication.

MeSH terms

  • Gene Deletion
  • Gene Duplication
  • Genomics
  • Models, Theoretical*