Automatic Differentiation is no Panacea for Phylogenetic Gradient Computation

Mathieu Fourment; Christiaan J Swanepoel; Jared G Galloway; Xiang Ji; Karthik Gangavarapu; Marc A Suchard; Frederick A Matsen IV

doi:10.1093/gbe/evad099

Genome Biol Evol. 2023 Jun 1;15(6):evad099. doi: 10.1093/gbe/evad099.

Authors

Mathieu Fourment¹, Christiaan J Swanepoel^{2,3}, Jared G Galloway⁴, Xiang Ji⁵, Karthik Gangavarapu⁶, Marc A Suchard^{6,7,8}, Frederick A Matsen IV^{4,9,10,11}

Affiliations

¹ Australian Institute for Microbiology and Infection, University of Technology Sydney, Ultimo, NSW, Australia.
² Centre for Computational Evolution, The University of Auckland, Auckland, New Zealand.
³ School of Computer Science, The University of Auckland, Auckland, New Zealand.
⁴ Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA.
⁵ Department of Mathematics, Tulane University, New Orleans, Louisiana, USA.
⁶ Department of Human Genetics, University of California, Los Angeles, California, USA.
⁷ Department of Computational Medicine, University of California, Los Angeles, California, USA.
⁸ Department of Biostatistics, University of California, Los Angeles, California, USA.
⁹ Department of Statistics, University of Washington, Seattle, Washington, USA.
¹⁰ Department of Genome Sciences, University of Washington, Seattle, Washington, USA.
¹¹ Howard Hughes Medical Institute, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA.

Abstract

Gradients of probabilistic model likelihoods with respect to their parameters are essential for modern computational statistics and machine learning. These calculations are readily available for arbitrary models via "automatic differentiation" implemented in general-purpose machine-learning libraries such as TensorFlow and PyTorch. Although these libraries are highly optimized, it is not clear if their general-purpose nature will limit their algorithmic complexity or implementation speed for the phylogenetic case compared to phylogenetics-specific code. In this paper, we compare six gradient implementations of the phylogenetic likelihood functions, in isolation and also as part of a variational inference procedure. We find that although automatic differentiation can scale approximately linearly in tree size, it is much slower than the carefully implemented gradient calculation for tree likelihood and ratio transformation operations. We conclude that a mixed approach combining phylogenetic libraries with machine learning libraries will provide the optimal combination of speed and model flexibility moving forward.

Keywords: Bayesian inference; gradient; phylogenetics; variational inference.
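
Illustrative sketch (not from the paper): the abstract contrasts gradients obtained by automatic differentiation with gradients from hand-written, phylogenetics-specific code. The toy below shows that contrast in the simplest setting we could think of, assuming a two-sequence alignment under the Jukes-Cantor (JC69) model with a single branch length t. The `Dual` class, the function names, and the choice of forward-mode differentiation are all assumptions made for illustration; TensorFlow and PyTorch rely mainly on reverse-mode differentiation, and none of the six implementations benchmarked in the paper is reproduced here.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """Hypothetical dual number: a value and its derivative (forward-mode AD)."""
    val: float
    dot: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.val + other.val, self.dot + other.dot)

    __radd__ = __add__

    def __sub__(self, other):
        other = _lift(other)
        return Dual(self.val - other.val, self.dot - other.dot)

    def __rsub__(self, other):
        return _lift(other) - self

    def __mul__(self, other):
        other = _lift(other)
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)

    __rmul__ = __mul__

    def __truediv__(self, other):
        other = _lift(other)
        return Dual(self.val / other.val,
                    (self.dot * other.val - self.val * other.dot) / other.val ** 2)


def _lift(x):
    """Promote a plain number to a Dual with zero derivative."""
    return x if isinstance(x, Dual) else Dual(float(x))


def exp(x):
    x = _lift(x)
    e = math.exp(x.val)
    return Dual(e, e * x.dot)


def log(x):
    x = _lift(x)
    return Dual(math.log(x.val), x.dot / x.val)


def jc69_loglik(t, n_sites, n_diff):
    """JC69 log-likelihood of two aligned sequences separated by branch length t."""
    e = exp(-4.0 * t / 3.0)
    p_same = 0.25 + 0.75 * e   # probability a site is unchanged
    p_diff = 0.25 - 0.25 * e   # probability of one specific substitution
    return (n_sites - n_diff) * log(0.25 * p_same) + n_diff * log(0.25 * p_diff)


def jc69_grad_ad(t, n_sites, n_diff):
    """Derivative with respect to t, obtained by seeding the dual part with 1."""
    return jc69_loglik(Dual(t, 1.0), n_sites, n_diff).dot


def jc69_grad_analytic(t, n_sites, n_diff):
    """The same derivative, written out by hand."""
    e = math.exp(-4.0 * t / 3.0)
    p_same = 0.25 + 0.75 * e
    p_diff = 0.25 - 0.25 * e
    return (n_sites - n_diff) * (-e) / p_same + n_diff * (e / 3.0) / p_diff
```

Both functions should agree to floating-point precision for any t > 0, for instance `jc69_grad_ad(0.1, 100, 10)` against `jc69_grad_analytic(0.1, 100, 10)`. The hand-written version does less arithmetic per evaluation because it skips the bookkeeping of the dual numbers, which hints, on a very small scale, at why specialised gradient code can be faster than a general-purpose tool.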

Publication types

Research Support, Non-U.S. Gov't
Research Support, N.I.H., Extramural

MeSH terms

Algorithms
Likelihood Functions
Machine Learning*
Models, Statistical*
Phylogeny
