Search | arXiv e-print repository

arXiv:1803.11240 [pdf, other]

Computationally efficient likelihood inference in exponential families when the maximum likelihood estimator does not exist

Authors: Daniel J. Eck, Charles J. Geyer

Abstract: In a regular full exponential family, the maximum likelihood estimator (MLE) need not exist in the traditional sense. However, the MLE may exist in the completion of the exponential family. Existing algorithms for finding the MLE in the completion solve many linear programs; they are slow in small problems and too slow for large problems. We provide new, fast, and scalable methodology for finding the MLE in the completion of the exponential family. This methodology is based on conventional maximum likelihood computations which come close, in a sense, to finding the MLE in the completion of the exponential family. These conventional computations construct a likelihood maximizing sequence of canonical parameter values which goes uphill on the likelihood function until it meets a convergence criterion. Nonexistence of the MLE in this context results from a degeneracy of the canonical statistic of the exponential family: the canonical statistic is on the boundary of its support. There is a correspondence between this boundary and the null eigenvectors of the Fisher information matrix. Convergence of Fisher information along a likelihood maximizing sequence follows from cumulant generating function (CGF) convergence along a likelihood maximizing sequence, conditions for which are given. This allows for the construction of necessarily one-sided confidence intervals for mean value parameters when the MLE exists in the completion. We demonstrate our methodology on three examples in the main text and three additional examples in the Appendix. We show that when the MLE exists in the completion of the exponential family, our methodology provides statistical inference that is much faster than existing techniques.

Submitted 25 November, 2020; v1 submitted 29 March, 2018; originally announced March 2018.

arXiv:1705.03594 [pdf, ps, other]

Automatic Response Category Combination in Multinomial Logistic Regression

Authors: Bradley S. Price, Charles J. Geyer, Adam J. Rothman

Abstract: We propose a penalized likelihood method that simultaneously fits the multinomial logistic regression model and combines subsets of the response categories. The penalty is nondifferentiable when pairs of columns in the optimization variable are equal. This encourages pairwise equality of these columns in the estimator, which corresponds to response category combination. We use an alternating direction method of multipliers algorithm to compute the estimator and we discuss the algorithm's convergence. Prediction and model selection are also addressed.

Submitted 9 May, 2017; originally announced May 2017.

Comments: 19 Pages, 10 Tables

arXiv:1701.07910 [pdf, ps, other]

Combining Envelope Methodology and Aster Models for Variance Reduction in Life History Analyses

Authors: Daniel J. Eck, Charles J. Geyer, R. Dennis Cook

Abstract: Precise estimation of expected Darwinian fitness, the expected lifetime number of offspring of an organism, is a central component of life history analysis. The aster model serves as a defensible statistical model for distributions of Darwinian fitness. The aster model is equipped to incorporate the major life stages an organism travels through, which separately may affect Darwinian fitness. Envelope methodology reduces asymptotic variability by establishing a link between unknown parameters of interest and the asymptotic covariance matrices of their estimators. It is known both theoretically and in applications that incorporation of envelope methodology reduces asymptotic variability. We develop an envelope framework, including a new envelope estimator, that is appropriate for aster analyses. The level of precision provided by our methods allows researchers to draw stronger conclusions about the driving forces of Darwinian fitness from their life history analyses than they could with the aster model alone. Our methods are illustrated on a simulated dataset, and a life history analysis of Mimulus guttatus flowers is provided. Useful variance reduction is obtained in both analyses.

Submitted 27 February, 2018; v1 submitted 26 January, 2017; originally announced January 2017.

Comments: Title changed from "An Application of Envelope Methodology and Aster Models" to "Combining Envelope Methodology and Aster Models for Variance Reduction in Life History Analyses"

arXiv:1311.7482 [pdf, ps, other]

doi 10.1214/13-AOAS653

Local adaptation and genetic effects on fitness: Calculations for exponential family models with random effects

Authors: Charles J. Geyer, Caroline E. Ridley, Robert G. Latta, Julie R. Etterson, Ruth G. Shaw

Abstract: Random effects are implemented for aster models using two approximations taken from Breslow and Clayton [J. Amer. Statist. Assoc. 88 (1993) 9-25]. Random effects are analytically integrated out of the Laplace approximation to the complete data log likelihood, giving a closed-form expression for an approximate missing data log likelihood. Third and higher derivatives of the complete data log likelihood with respect to the random effects are ignored, giving a closed-form expression for second derivatives of the approximate missing data log likelihood, hence approximate observed Fisher information. This method is applicable to any exponential family random effects model. It is implemented in the CRAN package aster (R Core Team [R: A Language and Environment for Statistical Computing (2012) R Foundation for Statistical Computing], Geyer [R package aster (2012) http://cran.r-project.org/package=aster]). Applications are analyses of local adaptation in the invasive California wild radish (Raphanus sativus) and the slender wild oat (Avena barbata) and of additive genetic variance for fitness in the partridge pea (Chamaecrista fasciculata).

Submitted 29 November, 2013; originally announced November 2013.

Comments: Published at http://dx.doi.org/10.1214/13-AOAS653 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

Report number: IMS-AOAS-AOAS653

Journal ref: Annals of Applied Statistics 2013, Vol. 7, No. 3, 1778-1795

arXiv:1310.3892 [pdf, other]

Ridge Fusion in Statistical Learning

Authors: Bradley S. Price, Charles J. Geyer, Adam J. Rothman

Abstract: We propose a penalized likelihood method to jointly estimate multiple precision matrices for use in quadratic discriminant analysis and model based clustering. A ridge penalty and a ridge fusion penalty are used to introduce shrinkage and promote similarity between precision matrix estimates. Block-wise coordinate descent is used for optimization, and validation likelihood is used for tuning parameter selection. Our method is applied in quadratic discriminant analysis and semi-supervised model based clustering.

Submitted 5 May, 2014; v1 submitted 14 October, 2013; originally announced October 2013.

Comments: 24 pages and 9 tables, 3 figures

arXiv:1302.6741 [pdf, ps, other]

doi 10.1214/12-AOS1048

Variable transformation to obtain geometric ergodicity in the random-walk Metropolis algorithm

Authors: Leif T. Johnson, Charles J. Geyer

Abstract: A random-walk Metropolis sampler is geometrically ergodic if its equilibrium density is super-exponentially light and satisfies a curvature condition [Stochastic Process. Appl. 85 (2000) 341-361]. Many applications, including Bayesian analysis with conjugate priors of logistic and Poisson regression and of log-linear models for categorical data, result in posterior distributions that are not super-exponentially light. We show how to apply the change-of-variable formula for diffeomorphisms to obtain new densities that do satisfy the conditions for geometric ergodicity. Sampling the new variable and mapping the results back to the old gives a geometrically ergodic sampler for the original variable. This method of obtaining geometric ergodicity has very wide applicability.

Submitted 11 December, 2013; v1 submitted 27 February, 2013; originally announced February 2013.

Comments: Published at http://dx.doi.org/10.1214/12-AOS1048 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org). With Corrections

Report number: IMS-AOS-AOS1048

Journal ref: Annals of Statistics 2012, Vol. 40, No. 6, 3050-3076
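
Illustrative note on the entry above: the abstract's recipe (sample a transformed variable with random-walk Metropolis, then map the draws back) can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the authors' implementation. The standard Cauchy target, the map `h = sinh`, and the function names are all hypothetical choices made for illustration, and no claim is made that this particular map satisfies the paper's conditions for geometric ergodicity. The sketch only shows the change-of-variable mechanics: the induced log density is the target log density at `h(gamma)` plus the log of `|h'(gamma)|`.

```python
import math
import random


def log_target(x):
    # Standard Cauchy log density up to a constant (assumed example target;
    # its tails are heavy, so it is not super-exponentially light).
    return -math.log1p(x * x)


def h(gamma):
    # Hypothetical diffeomorphism of the real line, chosen only for illustration.
    return math.sinh(gamma)


def log_induced(gamma):
    # Change of variable: log pi(h(gamma)) + log |h'(gamma)|, with h' = cosh.
    return log_target(h(gamma)) + math.log(math.cosh(gamma))


def rw_metropolis_transformed(n_iter, scale=1.0, seed=0):
    """Random-walk Metropolis on gamma; returns draws mapped back to x = h(gamma)."""
    rng = random.Random(seed)
    gamma = 0.0
    logp = log_induced(gamma)
    draws = []
    for _ in range(n_iter):
        proposal = gamma + rng.gauss(0.0, scale)
        logp_proposal = log_induced(proposal)
        # 1 - U lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < logp_proposal - logp:
            gamma, logp = proposal, logp_proposal
        draws.append(h(gamma))  # map back to the original variable
    return draws


if __name__ == "__main__":
    xs = sorted(rw_metropolis_transformed(20000, seed=1))
    print("sample median:", xs[len(xs) // 2])
```

For a standard Cauchy target the sample median should sit near 0 and about half the draws should fall in (-1, 1), which gives a quick sanity check on the mapped-back sample.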

arXiv:1206.4762 [pdf, ps, other]

Asymptotics of Maximum Likelihood without the LLN or CLT or Sample Size Going to Infinity

Authors: Charles J. Geyer

Abstract: If the log likelihood is approximately quadratic with constant Hessian, then the maximum likelihood estimator (MLE) is approximately normally distributed. No other assumptions are required. We do not need independent and identically distributed data. We do not need the law of large numbers (LLN) or the central limit theorem (CLT). We do not need sample size going to infinity or anything going to infinity. Presented here is a combination of Le Cam style theory involving local asymptotic normality (LAN) and local asymptotic mixed normality (LAMN) and Cramér style theory involving derivatives and Fisher information. The main tool is convergence in law of the log likelihood function and its derivatives considered as random elements of a Polish space of continuous functions with the metric of uniform convergence on compact sets. We obtain results for both one-step-Newton estimators and Newton-iterated-to-convergence estimators.

Submitted 4 July, 2012; v1 submitted 20 June, 2012; originally announced June 2012.

arXiv:0901.0455 [pdf, ps, other]

Likelihood Inference in Exponential Families and Directions of Recession

Authors: Charles J. Geyer

Abstract: When in a full exponential family the maximum likelihood estimate (MLE) does not exist, the MLE may exist in the Barndorff-Nielsen completion of the family. We propose a practical algorithm for finding the MLE in the completion based on repeated linear programming using the R contributed package rcdd and illustrate it with two generalized linear model examples. When the MLE for the null hypothesis lies in the completion, likelihood ratio tests of model comparison are almost unchanged from the usual case. Only the degrees of freedom need to be adjusted. When the MLE lies in the completion, confidence intervals are changed much more from the usual case. The MLE of the natural parameter can be thought of as having gone to infinity in a certain direction, which we call a generic direction of recession. We propose a new one-sided confidence interval which says how close to infinity the natural parameter may be. This maps to one-sided confidence intervals for mean values showing how close to the boundary of their support they may be.

Submitted 5 January, 2009; originally announced January 2009.

Comments: Submitted to the Electronic Journal of Statistics (http://www.i-journals.org/ejs/) by the Institute of Mathematical Statistics (http://www.imstat.org)

Report number: IMS-EJS-EJS_2008_349

MSC Class: 62F99 (Primary); 52B55 (Secondary)

arXiv:0708.2184 [pdf, ps, other]

doi 10.1214/009053606000001389

Monte Carlo likelihood inference for missing data models

Authors: Yun Ju Sung, Charles J. Geyer

Abstract: We describe a Monte Carlo method to approximate the maximum likelihood estimate (MLE), when there are missing data and the observed data likelihood is not available in closed form. This method uses simulated missing data that are independent and identically distributed and independent of the observed data. Our Monte Carlo approximation to the MLE is a consistent and asymptotically normal estimate of the minimizer $\theta^*$ of the Kullback-Leibler information, as both Monte Carlo and observed data sample sizes go to infinity simultaneously. Plug-in estimates of the asymptotic variance are provided for constructing confidence regions for $\theta^*$. We give Logit-Normal generalized linear mixed model examples, calculated using an R package.

Submitted 16 August, 2007; originally announced August 2007.

Comments: Published at http://dx.doi.org/10.1214/009053606000001389 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

Report number: IMS-AOS-AOS0194

MSC Class: 62F12 (Primary); 65C05 (Secondary)

Journal ref: Annals of Statistics 2007, Vol. 35, No. 3, 990-1011

Showing 1–9 of 9 results for author: Geyer, C J