In this paper, we develop a maximum-likelihood (ML) spatio-temporal blind source separation (BSS) algorithm, where the temporal dependencies are explained by assuming that each source is an autoregressive (AR) process and the distribution of the associated independent identically distributed (i.i.d.) innovations process is described using a mixture of Gaussians. Unlike most ML methods, the proposed algorithm takes into account both spatial and temporal information, optimization is performed using the expectation-maximization (EM) method, the source model is adapted to maximize the likelihood, and the update equations have a simple, analytical form. The proposed method, which we refer to as autoregressive mixture of Gaussians (AR-MOG), outperforms nine other methods for artificial mixtures of real audio. We also show results for using AR-MOG to extract the fetal cardiac signal from real magnetocardiographic (MCG) data.