Reconstructing dynamic magnetic resonance data from undersampled k-space has shown great potential for accelerating the acquisition process of this imaging modality. With the introduction of compressed sensing (CS) theory, methods for undersampled data have emerged that reconstruct images consistent with the acquired samples and compliant with a sparsity model in some transform domain. Fixed-basis transforms have been used extensively as sparsifying transforms, but recent developments in dictionary learning (DL) have been shown to outperform them by training an overcomplete basis adapted to a particular dataset. We present an iterative algorithm that enables the application of DL to the reconstruction of cardiac cine data with Cartesian undersampling. This is achieved through local processing of spatio-temporal 3D patches and independent treatment of the real and imaginary parts of the dataset. We also propose enforcing temporal gradients as an additional constraint, which can greatly accelerate convergence and improve reconstruction at high acceleration rates. The method is compared with k-t FOCUSS, a successful CS method that uses a fixed-basis transform, and is shown to outperform it systematically.
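As an illustration of the patch-based processing described above, the following minimal sketch shows one way to collect overlapping spatio-temporal 3D patches from a complex cine volume and to keep the real and imaginary parts as separate training sets. It is not the authors' implementation: the function names (`extract_patches`, `real_imag_patches`), the (x, y, t) axis ordering, the patch size, and the stride are all assumptions made for the example, and the dictionary training and data-consistency steps are omitted.

```python
import numpy as np


def extract_patches(volume, patch_shape, stride=1):
    """Collect overlapping 3D patches of an (x, y, t) array as matrix rows.

    Hypothetical helper: each row is one flattened spatio-temporal patch.
    """
    px, py, pt = patch_shape
    nx, ny, nt = volume.shape
    rows = []
    for i in range(0, nx - px + 1, stride):
        for j in range(0, ny - py + 1, stride):
            for k in range(0, nt - pt + 1, stride):
                rows.append(volume[i:i + px, j:j + py, k:k + pt].ravel())
    return np.array(rows)


def real_imag_patches(volume, patch_shape, stride=1):
    """Split a complex volume into real and imaginary parts and extract
    patches from each part independently (assumed reading of the text)."""
    real = extract_patches(volume.real, patch_shape, stride)
    imag = extract_patches(volume.imag, patch_shape, stride)
    return real, imag
```

For a complex array `vol` of shape (4, 4, 3), calling `real_imag_patches(vol, (2, 2, 2))` returns two real-valued matrices with one row per patch position and eight columns per row; either matrix could then be passed to a dictionary-learning routine of the reader's choice.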