Coronavirus disease 2019 (COVID-19) has become a severe worldwide health emergency and is spreading rapidly. Segmenting COVID-19 lesions from computed tomography (CT) scans is of great importance for monitoring disease progression and planning subsequent clinical treatment. Because labeling COVID-19 CT scans is labor-intensive and time-consuming, it is essential to develop a segmentation method that works with limited labeled data. In this paper, we propose a self-ensembled co-training framework, trained on limited labeled data and large-scale unlabeled data, to automatically extract COVID-19 lesions from CT scans. Specifically, to enrich the diversity of unsupervised information, we build a co-training framework consisting of two collaborative models that teach each other during training through the pseudo-labels each predicts for the unlabeled data. Moreover, to alleviate the adverse impact of noisy pseudo-labels on each model, we propose a self-ensembling strategy that applies consistency regularization to the up-to-date predictions on unlabeled data, where these predictions are gradually ensembled via a moving average at the end of every training epoch. We evaluate our framework on a COVID-19 dataset containing 103 CT scans. Experimental results show that, with only 4 labeled CT scans, our method outperforms state-of-the-art semi-supervised segmentation networks.
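The two unsupervised ingredients described above, cross-teaching through pseudo-labels and a per-epoch moving average of predictions used as a consistency target, can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the function names, the binary cross-entropy and mean-squared-error losses, the 0.5 threshold, the weight `lam`, the smoothing factor `alpha`, and the way the ensembles are initialized are all assumptions chosen for illustration, and random arrays stand in for the outputs of the two segmentation models.

```python
import numpy as np


def update_ensemble(ensemble, prediction, alpha=0.6):
    """Moving average of predictions, applied once per epoch (alpha is assumed)."""
    return alpha * ensemble + (1.0 - alpha) * prediction


def to_pseudo_label(prob, threshold=0.5):
    """Binarize foreground probabilities into a pseudo-label mask (threshold is assumed)."""
    return (prob >= threshold).astype(np.float32)


def bce(prob, target, eps=1e-7):
    """Mean binary cross-entropy between probabilities and a binary target."""
    p = np.clip(prob, eps, 1.0 - eps)
    return float(-np.mean(target * np.log(p) + (1.0 - target) * np.log(1.0 - p)))


def unsupervised_losses(prob_a, prob_b, ens_a, ens_b, lam=1.0):
    """Hypothetical unsupervised loss for each model on one unlabeled batch.

    Each model is fitted to the other model's pseudo-labels (cross-teaching)
    and pulled toward its own ensembled prediction (consistency term).
    """
    loss_a = bce(prob_a, to_pseudo_label(prob_b)) + lam * float(np.mean((prob_a - ens_a) ** 2))
    loss_b = bce(prob_b, to_pseudo_label(prob_a)) + lam * float(np.mean((prob_b - ens_b) ** 2))
    return loss_a, loss_b


# Toy run: random maps stand in for the two models' predictions on one unlabeled slice.
rng = np.random.default_rng(0)
ens_a = rng.random((8, 8))  # assumed initialization: first-epoch predictions
ens_b = rng.random((8, 8))
for epoch in range(3):
    prob_a = rng.random((8, 8))
    prob_b = rng.random((8, 8))
    loss_a, loss_b = unsupervised_losses(prob_a, prob_b, ens_a, ens_b)
    # The ensembles are refreshed at the end of the epoch, as the abstract describes.
    ens_a = update_ensemble(ens_a, prob_a)
    ens_b = update_ensemble(ens_b, prob_b)
```

In a real training loop these terms would be added to a supervised loss on the labeled scans and back-propagated through each network; the sketch only shows how the targets are formed.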