Positron emission tomography (PET) has been widely used in clinical diagnosis for diseases and disorders. To obtain high-quality PET images requires a standard-dose radionuclide (tracer) injection into the human body, which inevitably increases risk of radiation exposure. One possible solution to this problem is to predict the standard-dose PET image from its low-dose counterpart and its corresponding multimodal magnetic resonance (MR) images. Inspired by the success of patch-based sparse representation (SR) in super-resolution image reconstruction, we propose a mapping-based SR (m-SR) framework for standard-dose PET image prediction. Compared with the conventional patch-based SR, our method uses a mapping strategy to ensure that the sparse coefficients, estimated from the multimodal MR images and low-dose PET image, can be applied directly to the prediction of standard-dose PET image. As the mapping between multimodal MR images (or low-dose PET image) and standard-dose PET images can be particularly complex, one step of mapping is often insufficient. To this end, an incremental refinement framework is therefore proposed. Specifically, the predicted standard-dose PET image is further mapped to the target standard-dose PET image, and then the SR is performed again to predict a new standard-dose PET image. This procedure can be repeated for prediction refinement of the iterations. Also, a patch selection based dictionary construction method is further used to speed up the prediction process. The proposed method is validated on a human brain dataset. The experimental results show that our method can outperform benchmark methods in both qualitative and quantitative measures.