This study compared quantitative and qualitative image quality (IQ) measurements with clinical judgement of IQ in positron emission tomography (PET). The limitations of IQ metrics and the proposed criteria of acceptability for PET scanners are discussed. Phantom and patient images were reconstructed using seven different iterative reconstruction protocols. For each reconstructed set of images, IQ was scored both by visual analysis and by quantitative metrics. The quantitative physics metrics did not rank the reconstruction protocols in the same order as the clinicians' scoring of perceived IQ (R(s)=-0.54). Agreement was better between the clinical perception of IQ and the physicist's visual assessment of IQ in the phantom images (R(s)=+0.59). The closest agreement was between the quantitative physics metrics and the measured standardised uptake values (SUVs) in small tumours (R(s)=+0.92). Given the disparity between the clinical perception of IQ and the physics metrics, a cautious approach to the use of IQ measurements for determining suspension levels is warranted.
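
The R(s) values above compare how two scoring methods rank the same seven reconstruction protocols. Assuming R(s) denotes Spearman's rank correlation coefficient (the abstract does not state this explicitly), the following minimal sketch shows how such a value could be computed for untied scores. The function names and the example scores are hypothetical illustrations, not the study's data or code.

```python
def ranks(scores):
    """Return ranks 1..n (1 = lowest score). Assumes no tied values."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    result = [0] * len(scores)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman(x, y):
    """Spearman rank correlation for two equal-length, untied score lists.

    Uses the untied-ranks formula r_s = 1 - 6 * sum(d^2) / (n * (n^2 - 1)),
    where d is the per-item difference between the two ranks.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length >= 2")
    rank_x, rank_y = ranks(x), ranks(y)
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Hypothetical scores for seven protocols (illustration only, not study data).
physics_metric = [0.61, 0.72, 0.55, 0.80, 0.67, 0.90, 0.48]
clinical_score = [3.1, 2.4, 3.8, 2.0, 2.9, 1.5, 4.2]
r_s = spearman(physics_metric, clinical_score)
```

A value near +1 means the two methods order the protocols almost identically, a value near -1 means they order them in nearly opposite ways, and a value near 0 means the orderings are largely unrelated. If scores can tie, a tie-aware implementation such as `scipy.stats.spearmanr` would be the safer choice.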