During the COVID-19 pandemic, lung ultrasound (LUS) has played a useful role in evaluating affected patients. However, LUS remains limited to the visual inspection of ultrasound data, which negatively affects the reliability and reproducibility of the findings. Moreover, many different imaging protocols have been proposed, most of which lacked proper clinical validation. To address these problems, we were the first to propose a standardized imaging protocol and scoring system. Next, we developed the first deep learning (DL) algorithms capable of evaluating LUS videos, providing a score and a semantic segmentation for each video frame. We have also analyzed the impact of different imaging protocols and demonstrated the prognostic value of our approach. In this work, we report on the level of agreement between DL and LUS experts when evaluating LUS data. The results show 85.96% agreement between DL and LUS experts in stratifying patients into those at high risk and those at low risk of clinical worsening. These encouraging results demonstrate the potential of DL models for the automatic scoring of LUS data when applied to high-quality data acquired according to a standardized imaging protocol.
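As an illustration of the agreement metric reported above, the following minimal sketch computes the percentage of patients for whom two raters assign the same risk class. The function name `percent_agreement` and the toy labels are hypothetical and are not taken from the study; the actual evaluation pipeline may differ.

```python
def percent_agreement(labels_a, labels_b):
    """Return the percentage of items on which two raters assign the same label."""
    if len(labels_a) != len(labels_b):
        raise ValueError("Both raters must label the same number of patients.")
    if not labels_a:
        raise ValueError("At least one labeled patient is required.")
    # Count patients for whom both raters chose the same risk class.
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return 100.0 * matches / len(labels_a)


# Hypothetical toy labels (NOT study data): one risk class per patient.
dl_risk = ["high", "low", "low", "high", "low"]
expert_risk = ["high", "low", "high", "high", "low"]

agreement = percent_agreement(dl_risk, expert_risk)
print(f"Agreement: {agreement:.2f}%")
```

Raw percentage agreement does not correct for agreement expected by chance; a chance-corrected statistic such as Cohen's kappa could be reported alongside it.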