Medical diagnosis can be automated using computer-based systems or algorithms, usually called diagnostic decision support systems (DDSSs) or medical diagnosis systems (MDSs). This paper evaluates the performance of one such system, ML-DDSS. In the methodology, physicians resolve a set of clinical cases, and their resolutions are then used to evaluate the behavior of ML-DDSS. This allows several well-known metrics to be calculated: precision, recall (sensitivity), accuracy, specificity, and the Matthews correlation coefficient (MCC). The analysis reveals interesting results about the behavior of both the system and the physicians who took part in the evaluation. Overall, the results suggest that ML-DDSS would have significant utility in medical practice: the MCC metric shows an improvement of about 30% over the experts, and the system also achieves better sensitivity than the experts.
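
For readers unfamiliar with the metrics named above, the following minimal sketch shows how they are conventionally derived from the four counts of a binary confusion matrix. It is an illustration of the standard definitions only, not the evaluation code used for ML-DDSS; the function name `diagnostic_metrics`, its parameters, and the example counts are hypothetical, and returning 0.0 when a denominator is zero is an assumed convention.

```python
import math


def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute standard binary-classification metrics from confusion counts.

    tp/fp/tn/fn are true positives, false positives, true negatives and
    false negatives. All names here are illustrative assumptions.
    """

    def ratio(num: float, den: float) -> float:
        # Assumed convention: an undefined ratio (zero denominator) is 0.0.
        return num / den if den else 0.0

    # MCC denominator: square root of the product of the four marginal sums.
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))

    return {
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),  # also called sensitivity
        "specificity": ratio(tn, tn + fp),
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "mcc": ratio(tp * tn - fp * fn, mcc_den),
    }


if __name__ == "__main__":
    # Made-up counts, purely to show the call; not data from the study.
    for name, value in diagnostic_metrics(tp=40, fp=10, tn=45, fn=5).items():
        print(f"{name}: {value:.3f}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix and ranges from -1 to 1, which makes it a more balanced summary when positive and negative cases are unevenly represented.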