The administration of hemodialysis (HD) treatment leads to the continuous collection of a vast quantity of medical data. Many variables related to the patient health status, to the treatment, and to dialyzer settings can be recorded and stored at each treatment session. In this study a dataset of 42 variables and 1526 patients extracted from the Fresenius Medical Care database EuCliD was used to develop and apply a random forest predictive model for the prediction of cardiovascular events in the first year of HD treatment. A ridge-lasso logistic regression algorithm was then applied to the subset of variables mostly involved in the prediction model to get insights in the mechanisms underlying the incidence of cardiovascular complications in this high risk population of patients.