Thyroid nodules are very common, and 5-15% of them are malignant. Despite the low mortality rate of well-differentiated thyroid cancer, some variants may behave aggressively, making accurate differentiation of nodules essential. Ultrasound and fine-needle aspiration biopsy are simple, safe, cost-effective and accurate diagnostic tools, but they have some limitations. Recently, machine learning (ML) approaches have been successfully applied to healthcare datasets to predict the outcomes of surgical procedures. The aim of this work is to apply ML to predict tumor histology (HIS), aggressiveness and post-surgical complications in thyroid patients. This retrospective study was conducted at the ENT Division of Eastern Piedmont University, Novara (Italy), and included data on 1218 patients who underwent surgery between January 2006 and December 2018. For each patient, general information, HIS and outcomes are reported. For each prediction task, we trained ML models on pre-surgery features alone as well as on both pre- and post-surgery data. The ML pipeline included data cleaning, oversampling to deal with unbalanced datasets, exploration of the hyper-parameter space for random forest models, testing of their stability and ranking of feature importance. The main results are (i) the construction of a rich, hand-curated, open dataset including pre- and post-surgery features and (ii) the development of accurate yet explainable ML models. The results highlight pre-screening as the most important feature for predicting HIS and aggressiveness and show that, in our population, an out-of-range (Low) fT3 level at pre-operative examination is strongly associated with higher aggressiveness of the disease. Our work shows how ML models can find patterns in thyroid patient data and could support clinicians in refining diagnostic tools and improving their accuracy.
Keywords: machine learning; surgical approach; surgical complication; thyroid cancer.
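The oversampling step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state which oversampling method was used, so plain random oversampling (duplicating minority-class rows with replacement) is assumed here. The toy data, the function name `random_oversample` and the 80/20 stratified split are hypothetical and are not taken from the study. The sketch balances only the training portion, so that duplicated rows cannot leak into the held-out test set.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows (sampled with replacement) until
    every class has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Hypothetical toy data: 90 "benign" (0) and 10 "malignant" (1) records,
# each with two made-up numeric features.
rows = [[i, i % 7] for i in range(100)]
labels = [0] * 90 + [1] * 10

# Stratified 80/20 split: keep the first 80% of each class for training.
train_rows, train_labels, test_rows, test_labels = [], [], [], []
for cls in (0, 1):
    cls_rows = [r for r, y in zip(rows, labels) if y == cls]
    cut = int(0.8 * len(cls_rows))
    train_rows += cls_rows[:cut]
    train_labels += [cls] * cut
    test_rows += cls_rows[cut:]
    test_labels += [cls] * (len(cls_rows) - cut)

# Oversample the training split only; the test split keeps its
# original class imbalance.
bal_rows, bal_labels = random_oversample(train_rows, train_labels)
balanced_counts = Counter(bal_labels)
test_counts = Counter(test_labels)

print("train before:", Counter(train_labels))
print("train after: ", balanced_counts)
print("test:        ", test_counts)
```

In a full pipeline of the kind described, the balanced training rows would then be passed to the random forest hyper-parameter search, while performance would be measured on the untouched test split.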