Machine Learning Model as a Useful Tool for Prediction of Thyroid Nodules Histology, Aggressiveness and Treatment-Related Complications

Valeria Dell'Era; Alan Perotti; Michele Starnini; Massimo Campagnoli; Maria Silvia Rosa; Irene Saino; Paolo Aluffi Valletti; Massimiliano Garzaro

doi:10.3390/jpm13111615

J Pers Med. 2023 Nov 17;13(11):1615. doi: 10.3390/jpm13111615.

Authors

Valeria Dell'Era¹, Alan Perotti², Michele Starnini²,³, Massimo Campagnoli⁴, Maria Silvia Rosa¹, Irene Saino¹, Paolo Aluffi Valletti⁵, Massimiliano Garzaro⁵

Affiliations

¹ ENT Division, Novara Maggiore Hospital, 28100 Novara, Italy.
² CENTAI Institute, 10138 Turin, Italy.
³ Departament de Fisica, Universitat Politecnica de Catalunya, Campus Nord, 08034 Barcelona, Spain.
⁴ Department of Otorhinolaryngology, Ss. Trinità Hospital, 28021 Borgomanero, Italy.
⁵ ENT Division, Health Science Department, School of Medicine, Università del Piemonte Orientale, 28100 Novara, Italy.

Abstract

Thyroid nodules are very common, and 5-15% of them are malignant. Despite the low mortality rate of well-differentiated thyroid cancer, some variants may behave aggressively, making nodule differentiation mandatory. Ultrasound and fine-needle aspiration biopsy are simple, safe, cost-effective and accurate diagnostic tools, but they have some potential limitations. Recently, machine learning (ML) approaches have been successfully applied to healthcare datasets to predict the outcomes of surgical procedures. The aim of this work is to apply ML to predict tumor histology (HIS), aggressiveness and post-surgical complications in thyroid patients. This retrospective study was conducted at the ENT Division of Eastern Piedmont University, Novara (Italy), and reports data on 1218 patients who underwent surgery between January 2006 and December 2018. For each patient, general information, HIS and outcomes are reported. For each prediction task, we trained ML models on pre-surgery features alone as well as on both pre- and post-surgery data. The ML pipeline included data cleaning, oversampling to deal with unbalanced datasets, exploration of the hyper-parameter space for random forest models, testing of their stability and ranking of feature importance. The main results are (i) the construction of a rich, hand-curated, open dataset including pre- and post-surgery features and (ii) the development of accurate yet explainable ML models. The results highlight pre-screening as the most important feature for predicting HIS and aggressiveness and show that, in our population, an out-of-range (Low) fT3 dosage at pre-operative examination is strongly associated with higher aggressiveness of the disease. Our work shows how ML models can find patterns in thyroid patient data and could support clinicians in refining diagnostic tools and improving their accuracy.
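
As an illustration of the oversampling step mentioned in the abstract, the following is a minimal sketch of random oversampling, which duplicates minority-class samples until all classes are equally represented. The abstract does not state which oversampling method the authors used, so this technique is an assumption. The function name `random_oversample`, the feature rows and the labels are hypothetical, made-up values, not data from the study.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until every class
    has as many samples as the largest class.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())

    # Start from the original data and append duplicates only.
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Toy, made-up data: two hypothetical features per patient.
rows = [[0.1, 1], [0.4, 0], [0.3, 1], [0.9, 0], [0.8, 1], [0.2, 0]]
labels = ["benign", "benign", "benign", "benign", "malignant", "benign"]

balanced_rows, balanced_labels = random_oversample(rows, labels)
print(Counter(balanced_labels))
```

In practice, oversampling is normally applied to the training split only, so that duplicated samples do not leak into the data used for evaluation.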

Keywords: machine learning; surgical approach; surgical complication; thyroid cancer.

Grants and funding

This research received no external funding.