Multi-Modal Deep Learning for Assessing Surgeon Technical Skill

Kevin Kasa; David Burns; Mitchell G Goldenberg; Omar Selim; Cari Whyne; Michael Hardisty

doi:10.3390/s22197328

Sensors (Basel). 2022 Sep 27;22(19):7328. doi: 10.3390/s22197328.

Authors

Kevin Kasa¹, David Burns¹ ² ³, Mitchell G Goldenberg⁴, Omar Selim⁵, Cari Whyne¹ ² ³, Michael Hardisty¹ ³

Affiliations

¹ Orthopaedic Biomechanics Lab, Holland Bone and Joint Program, Sunnybrook Research Institute, Toronto, ON M4N 3M5, Canada.
² Institute of Biomedical Engineering, University of Toronto, Toronto, ON M5S 1A1, Canada.
³ Division of Orthopaedic Surgery, Department of Surgery, University of Toronto, Toronto, ON M5S 1A1, Canada.
⁴ Division of Urology, Department of Surgery, University of Toronto, Toronto, ON M5S 1A1, Canada.
⁵ Department of Surgery, Royal Victoria Regional Health Center, Barrie, ON L4M 6M2, Canada.

Abstract

This paper introduces a new dataset of a surgical knot-tying task and a multi-modal deep learning model that achieves performance comparable to expert human raters on this skill assessment task. Seventy-two surgical trainees and faculty were recruited for the knot-tying task, and their performances were captured as video, kinematic, and image data. Three expert human raters conducted the skills assessment using the Objective Structured Assessment of Technical Skill (OSATS) Global Rating Scale (GRS). We also designed and developed three deep learning models: a ResNet-based image model, a ResNet-LSTM kinematic model, and a multi-modal model leveraging the image and time-series kinematic data. All three models demonstrate performance comparable to the expert human raters on most GRS domains. The multi-modal model demonstrates the best overall performance, as measured using the mean squared error (MSE) and intraclass correlation coefficient (ICC). This work is significant because it demonstrates that multi-modal deep learning has the potential to replicate human raters on a challenging human-performed knot-tying task. The study demonstrates an algorithm with state-of-the-art performance in surgical skill assessment. As objective assessment of technical skill continues to be a growing, but resource-heavy, element of surgical education, this study is an important step towards automated surgical skill assessment, which could ultimately reduce the burden on training faculty and institutes.
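The two evaluation metrics named in the abstract can be illustrated with a minimal sketch. The abstract does not state which ICC form the authors used, so the code below assumes the two-way random-effects, absolute-agreement, single-rater form, often written ICC(2,1). The function names `mean_squared_error` and `icc_2_1` are hypothetical, and the scores in the demo are made up for illustration; they are not data from the study.

```python
def mean_squared_error(predicted, reference):
    """Mean squared error between two equal-length score sequences."""
    if len(predicted) != len(reference):
        raise ValueError("score sequences must have the same length")
    return sum((p - r) ** 2 for p, r in zip(predicted, reference)) / len(predicted)


def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` is a list of rows, one row per rated performance (subject),
    one column per rater. Assumption: this ICC form is chosen only for
    illustration; the abstract does not specify the variant used.
    """
    n = len(ratings)        # number of subjects
    k = len(ratings[0])     # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


if __name__ == "__main__":
    # Made-up scores for five performances (not from the study).
    model_scores = [3.1, 4.0, 2.2, 4.6, 3.4]
    rater_scores = [3.0, 4.0, 2.0, 5.0, 3.0]
    print("MSE:", mean_squared_error(model_scores, rater_scores))
    # Treat the model as one more "rater" alongside the human rater.
    paired = [[m, r] for m, r in zip(model_scores, rater_scores)]
    print("ICC(2,1):", icc_2_1(paired))
```

Treating the model as an additional rater, as in the demo, is one way to put model-versus-human agreement on the same scale as human-versus-human agreement; whether the study computed its comparison this way is not stated in the abstract.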

Keywords: biomedical engineering; computer vision; deep learning; human activity recognition; machine learning; multi-modal; surgical education; surgical skills assessment.

MeSH terms

Algorithms
Clinical Competence
Deep Learning*
Humans
Surgeons*
Suture Techniques / education

Grants and funding

This research was funded by the Wyss Medical Foundation and Feldberg Chair for Spinal Research.