Background: Real-time prediction of the remaining surgery duration (RSD) is important for optimal scheduling of resources in the operating room.
Methods: We focus on the intraoperative prediction of RSD from laparoscopic video. We present an extensive evaluation of seven common deep learning models, a proposed model based on the Transformer architecture (TransLocal), and four baseline approaches. The proposed pipeline includes a CNN-LSTM for feature extraction from salient regions within short video segments and a Transformer with local attention mechanisms.
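As an illustration of what local attention means in general, the sketch below restricts each video segment to attending only to itself and a few preceding segments. It is not the TransLocal implementation: the window size, the causal (past-only) direction, and the names `local_causal_mask` and `masked_softmax` are assumptions made for this example.

```python
import math

def local_causal_mask(n, window):
    """Boolean n x n mask: position i may attend to the `window` most
    recent positions up to and including i (assumed causal variant)."""
    return [[(i - window < j <= i) for j in range(n)] for i in range(n)]

def masked_softmax(scores, mask_row):
    """Softmax over one row of attention scores, with masked-out
    positions receiving zero weight."""
    kept = [s for s, m in zip(scores, mask_row) if m]
    mx = max(kept)  # subtract the max for numerical stability
    exps = [math.exp(s - mx) if m else 0.0 for s, m in zip(scores, mask_row)]
    z = sum(exps)
    return [e / z for e in exps]

# Four segments, window of two: the last segment attends to segments 2 and 3.
mask = local_causal_mask(4, 2)
weights = masked_softmax([0.0, 0.0, 0.0, 0.0], mask[3])
```

With equal scores, the attention weight is split evenly over the positions inside the window and is zero elsewhere.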
Results: Using the Cholec80 dataset, TransLocal yielded the best performance (mean absolute error (MAE) = 7.1 min). For long and short surgeries, the MAE was 10.6 and 4.4 min, respectively. Thirty minutes before the end of surgery, the MAE was 6.2, 7.2 and 5.5 min for all, long and short surgeries, respectively.
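For readers unfamiliar with the metric, the following minimal sketch shows how an MAE of this kind can be computed for a single video. It assumes that the ground-truth RSD at elapsed time t is the total duration minus t, and that the "thirty minutes before the end" figure is taken over the time points whose true RSD is at most 30 min; the abstract does not state either detail, and the function names and toy numbers are hypothetical.

```python
def true_rsd(duration_min, step_min=1.0):
    """Ground-truth RSD at each sampled time t, assumed to be duration - t."""
    n = int(duration_min / step_min) + 1
    return [duration_min - i * step_min for i in range(n)]

def mae(pred, target):
    """Mean absolute error between predicted and true RSD, in minutes."""
    if len(pred) != len(target) or not pred:
        raise ValueError("pred and target must be non-empty and equal length")
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)

def mae_last_window(pred, target, window_min=30.0):
    """MAE restricted to time points whose true RSD is within `window_min`
    (one possible reading of an end-of-surgery MAE)."""
    pairs = [(p, t) for p, t in zip(pred, target) if t <= window_min]
    return mae([p for p, _ in pairs], [t for _, t in pairs])

# Toy example: a 40-minute video sampled every 10 minutes.
target = true_rsd(40, step_min=10.0)       # [40.0, 30.0, 20.0, 10.0, 0.0]
pred = [45.0, 28.0, 20.0, 13.0, 1.0]       # made-up predictions
overall = mae(pred, target)
late = mae_last_window(pred, target, 30.0)
```

A dataset-level figure would then average such per-video values, or pool all time points, depending on the evaluation protocol.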
Conclusions: The proposed technique achieves state-of-the-art results. In the future, we aim to incorporate intraoperative indicators and pre-operative data.
Keywords: artificial intelligence; cholecystectomy; deep learning; prediction; remaining surgery duration.
© 2024 The Authors. The International Journal of Medical Robotics and Computer Assisted Surgery published by John Wiley & Sons Ltd.