Purpose: Image-to-image translation methods can address the lack of diversity in publicly available cataract surgery data. However, applying image-to-image translation to videos, which are frequently used in medical downstream applications, induces artifacts. Additional spatio-temporal constraints are needed to produce realistic translations and to improve the temporal consistency of translated image sequences.
Methods: We introduce a motion-translation module that translates optical flows between domains to impose such constraints. We combine it with a shared latent space translation model to improve image quality. We evaluate the translated sequences with respect to image quality and temporal consistency, proposing novel quantitative metrics for the latter. Finally, we evaluate the downstream task of surgical phase classification when its model is retrained with additional synthetic translated data.
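The abstract leaves the proposed consistency metrics unspecified; purely as an illustrative reference point (not the paper's metrics), temporal consistency is often quantified as a flow-based warping error between consecutive translated frames. A minimal numpy sketch, with hypothetical helper names and nearest-neighbour warping for brevity:

    import numpy as np

    def warp_nearest(frame, flow):
        # Backward-warp `frame` (H, W, C) along `flow` (H, W, 2),
        # using nearest-neighbour sampling for brevity.
        h, w = flow.shape[:2]
        ys, xs = np.mgrid[0:h, 0:w]
        src_x = np.clip(np.round(xs + flow[..., 0]).astype(int), 0, w - 1)
        src_y = np.clip(np.round(ys + flow[..., 1]).astype(int), 0, h - 1)
        return frame[src_y, src_x]

    def warping_error(translated, flows):
        # Mean L1 distance between each translated frame and its successor
        # warped back along the source sequence's optical flow; lower values
        # indicate a temporally smoother translated sequence.
        errors = [np.abs(f_t - warp_nearest(f_next, flow)).mean()
                  for f_t, f_next, flow in zip(translated, translated[1:], flows)]
        return float(np.mean(errors))

Here `translated` would be the list of translated frames and `flows` the optical flows estimated on the source sequence; any flow estimator could supply the latter.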
Results: Our proposed method produces more consistent translations than state-of-the-art baselines while remaining competitive in per-image translation quality. We further show the benefit of consistently translated cataract surgery sequences for improving the downstream task of surgical phase classification.
Conclusion: The proposed module increases the temporal consistency of translated sequences. Furthermore, the imposed temporal constraints increase the usability of translated data in downstream tasks. This makes it possible to overcome some of the hurdles of surgical data acquisition and annotation, and to improve model performance by translating between existing datasets of sequential frames.
Keywords: Cataract surgery; Generative adversarial networks; Generative models; Sequence translation; Temporal consistency; Unsupervised image translation.