Search | arXiv e-print repository

E-DSSR: Efficient Dynamic Surgical Scene Reconstruction with Transformer-based Stereoscopic Depth Perception

Authors: Yonghao Long, Zhaoshuo Li, Chi Hang Yee, Chi Fai Ng, Russell H. Taylor, Mathias Unberath, Qi Dou

Abstract: Reconstructing the scene of robotic surgery from the stereo endoscopic video is an important and promising topic in surgical data science, which potentially supports many applications such as surgical visual perception, robotic surgery education and intra-operative context awareness. However, current methods are mostly restricted to reconstructing static anatomy assuming no tissue deformation, too… ▽ More Reconstructing the scene of robotic surgery from the stereo endoscopic video is an important and promising topic in surgical data science, which potentially supports many applications such as surgical visual perception, robotic surgery education and intra-operative context awareness. However, current methods are mostly restricted to reconstructing static anatomy assuming no tissue deformation, tool occlusion and de-occlusion, and camera movement. However, these assumptions are not always satisfied in minimal invasive robotic surgeries. In this work, we present an efficient reconstruction pipeline for highly dynamic surgical scenes that runs at 28 fps. Specifically, we design a transformer-based stereoscopic depth perception for efficient depth estimation and a light-weight tool segmentor to handle tool occlusion. After that, a dynamic reconstruction algorithm which can estimate the tissue deformation and camera movement, and aggregate the information over time is proposed for surgical scene reconstruction. We evaluate the proposed pipeline on two datasets, the public Hamlyn Centre Endoscopic Video Dataset and our in-house DaVinci robotic surgery dataset. The results demonstrate that our method can recover the scene obstructed by the surgical tool and handle the movement of camera in realistic surgical scenarios effectively at real-time speed. △ Less

Submitted 1 July, 2021; originally announced July 2021.

Comments: Accepted to MICCAI 2021

arXiv:1801.07860 [pdf]

doi 10.1038/s41746-018-0029-1

Scalable and accurate deep learning for electronic health records

Authors: Alvin Rajkomar, Eyal Oren, Kai Chen, Andrew M. Dai, Nissan Hajaj, Peter J. Liu, Xiaobing Liu, Mimi Sun, Patrik Sundberg, Hector Yee, Kun Zhang, Gavin E. Duggan, Gerardo Flores, Michaela Hardt, Jamie Irvine, Quoc Le, Kurt Litsch, Jake Marcus, Alexander Mossin, Justin Tansuwan, De Wang, James Wexler, Jimbo Wilson, Dana Ludwig, Samuel L. Volchenboum , et al. (9 additional authors not shown)

Abstract: Predictive modeling with electronic health record (EHR) data is anticipated to drive personalized medicine and improve healthcare quality. Constructing predictive statistical models typically requires extraction of curated predictor variables from normalized EHR data, a labor-intensive process that discards the vast majority of information in each patient's record. We propose a representation of p… ▽ More Predictive modeling with electronic health record (EHR) data is anticipated to drive personalized medicine and improve healthcare quality. Constructing predictive statistical models typically requires extraction of curated predictor variables from normalized EHR data, a labor-intensive process that discards the vast majority of information in each patient's record. We propose a representation of patients' entire, raw EHR records based on the Fast Healthcare Interoperability Resources (FHIR) format. We demonstrate that deep learning methods using this representation are capable of accurately predicting multiple medical events from multiple centers without site-specific data harmonization. We validated our approach using de-identified EHR data from two U.S. academic medical centers with 216,221 adult patients hospitalized for at least 24 hours. In the sequential format we propose, this volume of EHR data unrolled into a total of 46,864,534,945 data points, including clinical notes. Deep learning models achieved high accuracy for tasks such as predicting in-hospital mortality (AUROC across sites 0.93-0.94), 30-day unplanned readmission (AUROC 0.75-0.76), prolonged length of stay (AUROC 0.85-0.86), and all of a patient's final discharge diagnoses (frequency-weighted AUROC 0.90). These models outperformed state-of-the-art traditional predictive models in all cases. We also present a case-study of a neural-network attribution system, which illustrates how clinicians can gain some transparency into the predictions. We believe that this approach can be used to create accurate and scalable predictions for a variety of clinical scenarios, complete with explanations that directly highlight evidence in the patient's chart. △ Less

Submitted 11 May, 2018; v1 submitted 24 January, 2018; originally announced January 2018.

Comments: Published version from https://www.nature.com/articles/s41746-018-0029-1

Journal ref: npj Digital Medicine 1:18 (2018)

arXiv:1301.4171 [pdf, ps, other]

Affinity Weighted Embedding

Authors: Jason Weston, Ron Weiss, Hector Yee

Abstract: Supervised (linear) embedding models like Wsabie and PSI have proven successful at ranking, recommendation and annotation tasks. However, despite being scalable to large datasets they do not take full advantage of the extra data due to their linear nature, and typically underfit. We propose a new class of models which aim to provide improved performance while retaining many of the benefits of the… ▽ More Supervised (linear) embedding models like Wsabie and PSI have proven successful at ranking, recommendation and annotation tasks. However, despite being scalable to large datasets they do not take full advantage of the extra data due to their linear nature, and typically underfit. We propose a new class of models which aim to provide improved performance while retaining many of the benefits of the existing class of embedding models. Our new approach works by iteratively learning a linear embedding model where the next iteration's features and labels are reweighted as a function of the previous iteration. We describe several variants of the family, and give some initial results. △ Less

Submitted 17 January, 2013; originally announced January 2013.

arXiv:cs/0511032 [pdf]

Spatiotemporal sensistivity and visual attention for efficient rendering of dynamic environments

Authors: Yang Li Hector Yee

Abstract: We present a method to accelerate global illumination computation in dynamic environments by taking advantage of limitations of the human visual system. A model of visual attention is used to locate regions of interest in a scene and to modulate spatiotemporal sensitivity. The method is applied in the form of a spatiotemporal error tolerance map. Perceptual acceleration combined with good sampli… ▽ More We present a method to accelerate global illumination computation in dynamic environments by taking advantage of limitations of the human visual system. A model of visual attention is used to locate regions of interest in a scene and to modulate spatiotemporal sensitivity. The method is applied in the form of a spatiotemporal error tolerance map. Perceptual acceleration combined with good sampling protocols provide a global illumination solution feasible for use in animation. Results indicate an order of magnitude improvement in computational speed. The method is adaptable and can also be used in image-based rendering, geometry level of detail selection, realistic image synthesis, video telephony and video compression. △ Less

Submitted 8 November, 2005; originally announced November 2005.

Journal ref: ACM Transactions on Graphics, 20(1), January 2001

Showing 1–4 of 4 results for author: Yee, H