Zum Hauptinhalt springen

Showing 1–3 of 3 results for author: Ozguroglu, E

Searching in archive cs. Search in all archives.
.
  1. arXiv:2406.16862  [pdf, other

    cs.RO cs.CV

    Dreamitate: Real-World Visuomotor Policy Learning via Video Generation

    Authors: Junbang Liang, Ruoshi Liu, Ege Ozguroglu, Sruthi Sudhakar, Achal Dave, Pavel Tokmakov, Shuran Song, Carl Vondrick

    Abstract: A key challenge in manipulation is learning a policy that can robustly generalize to diverse visual environments. A promising mechanism for learning robust policies is to leverage video generative models, which are pretrained on large-scale datasets of internet videos. In this paper, we propose a visuomotor policy learning framework that fine-tunes a video diffusion model on human demonstrations o… ▽ More

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: Project page: https://dreamitate.cs.columbia.edu/

  2. arXiv:2405.14868  [pdf, other

    cs.CV cs.AI cs.LG cs.RO

    Generative Camera Dolly: Extreme Monocular Dynamic Novel View Synthesis

    Authors: Basile Van Hoorick, Rundi Wu, Ege Ozguroglu, Kyle Sargent, Ruoshi Liu, Pavel Tokmakov, Achal Dave, Changxi Zheng, Carl Vondrick

    Abstract: Accurate reconstruction of complex dynamic scenes from just a single viewpoint continues to be a challenging task in computer vision. Current dynamic novel view synthesis methods typically require videos from many different camera viewpoints, necessitating careful recording setups, and significantly restricting their utility in the wild as well as in terms of embodied AI applications. In this pape… ▽ More

    Submitted 5 July, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

    Comments: Accepted to ECCV 2024. Project webpage is available at: https://gcd.cs.columbia.edu/

  3. arXiv:2401.14398  [pdf, other

    cs.CV cs.LG

    pix2gestalt: Amodal Segmentation by Synthesizing Wholes

    Authors: Ege Ozguroglu, Ruoshi Liu, Dídac Surís, Dian Chen, Achal Dave, Pavel Tokmakov, Carl Vondrick

    Abstract: We introduce pix2gestalt, a framework for zero-shot amodal segmentation, which learns to estimate the shape and appearance of whole objects that are only partially visible behind occlusions. By capitalizing on large-scale diffusion models and transferring their representations to this task, we learn a conditional diffusion model for reconstructing whole objects in challenging zero-shot cases, incl… ▽ More

    Submitted 25 January, 2024; originally announced January 2024.

    Comments: Website: https://gestalt.cs.columbia.edu/