Showing 1–50 of 75 results for author: Snavely, N

  1. arXiv:2407.13759  [pdf, other]

    cs.CV cs.GR

    Streetscapes: Large-scale Consistent Street View Generation Using Autoregressive Video Diffusion

    Authors: Boyang Deng, Richard Tucker, Zhengqi Li, Leonidas Guibas, Noah Snavely, Gordon Wetzstein

    Abstract: We present a method for generating Streetscapes: long sequences of views through an on-the-fly synthesized city-scale scene. Our generation is conditioned by language input (e.g., city name, weather), as well as an underlying map/layout hosting the desired trajectory. Compared to recent models for video generation or 3D view synthesis, our method can scale to much longer-range camera trajectories,…

    Submitted 18 July, 2024; originally announced July 2024.

    Comments: *Equal Contributions, Project Page: https://boyangdeng.com/streetscapes

  2. arXiv:2406.11819  [pdf, other]

    cs.CV

    MegaScenes: Scene-Level View Synthesis at Scale

    Authors: Joseph Tung, Gene Chou, Ruojin Cai, Guandao Yang, Kai Zhang, Gordon Wetzstein, Bharath Hariharan, Noah Snavely

    Abstract: Scene-level novel view synthesis (NVS) is fundamental to many vision and graphics applications. Recently, pose-conditioned diffusion models have led to significant progress by extracting 3D information from 2D foundation models, but these methods are limited by the lack of scene-level training data. Common dataset choices either consist of isolated objects (Objaverse), or of object-centric scenes…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: Our project page is at https://megascenes.github.io

  3. arXiv:2406.07520  [pdf, other]

    cs.CV cs.AI cs.GR

    Neural Gaffer: Relighting Any Object via Diffusion

    Authors: Haian Jin, Yuan Li, Fujun Luan, Yuanbo Xiangli, Sai Bi, Kai Zhang, Zexiang Xu, Jin Sun, Noah Snavely

    Abstract: Single-image relighting is a challenging task that involves reasoning about the complex interplay between geometry, materials, and lighting. Many prior methods either support only specific categories of images, such as portraits, or require special capture conditions, like using a flashlight. Alternatively, some methods explicitly decompose a scene into intrinsic components, such as normals and BR…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: Project Website: https://neural-gaffer.github.io

  4. arXiv:2404.13026  [pdf, other]

    cs.CV cs.AI

    PhysDreamer: Physics-Based Interaction with 3D Objects via Video Generation

    Authors: Tianyuan Zhang, Hong-Xing Yu, Rundi Wu, Brandon Y. Feng, Changxi Zheng, Noah Snavely, Jiajun Wu, William T. Freeman

    Abstract: Realistic object interactions are crucial for creating immersive virtual experiences, yet synthesizing realistic 3D object dynamics in response to novel interactions remains a significant challenge. Unlike unconditional or text-conditioned dynamics generation, action-conditioned dynamics requires perceiving the physical material properties of objects and grounding the 3D motion prediction on these…

    Submitted 19 April, 2024; originally announced April 2024.

    Comments: Project website at: https://physdreamer.github.io/

  5. arXiv:2312.04560  [pdf, other]

    cs.CV cs.AI cs.GR

    NeRFiller: Completing Scenes via Generative 3D Inpainting

    Authors: Ethan Weber, Aleksander Hołyński, Varun Jampani, Saurabh Saxena, Noah Snavely, Abhishek Kar, Angjoo Kanazawa

    Abstract: We propose NeRFiller, an approach that completes missing portions of a 3D capture via generative 3D inpainting using off-the-shelf 2D visual generative models. Often parts of a captured 3D scene or object are missing due to mesh reconstruction failures or a lack of observations (e.g., contact regions, such as the bottom of objects, or hard-to-reach areas). We approach this challenging 3D inpaintin…

    Submitted 7 December, 2023; originally announced December 2023.

    Comments: Project page: https://ethanweber.me/nerfiller

  6. arXiv:2312.03884  [pdf, other]

    cs.CV cs.GR

    WonderJourney: Going from Anywhere to Everywhere

    Authors: Hong-Xing Yu, Haoyi Duan, Junhwa Hur, Kyle Sargent, Michael Rubinstein, William T. Freeman, Forrester Cole, Deqing Sun, Noah Snavely, Jiajun Wu, Charles Herrmann

    Abstract: We introduce WonderJourney, a modularized framework for perpetual 3D scene generation. Unlike prior work on view generation that focuses on a single type of scenes, we start at any user-provided location (by a text description or an image) and generate a journey through a long sequence of diverse yet coherently connected 3D scenes. We leverage an LLM to generate textual descriptions of the scenes…

    Submitted 12 April, 2024; v1 submitted 6 December, 2023; originally announced December 2023.

    Comments: Project website with video results: https://kovenyu.com/WonderJourney/

  7. arXiv:2309.07906  [pdf, other]

    cs.CV

    Generative Image Dynamics

    Authors: Zhengqi Li, Richard Tucker, Noah Snavely, Aleksander Holynski

    Abstract: We present an approach to modeling an image-space prior on scene motion. Our prior is learned from a collection of motion trajectories extracted from real video sequences depicting natural, oscillatory dynamics such as trees, flowers, candles, and clothes swaying in the wind. We model this dense, long-term motion prior in the Fourier domain: given a single image, our trained model uses a frequency-…

    Submitted 14 May, 2024; v1 submitted 14 September, 2023; originally announced September 2023.

    Comments: Project website: http://generative-dynamics.github.io

  8. arXiv:2309.02420  [pdf, other]

    cs.CV

    Doppelgangers: Learning to Disambiguate Images of Similar Structures

    Authors: Ruojin Cai, Joseph Tung, Qianqian Wang, Hadar Averbuch-Elor, Bharath Hariharan, Noah Snavely

    Abstract: We consider the visual disambiguation task of determining whether a pair of visually similar images depict the same or distinct 3D surfaces (e.g., the same or opposite sides of a symmetric building). Illusory image matches, where two images observe distinct but visually similar 3D surfaces, can be challenging for humans to differentiate, and can also lead 3D reconstruction algorithms to produce er…

    Submitted 5 September, 2023; originally announced September 2023.

    Comments: Published in ICCV 2023 (Oral); Project page: http://doppelgangers-3d.github.io/

  9. arXiv:2306.07970  [pdf, other]

    cs.CV

    Neural Scene Chronology

    Authors: Haotong Lin, Qianqian Wang, Ruojin Cai, Sida Peng, Hadar Averbuch-Elor, Xiaowei Zhou, Noah Snavely

    Abstract: In this work, we aim to reconstruct a time-varying 3D model, capable of producing photo-realistic renderings with independent control of viewpoint, illumination, and time, from Internet photos of large-scale landmarks. The core challenges are twofold. First, different types of temporal changes, such as illumination and changes to the underlying scene itself (such as replacing one graffiti artwork…

    Submitted 13 June, 2023; originally announced June 2023.

    Comments: CVPR 2023; Project page: https://zju3dv.github.io/neusc/

  10. arXiv:2306.05422  [pdf, other]

    cs.CV

    Tracking Everything Everywhere All at Once

    Authors: Qianqian Wang, Yen-Yu Chang, Ruojin Cai, Zhengqi Li, Bharath Hariharan, Aleksander Holynski, Noah Snavely

    Abstract: We present a new test-time optimization method for estimating dense and long-range motion from a video sequence. Prior optical flow or particle video tracking algorithms typically operate within limited temporal windows, struggling to track through occlusions and maintain global consistency of estimated motion trajectories. We propose a complete and globally consistent motion representation, dubbe…

    Submitted 12 September, 2023; v1 submitted 8 June, 2023; originally announced June 2023.

    Comments: ICCV 2023

  11. arXiv:2304.04848  [pdf, other]

    cs.CV

    Neural Lens Modeling

    Authors: Wenqi Xian, Aljaž Božič, Noah Snavely, Christoph Lassner

    Abstract: Recent methods for 3D reconstruction and rendering increasingly benefit from end-to-end optimization of the entire image formation process. However, this approach is currently limited: effects of the optical hardware stack and in particular lenses are hard to model in a unified way. This limits the quality that can be achieved for camera calibration and the fidelity of the results of 3D reconstruc…

    Submitted 10 April, 2023; originally announced April 2023.

    Comments: To be presented at CVPR 2023, Project webpage: https://neural-lens.github.io

  12. arXiv:2303.16201  [pdf, other]

    cs.CV cs.AI cs.LG

    ASIC: Aligning Sparse in-the-wild Image Collections

    Authors: Kamal Gupta, Varun Jampani, Carlos Esteves, Abhinav Shrivastava, Ameesh Makadia, Noah Snavely, Abhishek Kar

    Abstract: We present a method for joint alignment of sparse in-the-wild image collections of an object category. Most prior works assume either ground-truth keypoint annotations or a large dataset of images of a single object category. However, neither of the above assumptions holds true for the long tail of the objects present in the world. We present a self-supervised technique that directly optimizes on a…

    Submitted 28 March, 2023; originally announced March 2023.

    Comments: Web: https://kampta.github.io/asic

  13. arXiv:2303.13515  [pdf, other]

    cs.CV cs.LG

    Persistent Nature: A Generative Model of Unbounded 3D Worlds

    Authors: Lucy Chai, Richard Tucker, Zhengqi Li, Phillip Isola, Noah Snavely

    Abstract: Despite increasingly realistic image quality, recent 3D image generative models often operate on 3D volumes of fixed extent with limited camera motions. We investigate the task of unconditionally synthesizing unbounded nature scenes, enabling arbitrarily large camera motion while maintaining a persistent 3D world model. Our scene representation consists of an extendable, planar scene layout grid,…

    Submitted 23 March, 2023; originally announced March 2023.

    Comments: CVPR camera ready version, project page: https://chail.github.io/persistent-nature/

  14. arXiv:2301.05211  [pdf, other]

    cs.CV cs.GR

    Accidental Light Probes

    Authors: Hong-Xing Yu, Samir Agarwala, Charles Herrmann, Richard Szeliski, Noah Snavely, Jiajun Wu, Deqing Sun

    Abstract: Recovering lighting in a scene from a single image is a fundamental problem in computer vision. While a mirror ball light probe can capture omnidirectional lighting, light probes are generally unavailable in everyday images. In this work, we study recovering lighting from accidental light probes (ALPs): common, shiny objects like Coke cans, which often accidentally appear in daily scenes. We pro…

    Submitted 10 June, 2023; v1 submitted 12 January, 2023; originally announced January 2023.

    Comments: CVPR2023. Project website: https://kovenyu.com/ALP/

  15. arXiv:2212.04965  [pdf, other]

    cs.CV

    Seeing a Rose in Five Thousand Ways

    Authors: Yunzhi Zhang, Shangzhe Wu, Noah Snavely, Jiajun Wu

    Abstract: What is a rose, visually? A rose comprises its intrinsics, including the distribution of geometry, texture, and material specific to its object category. With knowledge of these intrinsic properties, we may render roses of different sizes and shapes, in different poses, and under different lighting conditions. In this work, we build a generative model that learns to capture such object intrinsics…

    Submitted 20 May, 2024; v1 submitted 9 December, 2022; originally announced December 2022.

    Comments: CVPR 2023. Project page: https://cs.stanford.edu/~yzzhang/projects/rose/

  16. arXiv:2211.11082  [pdf, other]

    cs.CV

    DynIBaR: Neural Dynamic Image-Based Rendering

    Authors: Zhengqi Li, Qianqian Wang, Forrester Cole, Richard Tucker, Noah Snavely

    Abstract: We address the problem of synthesizing novel views from a monocular video depicting a complex dynamic scene. State-of-the-art methods based on temporally varying Neural Radiance Fields (aka dynamic NeRFs) have shown impressive results on this task. However, for long videos with complex object motions and uncontrolled camera trajectories, these methods can produce blurry or inaccurate renderings, h…

    Submitted 24 April, 2023; v1 submitted 20 November, 2022; originally announced November 2022.

    Comments: Award Candidate, CVPR 2023. Project page: dynibar.github.io

  17. arXiv:2211.02145  [pdf, other]

    cs.CV

    FactorMatte: Redefining Video Matting for Re-Composition Tasks

    Authors: Zeqi Gu, Wenqi Xian, Noah Snavely, Abe Davis

    Abstract: We propose "factor matting", an alternative formulation of the video matting problem in terms of counterfactual video synthesis that is better suited for re-composition tasks. The goal of factor matting is to separate the contents of video into independent components, each visualizing a counterfactual version of the scene where contents of other components have been removed. We show that factor ma…

    Submitted 3 November, 2022; originally announced November 2022.

    Comments: Project webpage: https://factormatte.github.io

  18. arXiv:2210.06642  [pdf, other]

    cs.CV cs.GR

    What's in a Decade? Transforming Faces Through Time

    Authors: Eric Ming Chen, Jin Sun, Apoorv Khandelwal, Dani Lischinski, Noah Snavely, Hadar Averbuch-Elor

    Abstract: How can one visually characterize people in a decade? In this work, we assemble the Faces Through Time dataset, which contains over a thousand portrait images from each decade, spanning the 1880s to the present day. Using our new dataset, we present a framework for resynthesizing portrait images across time, imagining how a portrait taken during a particular decade might have looked, had it b…

    Submitted 31 January, 2023; v1 submitted 12 October, 2022; originally announced October 2022.

    Comments: Project Page: https://facesthroughtime.github.io

  19. arXiv:2209.04061  [pdf, other]

    cs.CV

    im2nerf: Image to Neural Radiance Field in the Wild

    Authors: Lu Mi, Abhijit Kundu, David Ross, Frank Dellaert, Noah Snavely, Alireza Fathi

    Abstract: We propose im2nerf, a learning framework that predicts a continuous neural object representation given a single input image in the wild, supervised by only segmentation output from off-the-shelf recognition methods. The standard approach to constructing neural radiance fields takes advantage of multi-view consistency and requires many calibrated views of a scene, a requirement that cannot be satis…

    Submitted 8 September, 2022; originally announced September 2022.

    Comments: 12 pages, 8 figures, 4 tables

  20. arXiv:2207.11148  [pdf, other]

    cs.CV

    InfiniteNature-Zero: Learning Perpetual View Generation of Natural Scenes from Single Images

    Authors: Zhengqi Li, Qianqian Wang, Noah Snavely, Angjoo Kanazawa

    Abstract: We present a method for learning to generate unbounded flythrough videos of natural scenes starting from a single view, where this capability is learned from a collection of single photographs, without requiring camera poses or even multiple views of each scene. To achieve this, we propose a novel self-supervised view generation training paradigm, where we sample and render virtual camera traje…

    Submitted 22 July, 2022; originally announced July 2022.

    Comments: ECCV 2022 (Oral Presentation)

  21. arXiv:2206.06360  [pdf, other]

    cs.CV

    ARF: Artistic Radiance Fields

    Authors: Kai Zhang, Nick Kolkin, Sai Bi, Fujun Luan, Zexiang Xu, Eli Shechtman, Noah Snavely

    Abstract: We present a method for transferring the artistic features of an arbitrary style image to a 3D scene. Previous methods that perform 3D stylization on point clouds or meshes are sensitive to geometric reconstruction errors for complex real-world scenes. Instead, we propose to stylize the more robust radiance field representation. We find that the commonly used Gram matrix-based loss tends to produc…

    Submitted 13 June, 2022; originally announced June 2022.

    Comments: Project page: https://www.cs.cornell.edu/projects/arf/

  22. Neural 3D Reconstruction in the Wild

    Authors: Jiaming Sun, Xi Chen, Qianqian Wang, Zhengqi Li, Hadar Averbuch-Elor, Xiaowei Zhou, Noah Snavely

    Abstract: We are witnessing an explosion of neural implicit representations in computer vision and graphics. Their applicability has recently expanded beyond tasks such as shape generation and image-based rendering to the fundamental problem of image-based 3D reconstruction. However, existing methods typically assume constrained 3D environments with constant illumination captured by a small set of roughly u…

    Submitted 25 May, 2022; originally announced May 2022.

    Comments: Accepted to SIGGRAPH 2022 (Conference Proceedings). Project page: https://zju3dv.github.io/neuralrecon-w/

  23. arXiv:2205.06255  [pdf, other]

    cs.CV

    3D Moments from Near-Duplicate Photos

    Authors: Qianqian Wang, Zhengqi Li, David Salesin, Noah Snavely, Brian Curless, Janne Kontkanen

    Abstract: We introduce 3D Moments, a new computational photography effect. As input we take a pair of near-duplicate photos, i.e., photos of moving subjects from similar viewpoints, common in people's photo collections. As output, we produce a video that smoothly interpolates the scene motion from the first photo to the second, while also producing camera motion with parallax that gives a heightened sense o…

    Submitted 12 May, 2022; originally announced May 2022.

    Comments: CVPR 2022

  24. arXiv:2204.07151  [pdf, other]

    cs.CV

    Deformable Sprites for Unsupervised Video Decomposition

    Authors: Vickie Ye, Zhengqi Li, Richard Tucker, Angjoo Kanazawa, Noah Snavely

    Abstract: We describe a method to extract persistent elements of a dynamic scene from an input video. We represent each scene element as a Deformable Sprite consisting of three components: 1) a 2D texture image for the entire video, 2) per-frame masks for the element, and 3) non-rigid deformations that map the texture image into each video frame. The resulting decomposition allows for applications su…

    Submitted 14 April, 2022; originally announced April 2022.

    Comments: CVPR 2022 Oral. Project Site: https://deformable-sprites.github.io

  25. arXiv:2204.02232  [pdf, other]

    cs.CV

    IRON: Inverse Rendering by Optimizing Neural SDFs and Materials from Photometric Images

    Authors: Kai Zhang, Fujun Luan, Zhengqi Li, Noah Snavely

    Abstract: We propose a neural inverse rendering pipeline called IRON that operates on photometric images and outputs high-quality 3D content in the format of triangle meshes and material textures readily deployable in existing graphics pipelines. Our method adopts neural representations for geometry as signed distance fields (SDFs) and materials during optimization to enjoy their flexibility and compactness…

    Submitted 5 April, 2022; originally announced April 2022.

    Comments: CVPR 2022; Project page is: https://kai-46.github.io/IRON-website/

  26. arXiv:2203.08414  [pdf, other]

    cs.CV cs.AI cs.LG stat.ML

    Unsupervised Semantic Segmentation by Distilling Feature Correspondences

    Authors: Mark Hamilton, Zhoutong Zhang, Bharath Hariharan, Noah Snavely, William T. Freeman

    Abstract: Unsupervised semantic segmentation aims to discover and localize semantically meaningful categories within image corpora without any form of annotation. To solve this task, algorithms must produce features for every pixel that are both semantically meaningful and compact enough to form distinct clusters. Unlike previous works which achieve this with a single end-to-end framework, we propose to sep…

    Submitted 16 March, 2022; originally announced March 2022.

  27. arXiv:2112.01502  [pdf, other]

    cs.CV

    Dimensions of Motion: Monocular Prediction through Flow Subspaces

    Authors: Richard Strong Bowen, Richard Tucker, Ramin Zabih, Noah Snavely

    Abstract: We introduce a way to learn to estimate a scene representation from a single image by predicting a low-dimensional subspace of optical flow for each training example, which encompasses the variety of possible camera and object movement. Supervision is provided by a novel loss which measures the distance between this predicted flow subspace and an observed optical flow. This provides a new approach…

    Submitted 26 October, 2022; v1 submitted 2 December, 2021; originally announced December 2021.

    Comments: Project page at https://dimensions-of-motion.github.io/

  28. arXiv:2108.07253  [pdf, other]

    cs.CV cs.CL cs.LG

    Who's Waldo? Linking People Across Text and Images

    Authors: Claire Yuqing Cui, Apoorv Khandelwal, Yoav Artzi, Noah Snavely, Hadar Averbuch-Elor

    Abstract: We present a task and benchmark dataset for person-centric visual grounding, the problem of linking between people named in a caption and people pictured in an image. In contrast to prior work in visual grounding, which is predominantly object-based, our new task masks out the names of people in captions in order to encourage methods trained on such image-caption pairs to focus on contextual cues…

    Submitted 17 August, 2021; v1 submitted 16 August, 2021; originally announced August 2021.

    Comments: Published in ICCV 2021 (Oral). Project webpage: https://whoswaldo.github.io

  29. arXiv:2108.05863  [pdf, other]

    cs.CV

    Towers of Babel: Combining Images, Language, and 3D Geometry for Learning Multimodal Vision

    Authors: Xiaoshi Wu, Hadar Averbuch-Elor, Jin Sun, Noah Snavely

    Abstract: The abundance and richness of Internet photos of landmarks and cities has led to significant progress in 3D vision over the past two decades, including automated 3D reconstructions of the world's landmarks from tourist photos. However, a major source of information available for these 3D-augmented collections (namely language, e.g., from image captions) has been virtually untapped. In this work,…

    Submitted 12 August, 2021; originally announced August 2021.

    Comments: Published in ICCV 2021; Project webpage: https://www.cs.cornell.edu/projects/babel/

  30. arXiv:2106.03336  [pdf, other]

    cs.CV

    Wide-Baseline Relative Camera Pose Estimation with Directional Learning

    Authors: Kefan Chen, Noah Snavely, Ameesh Makadia

    Abstract: Modern deep learning techniques that regress the relative camera pose between two images have difficulty dealing with challenging scenarios, such as large camera motions resulting in occlusions and significant changes in perspective that leave little overlap between images. These models continue to struggle even with the benefit of large supervised training datasets. To address the limitations of…

    Submitted 7 June, 2021; originally announced June 2021.

  31. arXiv:2104.13530  [pdf, other]

    cs.CV

    Extreme Rotation Estimation using Dense Correlation Volumes

    Authors: Ruojin Cai, Bharath Hariharan, Noah Snavely, Hadar Averbuch-Elor

    Abstract: We present a technique for estimating the relative 3D rotation of an RGB image pair in an extreme setting, where the images have little or no overlap. We observe that, even when images do not overlap, there may be rich hidden cues as to their geometric relationship, such as light source directions, vanishing points, and symmetries present in the scene. We propose a network design that can automati…

    Submitted 19 July, 2021; v1 submitted 27 April, 2021; originally announced April 2021.

    Comments: Published in CVPR 2021; Project page: https://ruojincai.github.io/ExtremeRotation/

  32. arXiv:2104.11224  [pdf, other]

    cs.CV cs.GR

    KeypointDeformer: Unsupervised 3D Keypoint Discovery for Shape Control

    Authors: Tomas Jakab, Richard Tucker, Ameesh Makadia, Jiajun Wu, Noah Snavely, Angjoo Kanazawa

    Abstract: We introduce KeypointDeformer, a novel unsupervised method for shape control through automatically discovered 3D keypoints. We cast this as the problem of aligning a source 3D object to a target 3D object from the same object category. Our method analyzes the difference between the shapes of the two objects by comparing their latent representations. This latent representation is in the form of 3D…

    Submitted 22 April, 2021; originally announced April 2021.

    Comments: CVPR 2021 (oral). Project page: http://tomasjakab.github.io/KeypointDeformer

  33. arXiv:2104.03954  [pdf, other]

    cs.CV cs.GR

    De-rendering the World's Revolutionary Artefacts

    Authors: Shangzhe Wu, Ameesh Makadia, Jiajun Wu, Noah Snavely, Richard Tucker, Angjoo Kanazawa

    Abstract: Recent works have shown exciting results in unsupervised image de-rendering: learning to decompose 3D shape, appearance, and lighting from single-image collections without explicit supervision. However, many of these assume simplistic material and lighting models. We propose a method, termed RADAR, that can recover environment illumination and surface materials from real single-image collections…

    Submitted 31 August, 2021; v1 submitted 8 April, 2021; originally announced April 2021.

    Comments: CVPR 2021. Project page: https://sorderender.github.io/

  34. arXiv:2104.00674  [pdf, other]

    cs.CV cs.GR

    PhySG: Inverse Rendering with Spherical Gaussians for Physics-based Material Editing and Relighting

    Authors: Kai Zhang, Fujun Luan, Qianqian Wang, Kavita Bala, Noah Snavely

    Abstract: We present PhySG, an end-to-end inverse rendering pipeline that includes a fully differentiable renderer and can reconstruct geometry, materials, and illumination from scratch from a set of RGB input images. Our framework represents specular BRDFs and environmental illumination using mixtures of spherical Gaussians, and represents geometry as a signed distance function parameterized as a Multi-Lay…

    Submitted 1 April, 2021; originally announced April 2021.

    Comments: Accepted to CVPR 2021; Project page: https://kai-46.github.io/PhySG-website/

  35. arXiv:2103.16183  [pdf, other]

    cs.CV

    Repopulating Street Scenes

    Authors: Yifan Wang, Andrew Liu, Richard Tucker, Jiajun Wu, Brian L. Curless, Steven M. Seitz, Noah Snavely

    Abstract: We present a framework for automatically reconfiguring images of street scenes by populating, depopulating, or repopulating them with objects such as pedestrians or vehicles. Applications of this method include anonymizing images to enhance privacy, generating data augmentations for perception tasks like autonomous driving, and composing scenes to achieve a certain ambiance, such as empty streets…

    Submitted 30 March, 2021; originally announced March 2021.

    Comments: CVPR 2021

  36. arXiv:2102.13090  [pdf, other]

    cs.CV

    IBRNet: Learning Multi-View Image-Based Rendering

    Authors: Qianqian Wang, Zhicheng Wang, Kyle Genova, Pratul Srinivasan, Howard Zhou, Jonathan T. Barron, Ricardo Martin-Brualla, Noah Snavely, Thomas Funkhouser

    Abstract: We present a method that synthesizes novel views of complex scenes by interpolating a sparse set of nearby views. The core of our method is a network architecture that includes a multilayer perceptron and a ray transformer that estimates radiance and volume density at continuous 5D locations (3D spatial locations and 2D viewing directions), drawing appearance information on the fly from multiple s…

    Submitted 6 April, 2021; v1 submitted 25 February, 2021; originally announced February 2021.

    Comments: CVPR 2021. Project page: https://ibrnet.github.io/

  37. arXiv:2012.09855  [pdf, other]

    cs.CV cs.GR

    Infinite Nature: Perpetual View Generation of Natural Scenes from a Single Image

    Authors: Andrew Liu, Richard Tucker, Varun Jampani, Ameesh Makadia, Noah Snavely, Angjoo Kanazawa

    Abstract: We introduce the problem of perpetual view generation: long-range generation of novel views corresponding to an arbitrarily long camera trajectory given a single image. This is a challenging problem that goes far beyond the capabilities of current view synthesis methods, which quickly degenerate when presented with large camera motions. Methods for video generation also have limited ability to pr…

    Submitted 30 November, 2021; v1 submitted 17 December, 2020; originally announced December 2020.

    Comments: ICCV 2021 (oral); Project page: https://infinite-nature.github.io/; Video: https://www.youtube.com/watch?v=oXUf6anNAtc

  38. arXiv:2011.13583  [pdf, ps, other]

    cs.CY cs.CV cs.DB

    An Ethical Highlighter for People-Centric Dataset Creation

    Authors: Margot Hanley, Apoorv Khandelwal, Hadar Averbuch-Elor, Noah Snavely, Helen Nissenbaum

    Abstract: Important ethical concerns arising from computer vision datasets of people have been receiving significant attention, and a number of datasets have been withdrawn as a result. To meet the academic need for people-centric datasets, we propose an analytical framework to guide ethical evaluation of existing datasets and to serve future dataset creators in avoiding missteps. Our work is informed by a…

    Submitted 27 November, 2020; originally announced November 2020.

    Comments: Part of the Navigating the Broader Impacts of AI Research Workshop at NeurIPS 2020

  39. arXiv:2011.13084  [pdf, other]

    cs.CV

    Neural Scene Flow Fields for Space-Time View Synthesis of Dynamic Scenes

    Authors: Zhengqi Li, Simon Niklaus, Noah Snavely, Oliver Wang

    Abstract: We present a method to perform novel view and time synthesis of dynamic scenes, requiring only a monocular video with known camera poses as input. To do this, we introduce Neural Scene Flow Fields, a new representation that models the dynamic scene as a time-variant continuous function of appearance, geometry, and 3D scene motion. Our representation is optimized through a neural network to fit the…

    Submitted 20 April, 2021; v1 submitted 25 November, 2020; originally announced November 2020.

    Comments: CVPR 2021, Project Website: http://www.cs.cornell.edu/~zl548/NSFF/

  40. arXiv:2011.10007  [pdf, other]

    cs.CV cs.LG stat.ML

    Multi-Plane Program Induction with 3D Box Priors

    Authors: Yikai Li, Jiayuan Mao, Xiuming Zhang, William T. Freeman, Joshua B. Tenenbaum, Noah Snavely, Jiajun Wu

    Abstract: We consider two important aspects in understanding and editing images: modeling regular, program-like texture or patterns in 2D planes, and 3D posing of these planes in the scene. Unlike prior work on image-based program synthesis, which assumes the image contains a single visible 2D plane, we present Box Program Induction (BPI), which infers a program-like scene representation that simultaneously…

    Submitted 22 November, 2020; v1 submitted 19 November, 2020; originally announced November 2020.

    Comments: NeurIPS 2020. First two authors contributed equally. Project page: http://bpi.csail.mit.edu

  41. arXiv:2010.07492  [pdf, other]

    cs.CV

    NeRF++: Analyzing and Improving Neural Radiance Fields

    Authors: Kai Zhang, Gernot Riegler, Noah Snavely, Vladlen Koltun

    Abstract: Neural Radiance Fields (NeRF) achieve impressive view synthesis results for a variety of capture settings, including 360 capture of bounded scenes and forward-facing capture of bounded and unbounded scenes. NeRF fits multi-layer perceptrons (MLPs) representing view-invariant opacity and view-dependent color volumes to a set of training images, and samples novel views based on volume rendering tech…

    Submitted 21 October, 2020; v1 submitted 14 October, 2020; originally announced October 2020.

    Comments: Code is available at https://github.com/Kai-46/nerfplusplus; fix a minor formatting issue in Fig. 4

  42. arXiv:2008.08701  [pdf, other]

    cs.CV

    Hidden Footprints: Learning Contextual Walkability from 3D Human Trails

    Authors: Jin Sun, Hadar Averbuch-Elor, Qianqian Wang, Noah Snavely

    Abstract: Predicting where people can walk in a scene is important for many tasks, including autonomous driving systems and human behavior analysis. Yet learning a computational model for this purpose is challenging due to semantic ambiguity and a lack of labeled data: current datasets only tell you where people are, not where they could be. We tackle this problem by leveraging information from existing dat…

    Submitted 19 August, 2020; originally announced August 2020.

    Comments: European Conference on Computer Vision (ECCV) 2020

  43. arXiv:2008.06520  [pdf, other]

    cs.CV cs.LG

    Learning Gradient Fields for Shape Generation

    Authors: Ruojin Cai, Guandao Yang, Hadar Averbuch-Elor, Zekun Hao, Serge Belongie, Noah Snavely, Bharath Hariharan

    Abstract: In this work, we propose a novel technique to generate shapes from point cloud data. A point cloud can be viewed as samples from a distribution of 3D points whose density is concentrated near the surface of the shape. Point cloud generation thus amounts to moving randomly sampled points to high-density areas. We generate point clouds by performing stochastic gradient ascent on an unnormalized prob…

    Submitted 18 August, 2020; v1 submitted 14 August, 2020; originally announced August 2020.

    Comments: Published in ECCV 2020 (Spotlight); Project page: https://www.cs.cornell.edu/~ruojin/ShapeGF/

  44. arXiv:2008.02796  [pdf, other]

    cs.CV cs.GR

    Learning to Factorize and Relight a City

    Authors: Andrew Liu, Shiry Ginosar, Tinghui Zhou, Alexei A. Efros, Noah Snavely

    Abstract: We propose a learning-based framework for disentangling outdoor scenes into temporally-varying illumination and permanent scene factors. Inspired by the classic intrinsic image decomposition, our learning signal builds upon two insights: 1) combining the disentangled factors should reconstruct the original image, and 2) the permanent factors should stay constant across multiple temporal samples of…

    Submitted 6 August, 2020; originally announced August 2020.

    Comments: ECCV 2020 (Spotlight). Supplemental Material attached

  45. arXiv:2007.15194  [pdf, other]

    cs.CV

    Crowdsampling the Plenoptic Function

    Authors: Zhengqi Li, Wenqi Xian, Abe Davis, Noah Snavely

    Abstract: Many popular tourist landmarks are captured in a multitude of online, public photos. These photos represent a sparse and unstructured sampling of the plenoptic function for a particular scene. In this paper, we present a new approach to novel view synthesis under time-varying illumination from such data. Our approach builds on the recent multi-plane image (MPI) format for representing local light f…

    Submitted 29 July, 2020; originally announced July 2020.

    Comments: ECCV, 2020 (Oral)

  46. arXiv:2006.14616  [pdf, ps, other]

    cs.CV

    An Analysis of SVD for Deep Rotation Estimation

    Authors: Jake Levinson, Carlos Esteves, Kefan Chen, Noah Snavely, Angjoo Kanazawa, Afshin Rostamizadeh, Ameesh Makadia

    Abstract: Symmetric orthogonalization via SVD, and closely related procedures, are well-known techniques for projecting matrices onto $O(n)$ or $SO(n)$. These tools have long been used for applications in computer vision, for example optimal 3D alignment problems solved by orthogonal Procrustes, rotation averaging, or Essential matrix decomposition. Despite its utility in different settings, SVD orthogonali…

    Submitted 25 June, 2020; originally announced June 2020.

  47. arXiv:2006.09662  [pdf, other]

    cs.CV cs.GR cs.LG

    MetaSDF: Meta-learning Signed Distance Functions

    Authors: Vincent Sitzmann, Eric R. Chan, Richard Tucker, Noah Snavely, Gordon Wetzstein

    Abstract: Neural implicit shape representations are an emerging paradigm that offers many potential benefits over conventional discrete representations, including memory efficiency at a high spatial resolution. Generalizing across shapes with such neural implicit representations amounts to learning priors over the respective function space and enables geometry reconstruction from partial or noisy observatio…

    Submitted 17 June, 2020; originally announced June 2020.

    Comments: Project website: https://vsitzmann.github.io/metasdf/

  48. Visual Chirality

    Authors: Zhiqiu Lin, Jin Sun, Abe Davis, Noah Snavely

    Abstract: How can we tell whether an image has been mirrored? While we understand the geometry of mirror reflections very well, less has been said about how it affects distributions of imagery at scale, despite widespread use for data augmentation in computer vision. In this paper, we investigate how the statistics of visual data are changed by reflection. We refer to these changes as "visual chirality", af…

    Submitted 16 June, 2020; originally announced June 2020.

    Comments: Published at CVPR 2020, Best Paper Nomination, Oral Presentation. Project Page: https://linzhiqiu.github.io/papers/chirality/

    ACM Class: I.4

    Journal ref: CVPR (2020), 12292-12300

  49. arXiv:2004.13324  [pdf, other]

    cs.CV

    Learning Feature Descriptors using Camera Pose Supervision

    Authors: Qianqian Wang, Xiaowei Zhou, Bharath Hariharan, Noah Snavely

    Abstract: Recent research on learned visual descriptors has shown promising improvements in correspondence estimation, a key component of many 3D vision tasks. However, existing descriptor learning frameworks typically require ground-truth correspondences between feature points for training, which are challenging to acquire at scale. In this paper we propose a novel weakly-supervised framework that can lear…

    Submitted 29 January, 2024; v1 submitted 28 April, 2020; originally announced April 2020.

    Comments: ECCV 2020 (oral)

  50. arXiv:2004.11958  [pdf]

    cs.CV cs.AI cs.LG eess.IV

    The Plant Pathology 2020 challenge dataset to classify foliar disease of apples

    Authors: Ranjita Thapa, Noah Snavely, Serge Belongie, Awais Khan

    Abstract: Apple orchards in the U.S. are under constant threat from a large number of pathogens and insects. Appropriate and timely deployment of disease management depends on early disease detection. Incorrect and delayed diagnosis can result in either excessive or inadequate use of chemicals, with increased production costs, environmental, and health impacts. We have manually captured 3,651 high-quality,…

    Submitted 24 April, 2020; originally announced April 2020.

    Comments: 11 pages, 5 figures, Kaggle competition website: https://www.kaggle.com/c/plant-pathology-2020-fgvc7, CVPR fine-grained visual categorization website: https://sites.google.com/view/fgvc7/competitions

    ACM Class: I.2.1; I.2.10