Showing 1–23 of 23 results for author: Choy, C

Searching in archive cs.
  1. arXiv:2403.09230  [pdf, other]

    cs.CV

    Improving Distant 3D Object Detection Using 2D Box Supervision

    Authors: Zetong Yang, Zhiding Yu, Chris Choy, Renhao Wang, Anima Anandkumar, Jose M. Alvarez

    Abstract: Improving the detection of distant 3d objects is an important yet challenging task. For camera-based 3D perception, the annotation of 3d bounding relies heavily on LiDAR for accurate depth information. As such, the distance of annotation is often limited due to the sparsity of LiDAR points on distant objects, which hampers the capability of existing detectors for long-range scenarios. We address t…

    Submitted 14 March, 2024; originally announced March 2024.

    Comments: Accepted by CVPR 2024

  2. arXiv:2309.00583  [pdf, other]

    cs.LG math.NA

    Geometry-Informed Neural Operator for Large-Scale 3D PDEs

    Authors: Zongyi Li, Nikola Borislavov Kovachki, Chris Choy, Boyi Li, Jean Kossaifi, Shourya Prakash Otta, Mohammad Amin Nabian, Maximilian Stadler, Christian Hundt, Kamyar Azizzadenesheli, Anima Anandkumar

    Abstract: We propose the geometry-informed neural operator (GINO), a highly efficient approach to learning the solution operator of large-scale partial differential equations with varying geometries. GINO uses a signed distance function and point-cloud representations of the input shape and neural operators based on graph and Fourier architectures to learn the solution operator. The graph neural operator ha…

    Submitted 1 September, 2023; originally announced September 2023.

  3. arXiv:2305.13220  [pdf, other]

    cs.CV

    Fast Monocular Scene Reconstruction with Global-Sparse Local-Dense Grids

    Authors: Wei Dong, Chris Choy, Charles Loop, Or Litany, Yuke Zhu, Anima Anandkumar

    Abstract: Indoor scene reconstruction from monocular images has long been sought after by augmented reality and robotics developers. Recent advances in neural field representations and monocular priors have led to remarkable results in scene-level surface reconstructions. The reliance on Multilayer Perceptrons (MLP), however, significantly limits speed in training and rendering. In this work, we propose to…

    Submitted 22 May, 2023; originally announced May 2023.

    Comments: CVPR 2023

  4. arXiv:2302.12251  [pdf, other]

    cs.CV cs.AI cs.LG cs.RO

    VoxFormer: Sparse Voxel Transformer for Camera-based 3D Semantic Scene Completion

    Authors: Yiming Li, Zhiding Yu, Christopher Choy, Chaowei Xiao, Jose M. Alvarez, Sanja Fidler, Chen Feng, Anima Anandkumar

    Abstract: Humans can easily imagine the complete 3D geometry of occluded objects and scenes. This appealing ability is vital for recognition and understanding. To enable such capability in AI systems, we propose VoxFormer, a Transformer-based semantic scene completion framework that can output complete 3D volumetric semantics from only 2D images. Our framework adopts a two-stage design where we start from a…

    Submitted 25 March, 2023; v1 submitted 23 February, 2023; originally announced February 2023.

    Comments: CVPR 2023 Highlight (10% of accepted papers, 2.5% of submissions)

  5. arXiv:2208.11537  [pdf, other]

    cs.CV

    PeRFception: Perception using Radiance Fields

    Authors: Yoonwoo Jeong, Seungjoo Shin, Junha Lee, Christopher Choy, Animashree Anandkumar, Minsu Cho, Jaesik Park

    Abstract: The recent progress in implicit 3D representation, i.e., Neural Radiance Fields (NeRFs), has made accurate and photorealistic 3D reconstruction possible in a differentiable manner. This new representation can effectively convey the information of hundreds of high-resolution images in one compact format and allows photorealistic synthesis of novel views. In this work, using the variant of NeRF call…

    Submitted 24 August, 2022; originally announced August 2022.

    Comments: Project Page: https://postech-cvlab.github.io/PeRFception/

  6. arXiv:2206.08077  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    Neural Scene Representation for Locomotion on Structured Terrain

    Authors: David Hoeller, Nikita Rudin, Christopher Choy, Animashree Anandkumar, Marco Hutter

    Abstract: We propose a learning-based method to reconstruct the local terrain for locomotion with a mobile robot traversing urban environments. Using a stream of depth measurements from the onboard cameras and the robot's trajectory, the algorithm estimates the topography in the robot's vicinity. The raw measurements from these cameras are noisy and only provide partial and occluded observations that in man…

    Submitted 16 June, 2022; originally announced June 2022.

  7. arXiv:2203.06856  [pdf, other]

    cs.CV cs.AI cs.RO

    ACID: Action-Conditional Implicit Visual Dynamics for Deformable Object Manipulation

    Authors: Bokui Shen, Zhenyu Jiang, Christopher Choy, Leonidas J. Guibas, Silvio Savarese, Anima Anandkumar, Yuke Zhu

    Abstract: Manipulating volumetric deformable objects in the real world, like plush toys and pizza dough, bring substantial challenges due to infinite shape variations, non-rigid motions, and partial observability. We introduce ACID, an action-conditional visual dynamics model for volumetric deformable objects based on structured implicit neural representations. ACID integrates two new techniques: implicit r…

    Submitted 5 August, 2022; v1 submitted 14 March, 2022; originally announced March 2022.

    Comments: RSS 2022 Best Student Paper Award Finalist. Please check out more details at https://b0ku1.github.io/acid/

    Journal ref: Robotics: Science and Systems (RSS), 2022

  8. arXiv:2112.01316  [pdf, other]

    cs.CV

    Putting 3D Spatially Sparse Networks on a Diet

    Authors: Junha Lee, Christopher Choy, Jaesik Park

    Abstract: 3D neural networks have become prevalent for many 3D vision tasks including object detection, segmentation, registration, and various perception tasks for 3D inputs. However, due to the sparsity and irregularity of 3D data, custom 3D operators or network designs have been the primary focus of research, while the size of networks or efficacy of parameters has been overlooked. In this work, we perfo…

    Submitted 8 April, 2022; v1 submitted 2 December, 2021; originally announced December 2021.

  9. arXiv:2108.13826  [pdf, other]

    cs.CV

    Self-Calibrating Neural Radiance Fields

    Authors: Yoonwoo Jeong, Seokjun Ahn, Christopher Choy, Animashree Anandkumar, Minsu Cho, Jaesik Park

    Abstract: In this work, we propose a camera self-calibration algorithm for generic cameras with arbitrary non-linear distortions. We jointly learn the geometry of the scene and the accurate camera parameters without any calibration objects. Our camera model consists of a pinhole model, a fourth order radial distortion, and a generic noise model that can learn arbitrary non-linear camera distortions. While t…

    Submitted 2 September, 2021; v1 submitted 31 August, 2021; originally announced August 2021.

    Comments: Accepted in ICCV21, Project Page: https://postech-cvlab.github.io/SCNeRF/

  10. arXiv:2105.06464  [pdf, other]

    cs.CV cs.LG

    DiscoBox: Weakly Supervised Instance Segmentation and Semantic Correspondence from Box Supervision

    Authors: Shiyi Lan, Zhiding Yu, Christopher Choy, Subhashree Radhakrishnan, Guilin Liu, Yuke Zhu, Larry S. Davis, Anima Anandkumar

    Abstract: We introduce DiscoBox, a novel framework that jointly learns instance segmentation and semantic correspondence using bounding box supervision. Specifically, we propose a self-ensembling framework where instance segmentation and semantic correspondence are jointly guided by a structured teacher in addition to the bounding box supervision. The teacher is a structured energy model incorporating a pai…

    Submitted 5 June, 2021; v1 submitted 13 May, 2021; originally announced May 2021.

    Comments: Tech Report

  11. arXiv:2006.12356  [pdf, other]

    cs.CV

    Generative Sparse Detection Networks for 3D Single-shot Object Detection

    Authors: JunYoung Gwak, Christopher Choy, Silvio Savarese

    Abstract: 3D object detection has been widely studied due to its potential applicability to many promising areas such as robotics and augmented reality. Yet, the sparse nature of the 3D data poses unique challenges to this task. Most notably, the observable surface of the 3D point clouds is disjoint from the center of the instance to ground the bounding box prediction on. To this end, we propose Generative…

    Submitted 22 June, 2020; originally announced June 2020.

  12. arXiv:2005.08144  [pdf, other]

    cs.CV cs.LG stat.ML

    High-dimensional Convolutional Networks for Geometric Pattern Recognition

    Authors: Christopher Choy, Junha Lee, Rene Ranftl, Jaesik Park, Vladlen Koltun

    Abstract: Many problems in science and engineering can be formulated in terms of geometric patterns in high-dimensional spaces. We present high-dimensional convolutional networks (ConvNets) for pattern recognition problems that arise in the context of geometric registration. We first study the effectiveness of convolutional networks in detecting linear subspaces in high-dimensional spaces with up to 32 dime…

    Submitted 16 May, 2020; originally announced May 2020.

    Comments: Accepted for CVPR 2020 oral presentation

  13. arXiv:2004.11540  [pdf, other]

    cs.CV cs.CG cs.LG eess.IV

    Deep Global Registration

    Authors: Christopher Choy, Wei Dong, Vladlen Koltun

    Abstract: We present Deep Global Registration, a differentiable framework for pairwise registration of real-world 3D scans. Deep global registration is based on three modules: a 6-dimensional convolutional network for correspondence confidence prediction, a differentiable Weighted Procrustes algorithm for closed-form pose estimation, and a robust gradient-based SE(3) optimizer for pose refinement. Experimen…

    Submitted 8 May, 2020; v1 submitted 24 April, 2020; originally announced April 2020.

    Comments: Accepted for CVPR'20 oral presentation

  14. arXiv:2003.12622  [pdf, other]

    cs.CV

    SceneCAD: Predicting Object Alignments and Layouts in RGB-D Scans

    Authors: Armen Avetisyan, Tatiana Khanova, Christopher Choy, Denver Dash, Angela Dai, Matthias Nießner

    Abstract: We present a novel approach to reconstructing lightweight, CAD-based representations of scanned 3D environments from commodity RGB-D sensors. Our key idea is to jointly optimize for both CAD model alignments as well as layout estimations of the scanned scene, explicitly modeling inter-relationships between objects-to-objects and objects-to-layout. Since object arrangement and scene layout are intr…

    Submitted 27 March, 2020; originally announced March 2020.

    Comments: Video here https://youtu.be/F0DpggYByh0

  15. arXiv:1904.08755  [pdf, other]

    cs.CV cs.AI

    4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks

    Authors: Christopher Choy, JunYoung Gwak, Silvio Savarese

    Abstract: In many robotics and VR/AR applications, 3D-videos are readily-available sources of input (a continuous sequence of depth images, or LIDAR scans). However, those 3D-videos are processed frame-by-frame either through 2D convnets or 3D perception algorithms. In this work, we propose 4-dimensional convolutional neural networks for spatio-temporal perception that can directly process such 3D-videos us…

    Submitted 13 June, 2019; v1 submitted 18 April, 2019; originally announced April 2019.

    Comments: CVPR'19

  16. arXiv:1803.08495  [pdf, other]

    cs.CV cs.AI cs.GR cs.LG

    Text2Shape: Generating Shapes from Natural Language by Learning Joint Embeddings

    Authors: Kevin Chen, Christopher B. Choy, Manolis Savva, Angel X. Chang, Thomas Funkhouser, Silvio Savarese

    Abstract: We present a method for generating colored 3D shapes from natural language. To this end, we first learn joint embeddings of freeform text descriptions and colored 3D shapes. Our model combines and extends learning by association and metric learning approaches to learn implicit cross-modal connections, and produces a joint representation that captures the many-to-many relations between language and…

    Submitted 22 March, 2018; originally announced March 2018.

  17. arXiv:1710.07563  [pdf, other]

    cs.CV

    SEGCloud: Semantic Segmentation of 3D Point Clouds

    Authors: Lyne P. Tchapmi, Christopher B. Choy, Iro Armeni, JunYoung Gwak, Silvio Savarese

    Abstract: 3D semantic scene labeling is fundamental to agents operating in the real world. In particular, labeling raw 3D point sets from sensors provides fine-grained semantics. Recent works leverage the capabilities of Neural Networks (NNs), but are limited to coarse voxel predictions and do not explicitly enforce global consistency. We present SEGCloud, an end-to-end framework to obtain 3D point-level se…

    Submitted 20 October, 2017; originally announced October 2017.

    Comments: Accepted as a spotlight at the International Conference of 3D Vision (3DV 2017)

  18. arXiv:1708.04672  [pdf, other]

    cs.CV cs.GR

    DeformNet: Free-Form Deformation Network for 3D Shape Reconstruction from a Single Image

    Authors: Andrey Kurenkov, Jingwei Ji, Animesh Garg, Viraj Mehta, JunYoung Gwak, Christopher Choy, Silvio Savarese

    Abstract: 3D reconstruction from a single image is a key problem in multiple applications ranging from robotic manipulation to augmented reality. Prior methods have tackled this problem through generative models which predict 3D reconstructions as voxels or point clouds. However, these methods can be computationally expensive and miss fine details. We introduce a new differentiable layer for 3D data deforma…

    Submitted 10 August, 2017; originally announced August 2017.

    Comments: 11 pages, 9 figures, NIPS

  19. arXiv:1705.10904  [pdf, other]

    cs.CV

    Weakly supervised 3D Reconstruction with Adversarial Constraint

    Authors: JunYoung Gwak, Christopher B. Choy, Animesh Garg, Manmohan Chandraker, Silvio Savarese

    Abstract: Supervised 3D reconstruction has witnessed a significant progress through the use of deep neural networks. However, this increase in performance requires large scale annotations of 2D/3D data. In this paper, we explore inexpensive 2D supervision as an alternative for expensive 3D CAD annotation. Specifically, we use foreground masks as weak supervision through a raytrace pooling layer that enables…

    Submitted 4 October, 2017; v1 submitted 30 May, 2017; originally announced May 2017.

  20. arXiv:1704.04394  [pdf, other]

    cs.CV

    DESIRE: Distant Future Prediction in Dynamic Scenes with Interacting Agents

    Authors: Namhoon Lee, Wongun Choi, Paul Vernaza, Christopher B. Choy, Philip H. S. Torr, Manmohan Chandraker

    Abstract: We introduce a Deep Stochastic IOC RNN Encoderdecoder framework, DESIRE, for the task of future predictions of multiple interacting agents in dynamic scenes. DESIRE effectively predicts future locations of objects in multiple scenes by 1) accounting for the multi-modal nature of the future prediction (i.e., given the same context, future may vary), 2) foreseeing the potential future outcomes and m…

    Submitted 14 April, 2017; originally announced April 2017.

    Comments: Accepted at CVPR 2017

  21. arXiv:1701.02426  [pdf, other]

    cs.CV

    Scene Graph Generation by Iterative Message Passing

    Authors: Danfei Xu, Yuke Zhu, Christopher B. Choy, Li Fei-Fei

    Abstract: Understanding a visual scene goes beyond recognizing individual objects in isolation. Relationships between objects also constitute rich semantic information about the scene. In this work, we explicitly model the objects and their relationships using scene graphs, a visually-grounded graphical structure of an image. We propose a novel end-to-end model that generates such structured scene represent…

    Submitted 12 April, 2017; v1 submitted 9 January, 2017; originally announced January 2017.

    Comments: CVPR 2017

  22. arXiv:1606.03558  [pdf, other]

    cs.CV

    Universal Correspondence Network

    Authors: Christopher B. Choy, JunYoung Gwak, Silvio Savarese, Manmohan Chandraker

    Abstract: We present a deep learning framework for accurate visual correspondences and demonstrate its effectiveness for both geometric and semantic matching, spanning across rigid motions to intra-class shape or appearance variations. In contrast to previous CNN-based approaches that optimize a surrogate patch similarity objective, we use deep metric learning to directly learn a feature space that preserve…

    Submitted 31 October, 2016; v1 submitted 11 June, 2016; originally announced June 2016.

    Comments: To appear at NIPS 2016 as full oral presentation

  23. arXiv:1604.00449  [pdf, other]

    cs.CV cs.AI

    3D-R2N2: A Unified Approach for Single and Multi-view 3D Object Reconstruction

    Authors: Christopher B. Choy, Danfei Xu, JunYoung Gwak, Kevin Chen, Silvio Savarese

    Abstract: Inspired by the recent success of methods that employ shape priors to achieve robust 3D reconstructions, we propose a novel recurrent neural network architecture that we call the 3D Recurrent Reconstruction Neural Network (3D-R2N2). The network learns a mapping from images of objects to their underlying 3D shapes from a large collection of synthetic data. Our network takes in one or more images of…

    Submitted 1 April, 2016; originally announced April 2016.

    Comments: Appendix can be found at http://cvgl.stanford.edu/papers/choy_16_appendix.pdf