Showing 1–27 of 27 results for author: Tyree, S

Searching in archive cs.
  1. arXiv:2310.00463

    cs.CV cs.RO

    Diff-DOPE: Differentiable Deep Object Pose Estimation

    Authors: Jonathan Tremblay, Bowen Wen, Valts Blukis, Balakumar Sundaralingam, Stephen Tyree, Stan Birchfield

    Abstract: We introduce Diff-DOPE, a 6-DoF pose refiner that takes as input an image, a 3D textured model of an object, and an initial pose of the object. The method uses differentiable rendering to update the object pose to minimize the visual error between the image and the projection of the model. We show that this simple, yet effective, idea is able to achieve state-of-the-art results on pose estimation…

    Submitted 30 September, 2023; originally announced October 2023.

    Comments: Submitted to ICRA 2024. Project page is at https://diffdope.github.io

  2. arXiv:2308.01477

    cs.RO cs.CV

    HANDAL: A Dataset of Real-World Manipulable Object Categories with Pose Annotations, Affordances, and Reconstructions

    Authors: Andrew Guo, Bowen Wen, Jianhe Yuan, Jonathan Tremblay, Stephen Tyree, Jeffrey Smith, Stan Birchfield

    Abstract: We present the HANDAL dataset for category-level object pose estimation and affordance prediction. Unlike previous datasets, ours is focused on robotics-ready manipulable objects that are of the proper size and shape for functional grasping by robot manipulators, such as pliers, utensils, and screwdrivers. Our annotation process is streamlined, requiring only a single off-the-shelf camera and semi…

    Submitted 2 August, 2023; originally announced August 2023.

    Comments: IROS 2023. Project page: https://nvlabs.github.io/HANDAL/

  3. arXiv:2303.14158

    cs.CV cs.AI cs.GR cs.RO

    BundleSDF: Neural 6-DoF Tracking and 3D Reconstruction of Unknown Objects

    Authors: Bowen Wen, Jonathan Tremblay, Valts Blukis, Stephen Tyree, Thomas Müller, Alex Evans, Dieter Fox, Jan Kautz, Stan Birchfield

    Abstract: We present a near real-time method for 6-DoF tracking of an unknown object from a monocular RGBD video sequence, while simultaneously performing neural 3D reconstruction of the object. Our method works for arbitrary rigid objects, even when visual texture is largely absent. The object is assumed to be segmented in the first frame only. No additional information is required, and no assumption is ma…

    Submitted 24 March, 2023; originally announced March 2023.

    Comments: CVPR 2023

  4. arXiv:2212.06870

    cs.CV cs.RO

    MegaPose: 6D Pose Estimation of Novel Objects via Render & Compare

    Authors: Yann Labbé, Lucas Manuelli, Arsalan Mousavian, Stephen Tyree, Stan Birchfield, Jonathan Tremblay, Justin Carpentier, Mathieu Aubry, Dieter Fox, Josef Sivic

    Abstract: We introduce MegaPose, a method to estimate the 6D pose of novel objects, that is, objects unseen during training. At inference time, the method only assumes knowledge of (i) a region of interest displaying the object in the image and (ii) a CAD model of the observed object. The contributions of this work are threefold. First, we present a 6D pose refiner based on a render&compare strategy which c…

    Submitted 13 December, 2022; originally announced December 2022.

    Comments: CoRL 2022

  5. arXiv:2210.11668

    cs.RO cs.CV

    RGB-Only Reconstruction of Tabletop Scenes for Collision-Free Manipulator Control

    Authors: Zhenggang Tang, Balakumar Sundaralingam, Jonathan Tremblay, Bowen Wen, Ye Yuan, Stephen Tyree, Charles Loop, Alexander Schwing, Stan Birchfield

    Abstract: We present a system for collision-free control of a robot manipulator that uses only RGB views of the world. Perceptual input of a tabletop scene is provided by multiple images of an RGB camera (without depth) that is either handheld or mounted on the robot end effector. A NeRF-like process is used to reconstruct the 3D geometry of the scene, from which the Euclidean full signed distance function…

    Submitted 10 March, 2023; v1 submitted 20 October, 2022; originally announced October 2022.

    Comments: ICRA 2023. Project page at https://ngp-mpc.github.io/

  6. arXiv:2210.10108

    cs.CV cs.RO

    Parallel Inversion of Neural Radiance Fields for Robust Pose Estimation

    Authors: Yunzhi Lin, Thomas Müller, Jonathan Tremblay, Bowen Wen, Stephen Tyree, Alex Evans, Patricio A. Vela, Stan Birchfield

    Abstract: We present a parallelized optimization method based on fast Neural Radiance Fields (NeRF) for estimating 6-DoF pose of a camera with respect to an object or scene. Given a single observed RGB image of the target, we can predict the translation and rotation of the camera by minimizing the residual between pixels rendered from a fast NeRF model and pixels in the observed image. We integrate a moment…

    Submitted 10 March, 2023; v1 submitted 18 October, 2022; originally announced October 2022.

    Comments: ICRA 2023. Project page at https://pnerfp.github.io/

  7. arXiv:2205.11047

    cs.CV cs.RO

    Keypoint-Based Category-Level Object Pose Tracking from an RGB Sequence with Uncertainty Estimation

    Authors: Yunzhi Lin, Jonathan Tremblay, Stephen Tyree, Patricio A. Vela, Stan Birchfield

    Abstract: We propose a single-stage, category-level 6-DoF pose estimation algorithm that simultaneously detects and tracks instances of objects within a known category. Our method takes as input the previous and current frame from a monocular RGB video, as well as predictions from the previous frame, to predict the bounding cuboid and 6-DoF pose (up to scale). Internally, a deep network predicts distributio…

    Submitted 23 May, 2022; originally announced May 2022.

    Comments: ICRA 2022. Project site is at https://sites.google.com/view/centerposetrack

  8. arXiv:2203.05701

    cs.RO cs.CV

    6-DoF Pose Estimation of Household Objects for Robotic Manipulation: An Accessible Dataset and Benchmark

    Authors: Stephen Tyree, Jonathan Tremblay, Thang To, Jia Cheng, Terry Mosier, Jeffrey Smith, Stan Birchfield

    Abstract: We present a new dataset for 6-DoF pose estimation of known objects, with a focus on robotic manipulation research. We propose a set of toy grocery objects, whose physical instantiations are readily available for purchase and are appropriately sized for robotic grasping and manipulation. We provide 3D scanned textured models of these objects, suitable for generating synthetic training data, as wel…

    Submitted 15 December, 2022; v1 submitted 10 March, 2022; originally announced March 2022.

    Comments: IROS 2022. Project page is at https://github.com/swtyree/hope-dataset

  9. arXiv:2109.06161

    cs.CV cs.RO

    Single-Stage Keypoint-Based Category-Level Object Pose Estimation from an RGB Image

    Authors: Yunzhi Lin, Jonathan Tremblay, Stephen Tyree, Patricio A. Vela, Stan Birchfield

    Abstract: Prior work on 6-DoF object pose estimation has largely focused on instance-level processing, in which a textured CAD model is available for each object being detected. Category-level 6-DoF pose estimation represents an important step toward developing robotic vision systems that operate in unstructured, real-world scenarios. In this work, we propose a single-stage, keypoint-based approach for cate…

    Submitted 12 May, 2022; v1 submitted 13 September, 2021; originally announced September 2021.

    Comments: ICRA 2022. Project page at https://sites.google.com/view/centerpose

  10. arXiv:2105.13962

    cs.CV cs.RO

    NViSII: A Scriptable Tool for Photorealistic Image Generation

    Authors: Nathan Morrical, Jonathan Tremblay, Yunzhi Lin, Stephen Tyree, Stan Birchfield, Valerio Pascucci, Ingo Wald

    Abstract: We present a Python-based renderer built on NVIDIA's OptiX ray tracing engine and the OptiX AI denoiser, designed to generate high-quality synthetic images for research in computer vision and deep learning. Our tool enables the description and manipulation of complex dynamic 3D scenes containing object meshes, materials, textures, lighting, volumetric data (e.g., smoke), and backgrounds. Metadata,…

    Submitted 28 May, 2021; originally announced May 2021.

    Comments: SDG Workshop at ICLR 2021. Project page is at https://github.com/owl-project/NVISII

  11. arXiv:2103.13539

    cs.RO

    Multi-View Fusion for Multi-Level Robotic Scene Understanding

    Authors: Yunzhi Lin, Jonathan Tremblay, Stephen Tyree, Patricio A. Vela, Stan Birchfield

    Abstract: We present a system for multi-level scene awareness for robotic manipulation. Given a sequence of camera-in-hand RGB images, the system calculates three types of information: 1) a point cloud representation of all the surfaces in the scene, for the purpose of obstacle avoidance; 2) the rough pose of unknown objects from categories corresponding to primitive shapes (e.g., cuboids and cylinders); an…

    Submitted 14 October, 2021; v1 submitted 24 March, 2021; originally announced March 2021.

    Comments: Presented at IROS 2021. Video is at https://youtu.be/FuqMxuODGlw

  12. arXiv:2008.11822

    cs.RO

    Indirect Object-to-Robot Pose Estimation from an External Monocular RGB Camera

    Authors: Jonathan Tremblay, Stephen Tyree, Terry Mosier, Stan Birchfield

    Abstract: We present a robotic grasping system that uses a single external monocular RGB camera as input. The object-to-robot pose is computed indirectly by combining the output of two neural networks: one that estimates the object-to-camera pose, and another that estimates the robot-to-camera pose. Both networks are trained entirely on synthetic data, relying on domain randomization to bridge the sim-to-re…

    Submitted 26 August, 2020; originally announced August 2020.

    Comments: IROS 2020. Video at https://youtu.be/E0J91llX-ys

  13. arXiv:2005.07695

    cs.RO

    How to Close Sim-Real Gap? Transfer with Segmentation!

    Authors: Mengyuan Yan, Qingyun Sun, Iuri Frosio, Stephen Tyree, Jan Kautz

    Abstract: One fundamental difficulty in robotic learning is the sim-real gap problem. In this work, we propose to use segmentation as the interface between perception and control, as a domain-invariant state representation. We identify two sources of the sim-real gap: one is the dynamics sim-real gap, the other is the visual sim-real gap. To close the dynamics sim-real gap, we propose to use closed-loop control. For comple…

    Submitted 14 May, 2020; originally announced May 2020.

    Comments: arXiv admin note: substantial text overlap with arXiv:1712.03303

  14. arXiv:1906.10771

    cs.LG cs.CV stat.ML

    Importance Estimation for Neural Network Pruning

    Authors: Pavlo Molchanov, Arun Mallya, Stephen Tyree, Iuri Frosio, Jan Kautz

    Abstract: Structural pruning of neural network parameters reduces computation, energy, and memory transfer costs during inference. We propose a novel method that estimates the contribution of a neuron (filter) to the final loss and iteratively removes those with smaller scores. We describe two variations of our method using the first and second-order Taylor expansions to approximate a filter's contribution.…

    Submitted 25 June, 2019; originally announced June 2019.

  15. arXiv:1903.08114

    cs.LG cs.DC stat.ML

    Exact Gaussian Processes on a Million Data Points

    Authors: Ke Alexander Wang, Geoff Pleiss, Jacob R. Gardner, Stephen Tyree, Kilian Q. Weinberger, Andrew Gordon Wilson

    Abstract: Gaussian processes (GPs) are flexible non-parametric models, with a capacity that grows with the available data. However, computational constraints with standard inference procedures have limited exact GPs to problems with fewer than about ten thousand training points, necessitating approximations for larger datasets. In this paper, we develop a scalable approach for exact GPs that leverages multi…

    Submitted 10 December, 2019; v1 submitted 19 March, 2019; originally announced March 2019.

    Comments: Published at NeurIPS 2019

  16. arXiv:1805.07054

    cs.RO

    Synthetically Trained Neural Networks for Learning Human-Readable Plans from Real-World Demonstrations

    Authors: Jonathan Tremblay, Thang To, Artem Molchanov, Stephen Tyree, Jan Kautz, Stan Birchfield

    Abstract: We present a system to infer and execute a human-readable program from a real-world demonstration. The system consists of a series of neural networks to perform perception, program generation, and program execution. Leveraging convolutional pose machines, the perception network reliably detects the bounding cuboids of objects in real images even when severely occluded, after training only on synth…

    Submitted 10 July, 2018; v1 submitted 18 May, 2018; originally announced May 2018.

    Comments: IEEE International Conference on Robotics and Automation (ICRA) 2018. For associated video, see https://youtu.be/B7ZT5oSnRys

  17. arXiv:1712.03303

    cs.RO

    Sim-to-Real Transfer of Accurate Grasping with Eye-In-Hand Observations and Continuous Control

    Authors: Mengyuan Yan, Iuri Frosio, Stephen Tyree, Jan Kautz

    Abstract: In the context of deep learning for robotics, we show an effective method of training a real robot to grasp a tiny sphere (1.37 cm in diameter), with an original combination of system design choices. We decompose the end-to-end system into a vision module and a closed-loop controller module. The two modules use target object segmentation as their common interface. The vision module extracts informatio…

    Submitted 19 December, 2017; v1 submitted 8 December, 2017; originally announced December 2017.

    Comments: Neural Information Processing Systems (NIPS) 2017 Workshop on Acting and Interacting in the Real World: Challenges in Robot Learning

  18. arXiv:1709.01591

    cs.CV

    Improving Landmark Localization with Semi-Supervised Learning

    Authors: Sina Honari, Pavlo Molchanov, Stephen Tyree, Pascal Vincent, Christopher Pal, Jan Kautz

    Abstract: We present two techniques to improve landmark localization in images from partially annotated datasets. Our primary goal is to leverage the common situation where precise landmark locations are only provided for a small data subset, but where class labels for classification or regression tasks related to the landmarks are more abundantly available. First, we propose the framework of sequential mul…

    Submitted 28 October, 2018; v1 submitted 5 September, 2017; originally announced September 2017.

    Comments: Published as a conference paper in CVPR 2018

  19. arXiv:1705.07162

    cs.CV

    A Lightweight Approach for On-the-Fly Reflectance Estimation

    Authors: Kihwan Kim, Jinwei Gu, Stephen Tyree, Pavlo Molchanov, Matthias Nießner, Jan Kautz

    Abstract: Estimating surface reflectance (BRDF) is one key component for complete 3D scene capture, with wide applications in virtual reality, augmented reality, and human computer interaction. Prior work is either limited to controlled environments (e.g., gonioreflectometers, light stages, or multi-camera domes), or requires the joint optimization of shape, illumination, and reflectance, which is often compu…

    Submitted 2 April, 2018; v1 submitted 19 May, 2017; originally announced May 2017.

    Comments: ICCV 2017

  20. arXiv:1611.06440

    cs.LG stat.ML

    Pruning Convolutional Neural Networks for Resource Efficient Inference

    Authors: Pavlo Molchanov, Stephen Tyree, Tero Karras, Timo Aila, Jan Kautz

    Abstract: We propose a new formulation for pruning convolutional kernels in neural networks to enable efficient inference. We interleave greedy criteria-based pruning with fine-tuning by backpropagation - a computationally efficient procedure that maintains good generalization in the pruned network. We propose a new criterion based on Taylor expansion that approximates the change in the cost function induce…

    Submitted 8 June, 2017; v1 submitted 19 November, 2016; originally announced November 2016.

    Comments: 17 pages, 14 figures, ICLR 2017 paper

  21. arXiv:1611.06256

    cs.LG

    Reinforcement Learning through Asynchronous Advantage Actor-Critic on a GPU

    Authors: Mohammad Babaeizadeh, Iuri Frosio, Stephen Tyree, Jason Clemons, Jan Kautz

    Abstract: We introduce a hybrid CPU/GPU version of the Asynchronous Advantage Actor-Critic (A3C) algorithm, currently the state-of-the-art method in reinforcement learning for various gaming tasks. We analyze its computational traits and concentrate on aspects critical to leveraging the GPU's computational power. We introduce a system of queues and a dynamic scheduling strategy, potentially helpful for othe…

    Submitted 2 March, 2017; v1 submitted 18 November, 2016; originally announced November 2016.

  22. arXiv:1506.04449

    cs.LG cs.CV cs.NE

    Compressing Convolutional Neural Networks

    Authors: Wenlin Chen, James T. Wilson, Stephen Tyree, Kilian Q. Weinberger, Yixin Chen

    Abstract: Convolutional neural networks (CNN) are increasingly used in many areas of computer vision. They are particularly attractive because of their ability to "absorb" great quantities of labeled data through millions of parameters. However, as model sizes increase, so do the storage and memory requirements of the classifiers. We present a novel network architecture, Frequency-Sensitive Hashed Nets (Fre…

    Submitted 14 June, 2015; originally announced June 2015.

  23. arXiv:1504.04788

    cs.LG cs.NE

    Compressing Neural Networks with the Hashing Trick

    Authors: Wenlin Chen, James T. Wilson, Stephen Tyree, Kilian Q. Weinberger, Yixin Chen

    Abstract: As deep nets are increasingly used in applications suited for mobile devices, a fundamental dilemma becomes apparent: the trend in deep learning is to grow models to absorb ever-increasing data set sizes; however mobile devices are designed with very little memory and cannot store such large models. We present a novel network architecture, HashedNets, that exploits inherent redundancy in neural ne…

    Submitted 19 April, 2015; originally announced April 2015.

  24. arXiv:1501.06478

    cs.LG

    Compressed Support Vector Machines

    Authors: Zhixiang Xu, Jacob R. Gardner, Stephen Tyree, Kilian Q. Weinberger

    Abstract: Support vector machines (SVM) can classify data sets along highly non-linear decision boundaries because of the kernel-trick. This expressiveness comes at a price: During test-time, the SVM classifier needs to compute the kernel inner-product between a test sample and all support vectors. With large training data sets, the time required for this computation can be substantial. In this paper, we in…

    Submitted 2 February, 2015; v1 submitted 26 January, 2015; originally announced January 2015.

  25. arXiv:1412.1740

    stat.ML cs.CV cs.LG

    Image Data Compression for Covariance and Histogram Descriptors

    Authors: Matt J. Kusner, Nicholas I. Kolkin, Stephen Tyree, Kilian Q. Weinberger

    Abstract: Covariance and histogram image descriptors provide an effective way to capture information about images. Both excel when used in combination with special purpose distance metrics. For covariance descriptors these metrics measure the distance along the non-Euclidean Riemannian manifold of symmetric positive definite matrices. For histogram descriptors the Earth Mover's distance measures the optimal…

    Submitted 23 May, 2015; v1 submitted 4 December, 2014; originally announced December 2014.

  26. arXiv:1404.1066

    cs.LG

    Parallel Support Vector Machines in Practice

    Authors: Stephen Tyree, Jacob R. Gardner, Kilian Q. Weinberger, Kunal Agrawal, John Tran

    Abstract: In this paper, we evaluate the performance of various parallel optimization methods for Kernel Support Vector Machines on multicore CPUs and GPUs. In particular, we provide the first comparison of algorithms with explicit and implicit parallelization. Most existing parallel implementations for multi-core or GPU architectures are based on explicit parallelization of Sequential Minimal Optimization…

    Submitted 3 April, 2014; originally announced April 2014.

    Comments: 10 pages

  27. arXiv:1402.7001

    cs.LG

    Marginalizing Corrupted Features

    Authors: Laurens van der Maaten, Minmin Chen, Stephen Tyree, Kilian Weinberger

    Abstract: The goal of machine learning is to develop predictors that generalize well to test data. Ideally, this is achieved by training on an almost infinitely large training data set that captures all variations in the data distribution. In practical learning settings, however, we do not have infinite data and our predictors may overfit. Overfitting may be combatted, for example, by adding a regularizer t…

    Submitted 27 February, 2014; originally announced February 2014.
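
Note on entry 12 (arXiv:2008.11822): the abstract says the object-to-robot pose is obtained indirectly by combining an object-to-camera pose with a robot-to-camera pose. The sketch below only illustrates that frame composition and is not the authors' implementation. It assumes both poses are 4x4 homogeneous rigid transforms that map points into the camera frame; the function names are hypothetical.

```python
# Illustrative sketch only: all names here are hypothetical, not from the paper.
# Poses are 4x4 homogeneous rigid transforms stored as nested lists.

def matmul(a, b):
    """Multiply two 4x4 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def invert_rigid(t):
    """Invert a rigid transform [R | p] as [R^T | -R^T p]."""
    r_t = [[t[j][i] for j in range(3)] for i in range(3)]  # transpose of R
    p = [t[i][3] for i in range(3)]
    new_p = [-sum(r_t[i][k] * p[k] for k in range(3)) for i in range(3)]
    return [r_t[i] + [new_p[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]

def object_in_robot_frame(cam_T_obj, cam_T_robot):
    """Compose the two camera-frame poses: robot_T_obj = inv(cam_T_robot) * cam_T_obj."""
    return matmul(invert_rigid(cam_T_robot), cam_T_obj)

# Assumed example: the robot base sits at (1, 0, 0) in the camera frame, rotated
# 90 degrees about z; the object sits at (1, 2, 0) in the camera frame.
cam_T_robot = [[0.0, -1.0, 0.0, 1.0],
               [1.0,  0.0, 0.0, 0.0],
               [0.0,  0.0, 1.0, 0.0],
               [0.0,  0.0, 0.0, 1.0]]
cam_T_obj = [[1.0, 0.0, 0.0, 1.0],
             [0.0, 1.0, 0.0, 2.0],
             [0.0, 0.0, 1.0, 0.0],
             [0.0, 0.0, 0.0, 1.0]]
robot_T_obj = object_in_robot_frame(cam_T_obj, cam_T_robot)
```

Because both inputs are expressed in the same camera frame, the camera cancels out of the product, which is why an external, uncalibrated-to-robot camera can still yield a pose usable for grasping under this convention.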