Showing 1–50 of 158 results for author: Urtasun, R

Searching in archive cs.
  1. arXiv:2406.08691

    cs.CV cs.AI cs.LG cs.RO

    UnO: Unsupervised Occupancy Fields for Perception and Forecasting

    Authors: Ben Agro, Quinlan Sykora, Sergio Casas, Thomas Gilles, Raquel Urtasun

    Abstract: Perceiving the world and forecasting its future state is a critical task for self-driving. Supervised approaches leverage annotated object labels to learn a model of the world -- traditionally with object detections and trajectory predictions, or temporal bird's-eye-view (BEV) occupancy fields. However, these annotations are expensive and typically limited to a set of predefined categories that do…

    Submitted 12 June, 2024; originally announced June 2024.

  2. arXiv:2406.04426

    cs.CV cs.AI cs.LG cs.RO

    DeTra: A Unified Model for Object Detection and Trajectory Forecasting

    Authors: Sergio Casas, Ben Agro, Jiageng Mao, Thomas Gilles, Alexander Cui, Thomas Li, Raquel Urtasun

    Abstract: The tasks of object detection and trajectory forecasting play a crucial role in understanding the scene for autonomous driving. These tasks are typically executed in a cascading manner, making them prone to compounding errors. Furthermore, there is usually a very thin interface between the two tasks, creating a lossy information bottleneck. To address these challenges, our approach formulates the…

    Submitted 13 June, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

  3. arXiv:2404.01486

    cs.RO cs.AI cs.CV cs.LG

    QuAD: Query-based Interpretable Neural Motion Planning for Autonomous Driving

    Authors: Sourav Biswas, Sergio Casas, Quinlan Sykora, Ben Agro, Abbas Sadat, Raquel Urtasun

    Abstract: A self-driving vehicle must understand its environment to determine the appropriate action. Traditional autonomy systems rely on object detection to find the agents in the scene. However, object detection assumes a discrete set of objects and loses information about uncertainty, so any errors compound when predicting the future behavior of those agents. Alternatively, dense occupancy grid maps hav…

    Submitted 1 April, 2024; originally announced April 2024.

  4. arXiv:2312.06654

    cs.CV cs.LG

    LightSim: Neural Lighting Simulation for Urban Scenes

    Authors: Ava Pun, Gary Sun, Jingkang Wang, Yun Chen, Ze Yang, Sivabalan Manivasagam, Wei-Chiu Ma, Raquel Urtasun

    Abstract: Different outdoor illumination conditions drastically alter the appearance of urban scenes, and they can harm the performance of image-based robot perception systems if not seen during training. Camera simulation provides a cost-effective solution to create a large dataset of images captured under different lighting conditions. Towards this goal, we propose LightSim, a neural lighting camera simul…

    Submitted 11 December, 2023; originally announced December 2023.

    Comments: NeurIPS 2023. Project page: https://waabi.ai/lightsim/

  5. arXiv:2311.05607

    cs.CV cs.AI cs.GR

    Real-Time Neural Rasterization for Large Scenes

    Authors: Jeffrey Yunfan Liu, Yun Chen, Ze Yang, Jingkang Wang, Sivabalan Manivasagam, Raquel Urtasun

    Abstract: We propose a new method for realistic real-time novel-view synthesis (NVS) of large scenes. Existing neural rendering methods generate realistic results, but primarily work for small scale scenes (<50 square meters) and have difficulty at large scale (>10000 square meters). Traditional graphics-based rasterization rendering is fast for large scenes but lacks realism and requires expensive manually…

    Submitted 9 November, 2023; originally announced November 2023.

    Comments: Published in ICCV 2023. webpage: https://waabi.ai/NeuRas/

  6. arXiv:2311.05602

    cs.CV cs.RO

    Reconstructing Objects in-the-wild for Realistic Sensor Simulation

    Authors: Ze Yang, Sivabalan Manivasagam, Yun Chen, Jingkang Wang, Rui Hu, Raquel Urtasun

    Abstract: Reconstructing objects from real world data and rendering them at novel views is critical to bringing realism, diversity and scale to simulation for robotics training and testing. In this work, we present NeuSim, a novel approach that estimates accurate geometry and realistic appearance from sparse in-the-wild data captured at distance and at limited viewpoints. Towards this goal, we represent the…

    Submitted 9 November, 2023; originally announced November 2023.

    Comments: ICRA 2023. Project page: https://waabi.ai/neusim/

  7. arXiv:2311.02007

    cs.CV cs.RO

    Towards Unsupervised Object Detection From LiDAR Point Clouds

    Authors: Lunjun Zhang, Anqi Joyce Yang, Yuwen Xiong, Sergio Casas, Bin Yang, Mengye Ren, Raquel Urtasun

    Abstract: In this paper, we study the problem of unsupervised object detection from 3D point clouds in self-driving scenes. We present a simple yet effective method that exploits (i) point clustering in near-range areas where the point clouds are dense, (ii) temporal consistency to filter out noisy unsupervised detections, (iii) translation equivariance of CNNs to extend the auto-labels to long range, and (…

    Submitted 3 November, 2023; originally announced November 2023.

    Comments: CVPR 2023

  8. arXiv:2311.01556

    cs.CV cs.RO

    MemorySeg: Online LiDAR Semantic Segmentation with a Latent Memory

    Authors: Enxu Li, Sergio Casas, Raquel Urtasun

    Abstract: Semantic segmentation of LiDAR point clouds has been widely studied in recent years, with most existing methods focusing on tackling this task using a single scan of the environment. However, leveraging the temporal stream of observations can provide very rich contextual information on regions of the scene with poor visibility (e.g., occlusions) or sparse observations (e.g., at long range), and ca…

    Submitted 2 November, 2023; originally announced November 2023.

    Comments: accepted at ICCV 2023

  9. arXiv:2311.01520

    cs.CV cs.RO

    4D-Former: Multimodal 4D Panoptic Segmentation

    Authors: Ali Athar, Enxu Li, Sergio Casas, Raquel Urtasun

    Abstract: 4D panoptic segmentation is a challenging but practically useful task that requires every point in a LiDAR point-cloud sequence to be assigned a semantic class label, and individual objects to be segmented and tracked over time. Existing approaches utilize only LiDAR inputs which convey limited information in regions with point sparsity. This problem can, however, be mitigated by utilizing RGB cam…

    Submitted 17 November, 2023; v1 submitted 2 November, 2023; originally announced November 2023.

    Comments: accepted at CoRL 2023

  10. arXiv:2311.01448

    cs.CV cs.LG cs.RO

    UltraLiDAR: Learning Compact Representations for LiDAR Completion and Generation

    Authors: Yuwen Xiong, Wei-Chiu Ma, Jingkang Wang, Raquel Urtasun

    Abstract: LiDAR provides accurate geometric measurements of the 3D world. Unfortunately, dense LiDARs are very expensive and the point clouds captured by low-beam LiDAR are often sparse. To address these issues, we present UltraLiDAR, a data-driven framework for scene-level LiDAR completion, LiDAR generation, and LiDAR manipulation. The crux of UltraLiDAR is a compact, discrete representation that encodes t…

    Submitted 2 November, 2023; originally announced November 2023.

    Comments: CVPR 2023. Project page: https://waabi.ai/ultralidar/

  11. arXiv:2311.01447

    cs.CV cs.LG cs.RO

    CADSim: Robust and Scalable in-the-wild 3D Reconstruction for Controllable Sensor Simulation

    Authors: Jingkang Wang, Sivabalan Manivasagam, Yun Chen, Ze Yang, Ioan Andrei Bârsan, Anqi Joyce Yang, Wei-Chiu Ma, Raquel Urtasun

    Abstract: Realistic simulation is key to enabling safe and scalable development of self-driving vehicles. A core component is simulating the sensors so that the entire autonomy system can be tested in simulation. Sensor simulation involves modeling traffic participants, such as vehicles, with high quality appearance and articulated geometry, and rendering them in real time. The self-driving industry has t…

    Submitted 2 November, 2023; originally announced November 2023.

    Comments: CoRL 2022. Project page: https://waabi.ai/cadsim/

  12. arXiv:2311.01446

    cs.RO cs.CV cs.LG

    Adv3D: Generating Safety-Critical 3D Objects through Closed-Loop Simulation

    Authors: Jay Sarva, Jingkang Wang, James Tu, Yuwen Xiong, Sivabalan Manivasagam, Raquel Urtasun

    Abstract: Self-driving vehicles (SDVs) must be rigorously tested on a wide range of scenarios to ensure safe deployment. The industry typically relies on closed-loop simulation to evaluate how the SDV interacts on a corpus of synthetic and real scenarios and verify it performs properly. However, they primarily only test the system's motion planning module, and only consider behavior variations. It is key to…

    Submitted 2 November, 2023; originally announced November 2023.

    Comments: CoRL 2023. Project page: https://waabi.ai/adv3d/

  13. arXiv:2311.01444

    cs.CV cs.RO

    LabelFormer: Object Trajectory Refinement for Offboard Perception from LiDAR Point Clouds

    Authors: Anqi Joyce Yang, Sergio Casas, Nikita Dvornik, Sean Segal, Yuwen Xiong, Jordan Sir Kwang Hu, Carter Fang, Raquel Urtasun

    Abstract: A major bottleneck to scaling up training of self-driving perception systems is the human annotations required for supervision. A promising alternative is to leverage "auto-labelling" offboard perception models that are trained to automatically generate annotations from raw LiDAR point clouds at a fraction of the cost. Auto-labels are most commonly generated via a two-stage approach -- first obje…

    Submitted 2 November, 2023; originally announced November 2023.

    Comments: 20 pages, 8 figures, 7 tables

    Journal ref: CoRL 2023

  14. arXiv:2311.01394

    cs.RO cs.CV cs.LG

    Learning Realistic Traffic Agents in Closed-loop

    Authors: Chris Zhang, James Tu, Lunjun Zhang, Kelvin Wong, Simon Suo, Raquel Urtasun

    Abstract: Realistic traffic simulation is crucial for developing self-driving software in a safe and scalable manner prior to real-world deployment. Typically, imitation learning (IL) is used to learn human-like traffic agents directly from real-world observations collected offline, but without explicit specification of traffic rules, agents trained from IL alone frequently display unrealistic infractions l…

    Submitted 2 November, 2023; originally announced November 2023.

    Comments: CoRL 2023

  15. arXiv:2311.01017

    cs.CV cs.AI cs.LG cs.RO

    Copilot4D: Learning Unsupervised World Models for Autonomous Driving via Discrete Diffusion

    Authors: Lunjun Zhang, Yuwen Xiong, Ze Yang, Sergio Casas, Rui Hu, Raquel Urtasun

    Abstract: Learning world models can teach an agent how the world works in an unsupervised manner. Even though it can be viewed as a special case of sequence modeling, progress for scaling world models on robotic applications such as autonomous driving has been somewhat less rapid than scaling language models with Generative Pre-trained Transformers (GPT). We identify two reasons as major bottlenecks: dealin…

    Submitted 1 April, 2024; v1 submitted 2 November, 2023; originally announced November 2023.

    Comments: ICLR 2024

  16. arXiv:2308.01898

    cs.CV cs.RO

    UniSim: A Neural Closed-Loop Sensor Simulator

    Authors: Ze Yang, Yun Chen, Jingkang Wang, Sivabalan Manivasagam, Wei-Chiu Ma, Anqi Joyce Yang, Raquel Urtasun

    Abstract: Rigorously testing autonomy systems is essential for making safe self-driving vehicles (SDV) a reality. It requires one to generate safety critical scenarios beyond what can be collected safely in the world, as many scenarios happen rarely on public roads. To accurately evaluate performance, we need to test the SDV on these scenarios in closed-loop, where the SDV and other actors interact with eac…

    Submitted 3 August, 2023; originally announced August 2023.

    Comments: CVPR 2023 Highlight. Project page: https://waabi.ai/research/unisim/

  17. arXiv:2308.01471

    cs.CV cs.AI cs.LG cs.RO

    Implicit Occupancy Flow Fields for Perception and Prediction in Self-Driving

    Authors: Ben Agro, Quinlan Sykora, Sergio Casas, Raquel Urtasun

    Abstract: A self-driving vehicle (SDV) must be able to perceive its surroundings and predict the future behavior of other traffic participants. Existing works either perform object detection followed by trajectory forecasting of the detected objects, or predict dense occupancy and flow grids for the whole scene. The former poses a safety concern as the number of detections needs to be kept low for efficienc…

    Submitted 2 August, 2023; originally announced August 2023.

    Comments: 19 pages, 13 figures

    Journal ref: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023, pp. 1379-1388

  18. Rethinking Closed-loop Training for Autonomous Driving

    Authors: Chris Zhang, Runsheng Guo, Wenyuan Zeng, Yuwen Xiong, Binbin Dai, Rui Hu, Mengye Ren, Raquel Urtasun

    Abstract: Recent advances in high-fidelity simulators have enabled closed-loop training of autonomous driving agents, potentially solving the distribution shift in training vs. deployment and allowing training to be scaled both safely and cheaply. However, there is a lack of understanding of how to build effective training benchmarks for closed-loop training. In this work, we present the first empirical st…

    Submitted 27 June, 2023; originally announced June 2023.

    Comments: ECCV 2022

  19. arXiv:2211.02545

    cs.RO cs.AI cs.CV cs.LG cs.MA

    GoRela: Go Relative for Viewpoint-Invariant Motion Forecasting

    Authors: Alexander Cui, Sergio Casas, Kelvin Wong, Simon Suo, Raquel Urtasun

    Abstract: The task of motion forecasting is critical for self-driving vehicles (SDVs) to be able to plan a safe maneuver. Towards this goal, modern approaches reason about the map, the agents' past trajectories and their interactions in order to produce accurate forecasts. The predominant approach has been to encode the map and other agents in the reference frame of each target agent. However, this approach…

    Submitted 8 November, 2022; v1 submitted 4 November, 2022; originally announced November 2022.

  20. arXiv:2206.08365

    cs.CV cs.RO

    Virtual Correspondence: Humans as a Cue for Extreme-View Geometry

    Authors: Wei-Chiu Ma, Anqi Joyce Yang, Shenlong Wang, Raquel Urtasun, Antonio Torralba

    Abstract: Recovering the spatial layout of the cameras and the geometry of the scene from extreme-view images is a longstanding challenge in computer vision. Prevailing 3D reconstruction algorithms often adopt the image matching paradigm and presume that a portion of the scene is co-visible across images, yielding poor performance when there is little overlap among inputs. In contrast, humans can associate…

    Submitted 16 June, 2022; originally announced June 2022.

    Comments: CVPR 2022. Project page: https://people.csail.mit.edu/weichium/virtual-correspondence/

  21. arXiv:2106.13435

    cs.CV cs.LG

    NP-DRAW: A Non-Parametric Structured Latent Variable Model for Image Generation

    Authors: Xiaohui Zeng, Raquel Urtasun, Richard Zemel, Sanja Fidler, Renjie Liao

    Abstract: In this paper, we present a non-parametric structured latent variable model for image generation, called NP-DRAW, which sequentially draws on a latent canvas in a part-by-part fashion and then decodes the image from the canvas. Our key contributions are as follows. 1) We propose a non-parametric prior distribution over the appearance of image parts so that the latent variable "what-to-draw" per…

    Submitted 4 July, 2021; v1 submitted 25 June, 2021; originally announced June 2021.

    Comments: UAI 2021, code at https://github.com/ZENGXH/NPDRAW

  22. arXiv:2104.03956

    cs.CV cs.AI cs.LG cs.RO

    Just Label What You Need: Fine-Grained Active Selection for Perception and Prediction through Partially Labeled Scenes

    Authors: Sean Segal, Nishanth Kumar, Sergio Casas, Wenyuan Zeng, Mengye Ren, Jingkang Wang, Raquel Urtasun

    Abstract: Self-driving vehicles must perceive and predict the future positions of nearby actors in order to avoid collisions and drive safely. A learned deep learning module is often responsible for this task, requiring large-scale, high-quality training datasets. As data collection is often significantly cheaper than labeling in this domain, the decision of which subset of examples to label can have a prof…

    Submitted 8 April, 2021; originally announced April 2021.

  23. arXiv:2101.07907

    cs.RO cs.AI cs.CV cs.LG

    IntentNet: Learning to Predict Intention from Raw Sensor Data

    Authors: Sergio Casas, Wenjie Luo, Raquel Urtasun

    Abstract: In order to plan a safe maneuver, self-driving vehicles need to understand the intent of other traffic participants. We define intent as a combination of discrete high-level behaviors as well as continuous trajectories describing future motion. In this paper, we develop a one-stage detector and forecaster that exploits both 3D point clouds produced by a LiDAR sensor as well as dynamic maps of the…

    Submitted 19 January, 2021; originally announced January 2021.

    Comments: CoRL 2018

  24. arXiv:2101.07719

    cs.CV cs.RO

    Deep Feedback Inverse Problem Solver

    Authors: Wei-Chiu Ma, Shenlong Wang, Jiayuan Gu, Sivabalan Manivasagam, Antonio Torralba, Raquel Urtasun

    Abstract: We present an efficient, effective, and generic approach towards solving inverse problems. The key idea is to leverage the feedback signal provided by the forward process and learn an iterative update model. Specifically, at each iteration, the neural network takes the feedback as input and outputs an update on the current estimation. Our approach does not have any restrictions on the forward proc…

    Submitted 19 January, 2021; originally announced January 2021.

    Comments: ECCV 2020 Spotlight

  25. arXiv:2101.06865

    cs.CV

    Non-parametric Memory for Spatio-Temporal Segmentation of Construction Zones for Self-Driving

    Authors: Min Bai, Shenlong Wang, Kelvin Wong, Ersin Yumer, Raquel Urtasun

    Abstract: In this paper, we introduce a non-parametric memory representation for spatio-temporal segmentation that captures the local space and time around an autonomous vehicle (AV). Our representation has three important properties: (i) it remembers what it has seen in the past, (ii) it reinforces and (iii) forgets its past beliefs based on new evidence. Reinforcing is important as the first time we see a…

    Submitted 17 January, 2021; originally announced January 2021.

  26. arXiv:2101.06860

    cs.CV

    Mending Neural Implicit Modeling for 3D Vehicle Reconstruction in the Wild

    Authors: Shivam Duggal, Zihao Wang, Wei-Chiu Ma, Sivabalan Manivasagam, Justin Liang, Shenlong Wang, Raquel Urtasun

    Abstract: Reconstructing high-quality 3D objects from sparse, partial observations from a single view is of crucial importance for various applications in computer vision, robotics, and graphics. While recent neural implicit modeling methods show promising results on synthetic or dense data, they perform poorly on sparse and noisy real-world data. We discover that the limitations of a popular neural implici…

    Submitted 25 November, 2021; v1 submitted 17 January, 2021; originally announced January 2021.

  27. arXiv:2101.06832

    cs.CV cs.AI cs.LG cs.RO

    Deep Structured Reactive Planning

    Authors: Jerry Liu, Wenyuan Zeng, Raquel Urtasun, Ersin Yumer

    Abstract: An intelligent agent operating in the real-world must balance achieving its goal with maintaining the safety and comfort of not only itself, but also other participants within the surrounding scene. This requires jointly reasoning about the behavior of other actors while deciding its own actions as these two processes are inherently intertwined - a vehicle will yield to us if we decide to proceed…

    Submitted 29 April, 2021; v1 submitted 17 January, 2021; originally announced January 2021.

    Comments: ICRA 2021

  28. arXiv:2101.06806

    cs.RO cs.AI cs.CV cs.LG

    MP3: A Unified Model to Map, Perceive, Predict and Plan

    Authors: Sergio Casas, Abbas Sadat, Raquel Urtasun

    Abstract: High-definition maps (HD maps) are a key component of most modern self-driving systems due to their valuable semantic and geometric information. Unfortunately, building HD maps has proven hard to scale due to their cost as well as the requirements they impose in the localization system that has to work everywhere with centimeter-level accuracy. Being able to drive without an HD map would be very b…

    Submitted 17 January, 2021; originally announced January 2021.

  29. arXiv:2101.06784

    cs.CV cs.LG

    Exploring Adversarial Robustness of Multi-Sensor Perception Systems in Self Driving

    Authors: James Tu, Huichen Li, Xinchen Yan, Mengye Ren, Yun Chen, Ming Liang, Eilyan Bitar, Ersin Yumer, Raquel Urtasun

    Abstract: Modern self-driving perception systems have been shown to improve upon processing complementary inputs such as LiDAR with images. In isolation, 2D images have been found to be extremely vulnerable to adversarial attacks. Yet, there have been limited studies on the adversarial robustness of multi-modal models that fuse LiDAR features with image features. Furthermore, existing works do not consider…

    Submitted 7 January, 2022; v1 submitted 17 January, 2021; originally announced January 2021.

  30. arXiv:2101.06742

    cs.CV cs.AI cs.LG cs.RO stat.ML

    Deep Parametric Continuous Convolutional Neural Networks

    Authors: Shenlong Wang, Simon Suo, Wei-Chiu Ma, Andrei Pokrovsky, Raquel Urtasun

    Abstract: Standard convolutional neural networks assume a grid structured input is available and exploit discrete convolutions as their fundamental building blocks. This limits their applicability to many real-world applications. In this paper we propose Parametric Continuous Convolution, a new learnable operator that operates over non-grid structured data. The key idea is to exploit parameterized kernel fu…

    Submitted 17 January, 2021; originally announced January 2021.

    Comments: Accepted by CVPR 2018

  31. arXiv:2101.06720

    cs.CV cs.LG cs.RO

    Deep Multi-Task Learning for Joint Localization, Perception, and Prediction

    Authors: John Phillips, Julieta Martinez, Ioan Andrei Bârsan, Sergio Casas, Abbas Sadat, Raquel Urtasun

    Abstract: Over the last few years, we have witnessed tremendous progress on many subtasks of autonomous driving, including perception, motion forecasting, and motion planning. However, these systems often assume that the car is accurately localized against a high-definition map. In this paper we question this assumption, and investigate the issues that arise in state-of-the-art autonomy stacks under localiz…

    Submitted 10 April, 2021; v1 submitted 17 January, 2021; originally announced January 2021.

    Comments: CVPR 2021

  32. arXiv:2101.06679

    cs.CV cs.RO

    End-to-end Interpretable Neural Motion Planner

    Authors: Wenyuan Zeng, Wenjie Luo, Simon Suo, Abbas Sadat, Bin Yang, Sergio Casas, Raquel Urtasun

    Abstract: In this paper, we propose a neural motion planner (NMP) for learning to drive autonomously in complex urban scenarios that include traffic-light handling, yielding, and interactions with multiple road-users. Towards this goal, we design a holistic model that takes as input raw LIDAR data and a HD map and produces interpretable intermediate representations in the form of 3D detections and their fut…

    Submitted 17 January, 2021; originally announced January 2021.

    Comments: CVPR 2019 (Oral)

  33. arXiv:2101.06653

    cs.CV cs.RO

    LaneRCNN: Distributed Representations for Graph-Centric Motion Forecasting

    Authors: Wenyuan Zeng, Ming Liang, Renjie Liao, Raquel Urtasun

    Abstract: Forecasting the future behaviors of dynamic actors is an important task in many robotics applications such as self-driving. It is extremely challenging as actors have latent intentions and their trajectories are governed by complex interactions between the other actors, themselves, and the maps. In this paper, we propose LaneRCNN, a graph-centric motion forecasting model. Importantly, relying on a…

    Submitted 17 January, 2021; originally announced January 2021.

  34. arXiv:2101.06608

    cs.CV

    Network Automatic Pruning: Start NAP and Take a Nap

    Authors: Wenyuan Zeng, Yuwen Xiong, Raquel Urtasun

    Abstract: Network pruning can significantly reduce the computation and memory footprint of large neural networks. To achieve a good trade-off between model size and performance, popular pruning techniques usually rely on hand-crafted heuristics and require manually setting the compression ratio for each layer. This process is typically time-consuming and requires expert knowledge to achieve good results. In…

    Submitted 17 January, 2021; originally announced January 2021.

    Comments: An updated version of 'MLPrune: Multi-Layer Pruning for Automated Neural Network Compression'

  35. arXiv:2101.06594

    cs.CV

    PLUMENet: Efficient 3D Object Detection from Stereo Images

    Authors: Yan Wang, Bin Yang, Rui Hu, Ming Liang, Raquel Urtasun

    Abstract: 3D object detection is a key component of many robotic applications such as self-driving vehicles. While many approaches rely on expensive 3D sensors such as LiDAR to produce accurate 3D estimates, methods that exploit stereo cameras have recently shown promising results at a lower cost. Existing approaches tackle this problem in two steps: first depth estimation from stereo images is performed to…

    Submitted 31 July, 2021; v1 submitted 17 January, 2021; originally announced January 2021.

    Comments: IROS 2021

  36. arXiv:2101.06590

    cs.LG cs.AI cs.CV stat.ML

    Cost-Efficient Online Hyperparameter Optimization

    Authors: Jingkang Wang, Mengye Ren, Ilija Bogunovic, Yuwen Xiong, Raquel Urtasun

    Abstract: Recent work on hyperparameter optimization (HPO) has shown the possibility of training certain hyperparameters together with regular parameters. However, these online HPO algorithms still require running evaluation on a set of validation examples at each training step, steeply increasing the training cost. To decide when to query the validation loss, we model online HPO as a time-varying Bayesian…

    Submitted 16 January, 2021; originally announced January 2021.

  37. arXiv:2101.06586

    cs.CV

    Auto4D: Learning to Label 4D Objects from Sequential Point Clouds

    Authors: Bin Yang, Min Bai, Ming Liang, Wenyuan Zeng, Raquel Urtasun

    Abstract: In the past few years we have seen great advances in object perception (particularly in 4D space-time dimensions) thanks to deep learning methods. However, they typically rely on large amounts of high-quality labels to achieve good performance, which often require time-consuming and expensive work by human annotators. To address this we propose an automatic annotation pipeline that generates accur…

    Submitted 11 March, 2021; v1 submitted 16 January, 2021; originally announced January 2021.

  38. arXiv:2101.06571

    cs.CV

    S3: Neural Shape, Skeleton, and Skinning Fields for 3D Human Modeling

    Authors: Ze Yang, Shenlong Wang, Sivabalan Manivasagam, Zeng Huang, Wei-Chiu Ma, Xinchen Yan, Ersin Yumer, Raquel Urtasun

    Abstract: Constructing and animating humans is an important component for building virtual worlds in a wide variety of applications such as virtual reality or robotics testing in simulation. As there are exponentially many variations of humans with different shape, pose and clothing, it is critical to develop methods that can automatically reconstruct and animate humans at scale from real world data. Toward…

    Submitted 16 January, 2021; originally announced January 2021.

  39. arXiv:2101.06562

    cs.RO cs.CV

    Asynchronous Multi-View SLAM

    Authors: Anqi Joyce Yang, Can Cui, Ioan Andrei Bârsan, Raquel Urtasun, Shenlong Wang

    Abstract: Existing multi-camera SLAM systems assume synchronized shutters for all cameras, which is often not the case in practice. In this work, we propose a generalized multi-camera SLAM formulation which accounts for asynchronous sensor observations. Our framework integrates a continuous-time motion model to relate information across asynchronous multi-frames during tracking, local mapping, and loop clos…

    Submitted 14 July, 2021; v1 submitted 16 January, 2021; originally announced January 2021.

    Comments: 25 pages, 23 figures, 13 tables

    Journal ref: Published at ICRA 2021

  40. arXiv:2101.06560

    cs.LG cs.CR cs.CV

    Adversarial Attacks On Multi-Agent Communication

    Authors: James Tu, Tsunhsuan Wang, Jingkang Wang, Sivabalan Manivasagam, Mengye Ren, Raquel Urtasun

    Abstract: Growing at a fast pace, modern autonomous systems will soon be deployed at scale, opening up the possibility for cooperative multi-agent systems. Sharing information and distributing workloads allow autonomous agents to better perform tasks and increase computation efficiency. However, shared information can be modified to execute adversarial attacks on deep learning models that are widely employe…

    Submitted 12 October, 2021; v1 submitted 16 January, 2021; originally announced January 2021.

    Journal ref: International Conference On Computer Vision 2021

  41. arXiv:2101.06557

    cs.RO cs.AI cs.CV cs.LG

    TrafficSim: Learning to Simulate Realistic Multi-Agent Behaviors

    Authors: Simon Suo, Sebastian Regalado, Sergio Casas, Raquel Urtasun

    Abstract: Simulation has the potential to massively scale evaluation of self-driving systems enabling rapid development as well as safe deployment. To close the gap between simulation and the real world, we need to simulate realistic multi-agent behaviors. Existing simulation environments rely on heuristic-based models that directly encode traffic rules, which cannot capture irregular maneuvers (e.g., nudgi…

    Submitted 16 January, 2021; originally announced January 2021.

  42. arXiv:2101.06554

    cs.LG cs.RO

    Diverse Complexity Measures for Dataset Curation in Self-driving

    Authors: Abbas Sadat, Sean Segal, Sergio Casas, James Tu, Bin Yang, Raquel Urtasun, Ersin Yumer

    Abstract: Modern self-driving autonomy systems heavily rely on deep learning. As a consequence, their performance is influenced significantly by the quality and richness of the training data. Data collecting platforms can generate many hours of raw data on a daily basis; however, it is not feasible to label everything. It is thus of key importance to have a mechanism to identify "what to label". Active lear…

    Submitted 16 January, 2021; originally announced January 2021.

    Comments: 13 pages

  43. arXiv:2101.06553

    cs.CV cs.LG

    Self-Supervised Representation Learning from Flow Equivariance

    Authors: Yuwen Xiong, Mengye Ren, Wenyuan Zeng, Raquel Urtasun

    Abstract: Self-supervised representation learning is able to learn semantically meaningful features; however, much of its recent success relies on multiple crops of an image with very few objects. Instead of learning view-invariant representation from simple images, humans learn representations in a complex world with changing scenes by observing object movement, deformation, pose variation, and ego motion.…

    Submitted 12 October, 2021; v1 submitted 16 January, 2021; originally announced January 2021.

    Comments: ICCV 2021

  44. arXiv:2101.06549  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    AdvSim: Generating Safety-Critical Scenarios for Self-Driving Vehicles

    Authors: Jingkang Wang, Ava Pun, James Tu, Sivabalan Manivasagam, Abbas Sadat, Sergio Casas, Mengye Ren, Raquel Urtasun

    Abstract: As self-driving systems become better, simulating scenarios where the autonomy stack may fail becomes more important. Traditionally, those scenarios are generated for a few scenes with respect to the planning module that takes ground-truth actor states as input. This does not scale and cannot identify all possible autonomy failures, such as perception failures due to occlusion. In this paper, we p…

    Submitted 16 April, 2023; v1 submitted 16 January, 2021; originally announced January 2021.

    Comments: CVPR 2021. Corrected typos in the adversarial objective

  45. arXiv:2101.06547  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    LookOut: Diverse Multi-Future Prediction and Planning for Self-Driving

    Authors: Alexander Cui, Sergio Casas, Abbas Sadat, Renjie Liao, Raquel Urtasun

    Abstract: In this paper, we present LookOut, a novel autonomy system that perceives the environment, predicts a diverse set of futures of how the scene might unroll and estimates the trajectory of the SDV by optimizing a set of contingency plans over these future realizations. In particular, we learn a diverse joint distribution over multi-agent future trajectories in a traffic scene that covers a wide rang…

    Submitted 7 May, 2021; v1 submitted 16 January, 2021; originally announced January 2021.

  46. arXiv:2101.06545  [pdf, other]

    cs.CV

    VideoClick: Video Object Segmentation with a Single Click

    Authors: Namdar Homayounfar, Justin Liang, Wei-Chiu Ma, Raquel Urtasun

    Abstract: Annotating videos with object segmentation masks typically involves a two stage procedure of drawing polygons per object instance for all the frames and then linking them through time. While simple, this is a very tedious, time consuming and expensive process, making the creation of accurate annotations at scale only possible for well-funded labs. What if we were able to segment an object in the f…

    Submitted 16 January, 2021; originally announced January 2021.

  47. arXiv:2101.06543  [pdf, other]

    cs.CV cs.AI cs.GR cs.RO

    GeoSim: Realistic Video Simulation via Geometry-Aware Composition for Self-Driving

    Authors: Yun Chen, Frieda Rong, Shivam Duggal, Shenlong Wang, Xinchen Yan, Sivabalan Manivasagam, Shangjie Xue, Ersin Yumer, Raquel Urtasun

    Abstract: Scalable sensor simulation is an important yet challenging open problem for safety-critical domains such as self-driving. Current works in image simulation either fail to be photorealistic or do not model the 3D environment and the dynamic objects within, losing high-level control and physical realism. In this paper, we present GeoSim, a geometry-aware image composition process which synthesizes n…

    Submitted 16 May, 2021; v1 submitted 16 January, 2021; originally announced January 2021.

    Comments: Accepted by CVPR 2021 as Oral

  48. arXiv:2101.06541  [pdf, other]

    cs.CV cs.AI cs.LG cs.RO

    SceneGen: Learning to Generate Realistic Traffic Scenes

    Authors: Shuhan Tan, Kelvin Wong, Shenlong Wang, Sivabalan Manivasagam, Mengye Ren, Raquel Urtasun

    Abstract: We consider the problem of generating realistic traffic scenes automatically. Existing methods typically insert actors into the scene according to a set of hand-crafted heuristics and are limited in their ability to model the true complexity and diversity of real traffic scenes, thus inducing a content gap between synthesized traffic scenes versus real ones. As a result, existing simulators lack t…

    Submitted 16 January, 2021; originally announced January 2021.

  49. arXiv:2101.02385  [pdf, other]

    cs.CV cs.LG cs.RO

    Safety-Oriented Pedestrian Motion and Scene Occupancy Forecasting

    Authors: Katie Luo, Sergio Casas, Renjie Liao, Xinchen Yan, Yuwen Xiong, Wenyuan Zeng, Raquel Urtasun

    Abstract: In this paper, we address the important problem in self-driving of forecasting multi-pedestrian motion and their shared scene occupancy map, critical for safe navigation. Our contributions are two-fold. First, we advocate for predicting both the individual motions as well as the scene occupancy map in order to effectively deal with missing detections caused by postprocessing, e.g., confidence thre…

    Submitted 7 January, 2021; originally announced January 2021.

  50. Pit30M: A Benchmark for Global Localization in the Age of Self-Driving Cars

    Authors: Julieta Martinez, Sasha Doubov, Jack Fan, Ioan Andrei Bârsan, Shenlong Wang, Gellért Máttyus, Raquel Urtasun

    Abstract: We are interested in understanding whether retrieval-based localization approaches are good enough in the context of self-driving vehicles. Towards this goal, we introduce Pit30M, a new image and LiDAR dataset with over 30 million frames, which is 10 to 100 times larger than those used in previous work. Pit30M is captured under diverse conditions (i.e., season, weather, time of the day, traffic),…

    Submitted 30 April, 2024; v1 submitted 22 December, 2020; originally announced December 2020.

    Comments: Published at IROS 2020