Showing 1–28 of 28 results for author: Casas, S

  1. arXiv:2406.08691

    cs.CV cs.AI cs.LG cs.RO

    UnO: Unsupervised Occupancy Fields for Perception and Forecasting

    Authors: Ben Agro, Quinlan Sykora, Sergio Casas, Thomas Gilles, Raquel Urtasun

    Abstract: Perceiving the world and forecasting its future state is a critical task for self-driving. Supervised approaches leverage annotated object labels to learn a model of the world -- traditionally with object detections and trajectory predictions, or temporal bird's-eye-view (BEV) occupancy fields. However, these annotations are expensive and typically limited to a set of predefined categories that do…

    Submitted 12 June, 2024; originally announced June 2024.

  2. arXiv:2406.04426

    cs.CV cs.AI cs.LG cs.RO

    DeTra: A Unified Model for Object Detection and Trajectory Forecasting

    Authors: Sergio Casas, Ben Agro, Jiageng Mao, Thomas Gilles, Alexander Cui, Thomas Li, Raquel Urtasun

    Abstract: The tasks of object detection and trajectory forecasting play a crucial role in understanding the scene for autonomous driving. These tasks are typically executed in a cascading manner, making them prone to compounding errors. Furthermore, there is usually a very thin interface between the two tasks, creating a lossy information bottleneck. To address these challenges, our approach formulates the…

    Submitted 13 June, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

  3. arXiv:2404.01486

    cs.RO cs.AI cs.CV cs.LG

    QuAD: Query-based Interpretable Neural Motion Planning for Autonomous Driving

    Authors: Sourav Biswas, Sergio Casas, Quinlan Sykora, Ben Agro, Abbas Sadat, Raquel Urtasun

    Abstract: A self-driving vehicle must understand its environment to determine the appropriate action. Traditional autonomy systems rely on object detection to find the agents in the scene. However, object detection assumes a discrete set of objects and loses information about uncertainty, so any errors compound when predicting the future behavior of those agents. Alternatively, dense occupancy grid maps hav…

    Submitted 1 April, 2024; originally announced April 2024.

  4. arXiv:2311.02007

    cs.CV cs.RO

    Towards Unsupervised Object Detection From LiDAR Point Clouds

    Authors: Lunjun Zhang, Anqi Joyce Yang, Yuwen Xiong, Sergio Casas, Bin Yang, Mengye Ren, Raquel Urtasun

    Abstract: In this paper, we study the problem of unsupervised object detection from 3D point clouds in self-driving scenes. We present a simple yet effective method that exploits (i) point clustering in near-range areas where the point clouds are dense, (ii) temporal consistency to filter out noisy unsupervised detections, (iii) translation equivariance of CNNs to extend the auto-labels to long range, and (…

    Submitted 3 November, 2023; originally announced November 2023.

    Comments: CVPR 2023

  5. arXiv:2311.01556

    cs.CV cs.RO

    MemorySeg: Online LiDAR Semantic Segmentation with a Latent Memory

    Authors: Enxu Li, Sergio Casas, Raquel Urtasun

    Abstract: Semantic segmentation of LiDAR point clouds has been widely studied in recent years, with most existing methods focusing on tackling this task using a single scan of the environment. However, leveraging the temporal stream of observations can provide very rich contextual information on regions of the scene with poor visibility (e.g., occlusions) or sparse observations (e.g., at long range), and ca…

    Submitted 2 November, 2023; originally announced November 2023.

    Comments: accepted at ICCV 2023

  6. arXiv:2311.01520

    cs.CV cs.RO

    4D-Former: Multimodal 4D Panoptic Segmentation

    Authors: Ali Athar, Enxu Li, Sergio Casas, Raquel Urtasun

    Abstract: 4D panoptic segmentation is a challenging but practically useful task that requires every point in a LiDAR point-cloud sequence to be assigned a semantic class label, and individual objects to be segmented and tracked over time. Existing approaches utilize only LiDAR inputs which convey limited information in regions with point sparsity. This problem can, however, be mitigated by utilizing RGB cam…

    Submitted 17 November, 2023; v1 submitted 2 November, 2023; originally announced November 2023.

    Comments: accepted at CoRL 2023

  7. arXiv:2311.01444

    cs.CV cs.RO

    LabelFormer: Object Trajectory Refinement for Offboard Perception from LiDAR Point Clouds

    Authors: Anqi Joyce Yang, Sergio Casas, Nikita Dvornik, Sean Segal, Yuwen Xiong, Jordan Sir Kwang Hu, Carter Fang, Raquel Urtasun

    Abstract: A major bottleneck to scaling-up training of self-driving perception systems are the human annotations required for supervision. A promising alternative is to leverage "auto-labelling" offboard perception models that are trained to automatically generate annotations from raw LiDAR point clouds at a fraction of the cost. Auto-labels are most commonly generated via a two-stage approach -- first obje…

    Submitted 2 November, 2023; originally announced November 2023.

    Comments: 20 pages, 8 figures, 7 tables

    Journal ref: CoRL 2023

  8. arXiv:2311.01017

    cs.CV cs.AI cs.LG cs.RO

    Copilot4D: Learning Unsupervised World Models for Autonomous Driving via Discrete Diffusion

    Authors: Lunjun Zhang, Yuwen Xiong, Ze Yang, Sergio Casas, Rui Hu, Raquel Urtasun

    Abstract: Learning world models can teach an agent how the world works in an unsupervised manner. Even though it can be viewed as a special case of sequence modeling, progress for scaling world models on robotic applications such as autonomous driving has been somewhat less rapid than scaling language models with Generative Pre-trained Transformers (GPT). We identify two reasons as major bottlenecks: dealin…

    Submitted 1 April, 2024; v1 submitted 2 November, 2023; originally announced November 2023.

    Comments: ICLR 2024

  9. arXiv:2308.01471

    cs.CV cs.AI cs.LG cs.RO

    Implicit Occupancy Flow Fields for Perception and Prediction in Self-Driving

    Authors: Ben Agro, Quinlan Sykora, Sergio Casas, Raquel Urtasun

    Abstract: A self-driving vehicle (SDV) must be able to perceive its surroundings and predict the future behavior of other traffic participants. Existing works either perform object detection followed by trajectory forecasting of the detected objects, or predict dense occupancy and flow grids for the whole scene. The former poses a safety concern as the number of detections needs to be kept low for efficienc…

    Submitted 2 August, 2023; originally announced August 2023.

    Comments: 19 pages, 13 figures

    Journal ref: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023, pp. 1379-1388

  10. arXiv:2211.02545

    cs.RO cs.AI cs.CV cs.LG cs.MA

    GoRela: Go Relative for Viewpoint-Invariant Motion Forecasting

    Authors: Alexander Cui, Sergio Casas, Kelvin Wong, Simon Suo, Raquel Urtasun

    Abstract: The task of motion forecasting is critical for self-driving vehicles (SDVs) to be able to plan a safe maneuver. Towards this goal, modern approaches reason about the map, the agents' past trajectories and their interactions in order to produce accurate forecasts. The predominant approach has been to encode the map and other agents in the reference frame of each target agent. However, this approach…

    Submitted 8 November, 2022; v1 submitted 4 November, 2022; originally announced November 2022.

  11. arXiv:2104.03956

    cs.CV cs.AI cs.LG cs.RO

    Just Label What You Need: Fine-Grained Active Selection for Perception and Prediction through Partially Labeled Scenes

    Authors: Sean Segal, Nishanth Kumar, Sergio Casas, Wenyuan Zeng, Mengye Ren, Jingkang Wang, Raquel Urtasun

    Abstract: Self-driving vehicles must perceive and predict the future positions of nearby actors in order to avoid collisions and drive safely. A learned deep learning module is often responsible for this task, requiring large-scale, high-quality training datasets. As data collection is often significantly cheaper than labeling in this domain, the decision of which subset of examples to label can have a prof…

    Submitted 8 April, 2021; originally announced April 2021.

  12. arXiv:2101.07907

    cs.RO cs.AI cs.CV cs.LG

    IntentNet: Learning to Predict Intention from Raw Sensor Data

    Authors: Sergio Casas, Wenjie Luo, Raquel Urtasun

    Abstract: In order to plan a safe maneuver, self-driving vehicles need to understand the intent of other traffic participants. We define intent as a combination of discrete high-level behaviors as well as continuous trajectories describing future motion. In this paper, we develop a one-stage detector and forecaster that exploits both 3D point clouds produced by a LiDAR sensor as well as dynamic maps of the…

    Submitted 19 January, 2021; originally announced January 2021.

    Comments: CoRL 2018

  13. arXiv:2101.06806

    cs.RO cs.AI cs.CV cs.LG

    MP3: A Unified Model to Map, Perceive, Predict and Plan

    Authors: Sergio Casas, Abbas Sadat, Raquel Urtasun

    Abstract: High-definition maps (HD maps) are a key component of most modern self-driving systems due to their valuable semantic and geometric information. Unfortunately, building HD maps has proven hard to scale due to their cost as well as the requirements they impose in the localization system that has to work everywhere with centimeter-level accuracy. Being able to drive without an HD map would be very b…

    Submitted 17 January, 2021; originally announced January 2021.

  14. arXiv:2101.06720

    cs.CV cs.LG cs.RO

    Deep Multi-Task Learning for Joint Localization, Perception, and Prediction

    Authors: John Phillips, Julieta Martinez, Ioan Andrei Bârsan, Sergio Casas, Abbas Sadat, Raquel Urtasun

    Abstract: Over the last few years, we have witnessed tremendous progress on many subtasks of autonomous driving, including perception, motion forecasting, and motion planning. However, these systems often assume that the car is accurately localized against a high-definition map. In this paper we question this assumption, and investigate the issues that arise in state-of-the-art autonomy stacks under localiz…

    Submitted 10 April, 2021; v1 submitted 17 January, 2021; originally announced January 2021.

    Comments: CVPR 21

  15. arXiv:2101.06679

    cs.CV cs.RO

    End-to-end Interpretable Neural Motion Planner

    Authors: Wenyuan Zeng, Wenjie Luo, Simon Suo, Abbas Sadat, Bin Yang, Sergio Casas, Raquel Urtasun

    Abstract: In this paper, we propose a neural motion planner (NMP) for learning to drive autonomously in complex urban scenarios that include traffic-light handling, yielding, and interactions with multiple road-users. Towards this goal, we design a holistic model that takes as input raw LIDAR data and a HD map and produces interpretable intermediate representations in the form of 3D detections and their fut…

    Submitted 17 January, 2021; originally announced January 2021.

    Comments: CVPR 2019 (Oral)

  16. arXiv:2101.06557

    cs.RO cs.AI cs.CV cs.LG

    TrafficSim: Learning to Simulate Realistic Multi-Agent Behaviors

    Authors: Simon Suo, Sebastian Regalado, Sergio Casas, Raquel Urtasun

    Abstract: Simulation has the potential to massively scale evaluation of self-driving systems enabling rapid development as well as safe deployment. To close the gap between simulation and the real world, we need to simulate realistic multi-agent behaviors. Existing simulation environments rely on heuristic-based models that directly encode traffic rules, which cannot capture irregular maneuvers (e.g., nudgi…

    Submitted 16 January, 2021; originally announced January 2021.

  17. arXiv:2101.06554

    cs.LG cs.RO

    Diverse Complexity Measures for Dataset Curation in Self-driving

    Authors: Abbas Sadat, Sean Segal, Sergio Casas, James Tu, Bin Yang, Raquel Urtasun, Ersin Yumer

    Abstract: Modern self-driving autonomy systems heavily rely on deep learning. As a consequence, their performance is influenced significantly by the quality and richness of the training data. Data collecting platforms can generate many hours of raw data in a daily basis, however, it is not feasible to label everything. It is thus of key importance to have a mechanism to identify "what to label". Active lear…

    Submitted 16 January, 2021; originally announced January 2021.

    Comments: 13 pages

  18. arXiv:2101.06549

    cs.RO cs.AI cs.CV cs.LG

    AdvSim: Generating Safety-Critical Scenarios for Self-Driving Vehicles

    Authors: Jingkang Wang, Ava Pun, James Tu, Sivabalan Manivasagam, Abbas Sadat, Sergio Casas, Mengye Ren, Raquel Urtasun

    Abstract: As self-driving systems become better, simulating scenarios where the autonomy stack may fail becomes more important. Traditionally, those scenarios are generated for a few scenes with respect to the planning module that takes ground-truth actor states as input. This does not scale and cannot identify all possible autonomy failures, such as perception failures due to occlusion. In this paper, we p…

    Submitted 16 April, 2023; v1 submitted 16 January, 2021; originally announced January 2021.

    Comments: CVPR 2021. Corrected typos in the adversarial objective

  19. arXiv:2101.06547

    cs.RO cs.AI cs.CV cs.LG

    LookOut: Diverse Multi-Future Prediction and Planning for Self-Driving

    Authors: Alexander Cui, Sergio Casas, Abbas Sadat, Renjie Liao, Raquel Urtasun

    Abstract: In this paper, we present LookOut, a novel autonomy system that perceives the environment, predicts a diverse set of futures of how the scene might unroll and estimates the trajectory of the SDV by optimizing a set of contingency plans over these future realizations. In particular, we learn a diverse joint distribution over multi-agent future trajectories in a traffic scene that covers a wide rang…

    Submitted 7 May, 2021; v1 submitted 16 January, 2021; originally announced January 2021.

  20. arXiv:2101.02385

    cs.CV cs.LG cs.RO

    Safety-Oriented Pedestrian Motion and Scene Occupancy Forecasting

    Authors: Katie Luo, Sergio Casas, Renjie Liao, Xinchen Yan, Yuwen Xiong, Wenyuan Zeng, Raquel Urtasun

    Abstract: In this paper, we address the important problem in self-driving of forecasting multi-pedestrian motion and their shared scene occupancy map, critical for safe navigation. Our contributions are two-fold. First, we advocate for predicting both the individual motions as well as the scene occupancy map in order to effectively deal with missing detections caused by postprocessing, e.g., confidence thre…

    Submitted 7 January, 2021; originally announced January 2021.

  21. arXiv:2011.06425

    cs.CV cs.RO

    StrObe: Streaming Object Detection from LiDAR Packets

    Authors: Davi Frossard, Simon Suo, Sergio Casas, James Tu, Rui Hu, Raquel Urtasun

    Abstract: Many modern robotics systems employ LiDAR as their main sensing modality due to its geometrical richness. Rolling shutter LiDARs are particularly common, in which an array of lasers scans the scene from a rotating base. Points are emitted as a stream of packets, each covering a sector of the 360° coverage. Modern perception algorithms wait for the full sweep to be built before processing the data,…

    Submitted 13 November, 2020; v1 submitted 12 November, 2020; originally announced November 2020.

    Comments: To be presented at the 4th Conference on Robot Learning (CoRL 2020)

  22. arXiv:2008.05930

    cs.RO cs.AI cs.CV cs.LG stat.ML

    Perceive, Predict, and Plan: Safe Motion Planning Through Interpretable Semantic Representations

    Authors: Abbas Sadat, Sergio Casas, Mengye Ren, Xinyu Wu, Pranaab Dhawan, Raquel Urtasun

    Abstract: In this paper we propose a novel end-to-end learnable network that performs joint perception, prediction and motion planning for self-driving vehicles and produces interpretable intermediate representations. Unlike existing neural motion planners, our motion planning costs are consistent with our perception and prediction estimates. This is achieved by a novel differentiable semantic occupancy rep…

    Submitted 13 August, 2020; originally announced August 2020.

    Comments: European Conference on Computer Vision (ECCV) 2020

  23. arXiv:2007.14366

    cs.CV

    RadarNet: Exploiting Radar for Robust Perception of Dynamic Objects

    Authors: Bin Yang, Runsheng Guo, Ming Liang, Sergio Casas, Raquel Urtasun

    Abstract: We tackle the problem of exploiting Radar for perception in the context of self-driving as Radar provides complementary information to other sensors such as LiDAR or cameras in the form of Doppler velocity. The main challenges of using Radar are the noise and measurement ambiguities which have been a struggle for existing simple input or output fusion methods. To better address this, we propose a…

    Submitted 28 July, 2020; originally announced July 2020.

    Comments: ECCV 2020

  24. arXiv:2007.12036

    cs.CV cs.LG cs.RO stat.ML

    Implicit Latent Variable Model for Scene-Consistent Motion Forecasting

    Authors: Sergio Casas, Cole Gulino, Simon Suo, Katie Luo, Renjie Liao, Raquel Urtasun

    Abstract: In order to plan a safe maneuver an autonomous vehicle must accurately perceive its environment, and understand the interactions among traffic participants. In this paper, we aim to learn scene-consistent motion forecasts of complex urban traffic directly from sensor data. In particular, we propose to characterize the joint distribution over future trajectories via an implicit latent variable mode…

    Submitted 23 July, 2020; originally announced July 2020.

    Comments: European Conference on Computer Vision (ECCV) 2020

  25. arXiv:2006.02636

    cs.CV cs.AI cs.LG cs.RO stat.ML

    The Importance of Prior Knowledge in Precise Multimodal Prediction

    Authors: Sergio Casas, Cole Gulino, Simon Suo, Raquel Urtasun

    Abstract: Roads have well defined geometries, topologies, and traffic rules. While this has been widely exploited in motion planning methods to produce maneuvers that obey the law, little work has been devoted to utilize these priors in perception and motion forecasting methods. In this paper we propose to incorporate these structured priors as a loss function. In contrast to imposing hard constraints, this…

    Submitted 3 June, 2020; originally announced June 2020.

  26. arXiv:2005.14711

    cs.CV cs.RO

    PnPNet: End-to-End Perception and Prediction with Tracking in the Loop

    Authors: Ming Liang, Bin Yang, Wenyuan Zeng, Yun Chen, Rui Hu, Sergio Casas, Raquel Urtasun

    Abstract: We tackle the problem of joint perception and motion forecasting in the context of self-driving vehicles. Towards this goal we propose PnPNet, an end-to-end model that takes as input sequential sensor data, and outputs at each time step object tracks and their future trajectories. The key component is a novel tracking module that generates object tracks online from detections and exploits trajecto…

    Submitted 27 June, 2020; v1 submitted 29 May, 2020; originally announced May 2020.

    Comments: CVPR2020

  27. arXiv:1910.08233

    cs.CV cs.LG cs.RO eess.SP

    Spatially-Aware Graph Neural Networks for Relational Behavior Forecasting from Sensor Data

    Authors: Sergio Casas, Cole Gulino, Renjie Liao, Raquel Urtasun

    Abstract: In this paper, we tackle the problem of relational behavior forecasting from sensor data. Towards this goal, we propose a novel spatially-aware graph neural network (SpAGNN) that models the interactions between agents in the scene. Specifically, we exploit a convolutional neural network to detect the actors and compute their initial states. A graph neural network then iteratively updates the actor…

    Submitted 17 October, 2019; originally announced October 2019.

  28. arXiv:1910.08041

    cs.CV cs.LG cs.RO

    Discrete Residual Flow for Probabilistic Pedestrian Behavior Prediction

    Authors: Ajay Jain, Sergio Casas, Renjie Liao, Yuwen Xiong, Song Feng, Sean Segal, Raquel Urtasun

    Abstract: Self-driving vehicles plan around both static and dynamic objects, applying predictive models of behavior to estimate future locations of the objects in the environment. However, future behavior is inherently uncertain, and models of motion that produce deterministic outputs are limited to short timescales. Particularly difficult is the prediction of human behavior. In this work, we propose the di…

    Submitted 17 October, 2019; originally announced October 2019.

    Comments: CoRL 2019