Waypoint Planning Networks

Authors: Alexandru-Iosif Toma, Hussein Ali Jaafar, Hao-Ya Hsueh, Stephen James, Daniel Lenton, Ronald Clark, Sajad Saeedi

Abstract: With the recent advances in machine learning, path planning algorithms are also evolving; however, the learned path planning algorithms often have difficulty competing with success rates of classic algorithms. We propose waypoint planning networks (WPN), a hybrid algorithm based on LSTMs with a local kernel - a classic algorithm such as A*, and a global kernel using a learned algorithm. WPN produces a more computationally efficient and robust solution. We compare WPN against A*, as well as related works including motion planning networks (MPNet) and value iteration networks (VIN). In this paper, the design and experiments have been conducted for 2D environments. Experimental results outline the benefits of WPN, both in efficiency and generalization. It is shown that WPN's search space is considerably less than A*, while being able to generate near optimal results. Additionally, WPN works on partial maps, unlike A* which needs the full map in advance. The code is available online.

Submitted 1 May, 2021; originally announced May 2021.

Comments: The Conference on Robots and Vision (CRV2021). Supplementary website: https://sites.google.com/view/waypoint-planning-networks

arXiv:2102.07764

End-to-End Egospheric Spatial Memory

Authors: Daniel Lenton, Stephen James, Ronald Clark, Andrew J. Davison

Abstract: Spatial memory, or the ability to remember and recall specific locations and objects, is central to autonomous agents' ability to carry out tasks in real environments. However, most existing artificial memory modules are not very adept at storing spatial information. We propose a parameter-free module, Egospheric Spatial Memory (ESM), which encodes the memory in an ego-sphere around the agent, enabling expressive 3D representations. ESM can be trained end-to-end via either imitation or reinforcement learning, and improves both training efficiency and final performance against other memory baselines on both drone and manipulator visuomotor control tasks. The explicit egocentric geometry also enables us to seamlessly combine the learned controller with other non-learned modalities, such as local obstacle avoidance. We further show applications to semantic segmentation on the ScanNet dataset, where ESM naturally combines image-level and map-level inference modalities. Through our broad set of experiments, we show that ESM provides a general computation graph for embodied spatial reasoning, and the module forms a bridge between real-time mapping systems and differentiable memory architectures. Implementation at: https://github.com/ivy-dl/memory.

Submitted 17 February, 2021; v1 submitted 15 February, 2021; originally announced February 2021.

Comments: Conference paper at ICLR 2021. Implementation: https://github.com/ivy-dl/memory. Project page: https://djl11.github.io/ESM/

arXiv:2102.02886

Ivy: Templated Deep Learning for Inter-Framework Portability

Authors: Daniel Lenton, Fabio Pardo, Fabian Falck, Stephen James, Ronald Clark

Abstract: We introduce Ivy, a templated Deep Learning (DL) framework which abstracts existing DL frameworks. Ivy unifies the core functions of these frameworks to exhibit consistent call signatures, syntax and input-output behaviour. New high-level framework-agnostic functions and classes, which are usable alongside framework-specific code, can then be implemented as compositions of the unified low-level Ivy functions. Ivy currently supports TensorFlow, PyTorch, MXNet, Jax and NumPy. We also release four pure-Ivy libraries for mechanics, 3D vision, robotics, and differentiable environments. Through our evaluations, we show that Ivy can significantly reduce lines of code with a runtime overhead of less than 1% in most cases. We welcome developers to join the Ivy community by writing their own functions, layers and libraries in Ivy, maximizing their audience and helping to accelerate DL research through inter-framework codebases. More information can be found at https://ivy-dl.org.

Submitted 5 April, 2021; v1 submitted 4 February, 2021; originally announced February 2021.

Comments: Code at https://github.com/ivy-dl/ivy

arXiv:2011.14787

Unsupervised Path Regression Networks

Authors: Michal Pándy, Daniel Lenton, Ronald Clark

Abstract: We demonstrate that challenging shortest path problems can be solved via direct spline regression from a neural network, trained in an unsupervised manner (i.e. without requiring ground truth optimal paths for training). To achieve this, we derive a geometry-dependent optimal cost function whose minima guarantee collision-free solutions. Our method beats state-of-the-art supervised learning baselines for shortest path planning, with a much more scalable training pipeline, and a significant speedup in inference time.

Submitted 9 March, 2021; v1 submitted 30 November, 2020; originally announced November 2020.

arXiv:2004.04336

MoreFusion: Multi-object Reasoning for 6D Pose Estimation from Volumetric Fusion

Authors: Kentaro Wada, Edgar Sucar, Stephen James, Daniel Lenton, Andrew J. Davison

Abstract: Robots and other smart devices need efficient object-based scene representations from their on-board vision systems to reason about contact, physics and occlusion. Recognized precise object models will play an important role alongside non-parametric reconstructions of unrecognized structures. We present a system which can estimate the accurate poses of multiple known objects in contact and occlusion from real-time, embodied multi-view vision. Our approach makes 3D object pose proposals from single RGB-D views, accumulates pose estimates and non-parametric occupancy information from multiple views as the camera moves, and performs joint optimization to estimate consistent, non-intersecting poses for multiple objects in contact. We verify the accuracy and robustness of our approach experimentally on 2 object datasets: YCB-Video, and our own challenging Cluttered YCB-Video. We demonstrate a real-time robotics application where a robot arm precisely and orderly disassembles complicated piles of objects, using only on-board RGB-D vision.

Submitted 8 April, 2020; originally announced April 2020.

Comments: 10 pages, 10 figures, IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2020

Showing 1–5 of 5 results for author: Lenton, D