Showing 1–50 of 69 results for author: Fermuller, C

Searching in archive cs.
  1. arXiv:2408.16632  [pdf, other]

    cs.NE cs.AI

    Maelstrom Networks

    Authors: Matthew Evanusa, Cornelia Fermüller, Yiannis Aloimonos

    Abstract: Artificial Neural Networks has struggled to devise a way to incorporate working memory into neural networks. While the "long term" memory can be seen as the learned weights, the working memory consists likely more of dynamical activity, that is missing from feed-forward models. Current state of the art models such as transformers tend to "solve" this by ignoring working memory entirely and sim…

    Submitted 29 August, 2024; originally announced August 2024.

  2. arXiv:2408.13627  [pdf, other]

    cs.CV

    Recent Event Camera Innovations: A Survey

    Authors: Bharatesh Chakravarthi, Aayush Atul Verma, Kostas Daniilidis, Cornelia Fermuller, Yezhou Yang

    Abstract: Event-based vision, inspired by the human visual system, offers transformative capabilities such as low latency, high dynamic range, and reduced power consumption. This paper presents a comprehensive survey of event cameras, tracing their evolution over time. It introduces the fundamental principles of event cameras, compares them with traditional frame cameras, and highlights their unique charact…

    Submitted 27 August, 2024; v1 submitted 24 August, 2024; originally announced August 2024.

  3. arXiv:2407.01811  [pdf, other]

    cs.RO cs.CV

    Active Human Pose Estimation via an Autonomous UAV Agent

    Authors: Jingxi Chen, Botao He, Chahat Deep Singh, Cornelia Fermuller, Yiannis Aloimonos

    Abstract: One of the core activities of an active observer involves moving to secure a "better" view of the scene, where the definition of "better" is task-dependent. This paper focuses on the task of human pose estimation from videos capturing a person's activity. Self-occlusions within the scene can complicate or even prevent accurate human pose estimation. To address this, relocating the camera to a new…

    Submitted 1 July, 2024; originally announced July 2024.

  4. arXiv:2406.05075  [pdf, other]

    cs.CV

    Diving Deep into the Motion Representation of Video-Text Models

    Authors: Chinmaya Devaraj, Cornelia Fermuller, Yiannis Aloimonos

    Abstract: Videos are more informative than images because they capture the dynamics of the scene. By representing motion in videos, we can capture dynamic activities. In this work, we introduce GPT-4 generated motion descriptions that capture fine-grained motion descriptions of activities and apply them to three action datasets. We evaluated several video-text models on the task of retrieval of motion descr…

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: ACL Findings, 2024

  5. arXiv:2406.02972  [pdf, other]

    cs.CV

    Event3DGS: Event-Based 3D Gaussian Splatting for High-Speed Robot Egomotion

    Authors: Tianyi Xiong, Jiayi Wu, Botao He, Cornelia Fermuller, Yiannis Aloimonos, Heng Huang, Christopher A. Metzler

    Abstract: By combining differentiable rendering with explicit point-based scene representations, 3D Gaussian Splatting (3DGS) has demonstrated breakthrough 3D reconstruction capabilities. However, to date 3DGS has had limited impact on robotics, where high-speed egomotion is pervasive: Egomotion introduces motion blur and leads to artifacts in existing frame-based 3DGS reconstruction methods. To address thi…

    Submitted 18 June, 2024; v1 submitted 5 June, 2024; originally announced June 2024.

  6. arXiv:2405.17769  [pdf, other]

    cs.RO cs.CV

    Microsaccade-inspired Event Camera for Robotics

    Authors: Botao He, Ze Wang, Yuan Zhou, Jingxi Chen, Chahat Deep Singh, Haojia Li, Yuman Gao, Shaojie Shen, Kaiwei Wang, Yanjun Cao, Chao Xu, Yiannis Aloimonos, Fei Gao, Cornelia Fermuller

    Abstract: Neuromorphic vision sensors or event cameras have made the visual perception of extremely low reaction time possible, opening new avenues for high-dynamic robotics applications. These event cameras' output is dependent on both motion and texture. However, the event camera fails to capture object edges that are parallel to the camera motion. This is a problem intrinsic to the sensor and therefore c…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: Published on Science Robotics June 2024 issue

  7. arXiv:2405.01322  [pdf, other]

    cs.LO

    Reasoning About Group Polarization: From Semantic Games to Sequent Systems

    Authors: Robert Freiman, Carlos Olarte, Elaine Pimentel, Christian G. Fermüller

    Abstract: Group polarization, the phenomenon where individuals become more extreme after interacting, has been gaining attention, especially with the rise of social media shaping people's opinions. Recent interest has emerged in formal reasoning about group polarization using logical systems. In this work we consider the modal logic PNL that captures the notion of agents agreeing or disagreeing on a given t…

    Submitted 2 May, 2024; originally announced May 2024.

  8. arXiv:2404.07447  [pdf, other]

    cs.RO

    Interactive-FAR:Interactive, Fast and Adaptable Routing for Navigation Among Movable Obstacles in Complex Unknown Environments

    Authors: Botao He, Guofei Chen, Wenshan Wang, Ji Zhang, Cornelia Fermuller, Yiannis Aloimonos

    Abstract: This paper introduces a real-time algorithm for navigating complex unknown environments cluttered with movable obstacles. Our algorithm achieves fast, adaptable routing by actively attempting to manipulate obstacles during path planning and adjusting the global plan from sensor feedback. The main contributions include an improved dynamic Directed Visibility Graph (DV-graph) for rapid global path s…

    Submitted 10 April, 2024; originally announced April 2024.

    Comments: Project website: https://www.far-planner.com/interactive-far-planner. 8 pages, 8 figures

  9. arXiv:2404.01568  [pdf, other]

    cs.CV cs.CG

    A Linear Time and Space Local Point Cloud Geometry Encoder via Vectorized Kernel Mixture (VecKM)

    Authors: Dehao Yuan, Cornelia Fermüller, Tahseen Rabbani, Furong Huang, Yiannis Aloimonos

    Abstract: We propose VecKM, a local point cloud geometry encoder that is descriptive and efficient to compute. VecKM leverages a unique approach by vectorizing a kernel mixture to represent the local point cloud. Such representation's descriptiveness is supported by two theorems that validate its ability to reconstruct and preserve the similarity of the local shape. Unlike existing encoders downsampling the…

    Submitted 30 June, 2024; v1 submitted 1 April, 2024; originally announced April 2024.

    Comments: ICML2024 Conference Paper

  10. arXiv:2404.00054  [pdf, other]

    cs.HC cs.GR cs.LG

    Choreographing the Digital Canvas: A Machine Learning Approach to Artistic Performance

    Authors: Siyuan Peng, Kate Ladenheim, Snehesh Shrestha, Cornelia Fermüller

    Abstract: This paper introduces the concept of a design tool for artistic performances based on attribute descriptions. To do so, we used a specific performance of falling actions. The platform integrates a novel machine-learning (ML) model with an interactive interface to generate and visualize artistic movements. Our approach's core is a cyclic Attribute-Conditioned Variational Autoencoder (AC-VAE) model…

    Submitted 25 March, 2024; originally announced April 2024.

  11. arXiv:2403.13800  [pdf, other]

    cs.CV

    TimeRewind: Rewinding Time with Image-and-Events Video Diffusion

    Authors: Jingxi Chen, Brandon Y. Feng, Haoming Cai, Mingyang Xie, Christopher Metzler, Cornelia Fermuller, Yiannis Aloimonos

    Abstract: This paper addresses the novel challenge of "rewinding" time from a single captured image to recover the fleeting moments missed just before the shutter button is pressed. This problem poses a significant challenge in computer vision and computational photography, as it requires predicting plausible pre-capture motion from a single static frame, an inherently ill-posed task due to the high degre…

    Submitted 20 March, 2024; originally announced March 2024.

  12. arXiv:2403.09850  [pdf, other]

    cs.CV

    MARVIS: Motion & Geometry Aware Real and Virtual Image Segmentation

    Authors: Jiayi Wu, Xiaomin Lin, Shahriar Negahdaripour, Cornelia Fermüller, Yiannis Aloimonos

    Abstract: Tasks such as autonomous navigation, 3D reconstruction, and object recognition near the water surfaces are crucial in marine robotics applications. However, challenges arise due to dynamic disturbances, e.g., light reflections and refraction from the random air-water interface, irregular liquid flow, and similar factors, which can lead to potential failures in perception and navigation systems. Tr…

    Submitted 14 March, 2024; originally announced March 2024.

  13. arXiv:2403.02274  [pdf, other]

    cs.RO cs.LG

    NatSGD: A Dataset with Speech, Gestures, and Demonstrations for Robot Learning in Natural Human-Robot Interaction

    Authors: Snehesh Shrestha, Yantian Zha, Saketh Banagiri, Ge Gao, Yiannis Aloimonos, Cornelia Fermuller

    Abstract: Recent advancements in multimodal Human-Robot Interaction (HRI) datasets have highlighted the fusion of speech and gesture, expanding robots' capabilities to absorb explicit and implicit HRI insights. However, existing speech-gesture HRI datasets often focus on elementary tasks, like object pointing and pushing, revealing limitations in scaling to intricate domains and prioritizing human command d…

    Submitted 4 March, 2024; originally announced March 2024.

  14. arXiv:2312.00055  [pdf, other]

    cs.CV cs.LG cs.RO

    LEAP: LLM-Generation of Egocentric Action Programs

    Authors: Eadom Dessalene, Michael Maynord, Cornelia Fermüller, Yiannis Aloimonos

    Abstract: We introduce LEAP (illustrated in Figure 1), a novel method for generating video-grounded action programs through use of a Large Language Model (LLM). These action programs represent the motoric, perceptual, and structural aspects of action, and consist of sub-actions, pre- and post-conditions, and control flows. LEAP's action programs are centered on egocentric video and employ recent development…

    Submitted 28 November, 2023; originally announced December 2023.

    Comments: Dataset: https://drive.google.com/drive/folders/1Cpkw_TI1IIxXdzor0pOXG3rWJWuKU5Ex?usp=drive_link

  15. arXiv:2311.00187  [pdf, other]

    cs.CV

    Decodable and Sample Invariant Continuous Object Encoder

    Authors: Dehao Yuan, Furong Huang, Cornelia Fermüller, Yiannis Aloimonos

    Abstract: We propose Hyper-Dimensional Function Encoding (HDFE). Given samples of a continuous object (e.g. a function), HDFE produces an explicit vector representation of the given object, invariant to the sample distribution and density. Sample distribution and density invariance enables HDFE to consistently encode continuous objects regardless of their sampling, and therefore allows neural networks to re…

    Submitted 6 May, 2024; v1 submitted 31 October, 2023; originally announced November 2023.

    Comments: ICLR2024 Conference Paper

  16. arXiv:2310.08745  [pdf, other]

    cs.RO cs.CV

    AcTExplore: Active Tactile Exploration of Unknown Objects

    Authors: Amir-Hossein Shahidzadeh, Seong Jong Yoo, Pavan Mantripragada, Chahat Deep Singh, Cornelia Fermüller, Yiannis Aloimonos

    Abstract: Tactile exploration plays a crucial role in understanding object structures for fundamental robotics tasks such as grasping and manipulation. However, efficiently exploring such objects using tactile sensors is challenging, primarily due to the large-scale unknown environments and limited sensing coverage of these sensors. To this end, we present AcTExplore, an active tactile exploration method dr…

    Submitted 20 June, 2024; v1 submitted 12 October, 2023; originally announced October 2023.

    Comments: 8 pages, 6 figures, Accepted to ICRA 2024

  17. arXiv:2307.14332  [pdf, other]

    cs.CV cs.AI

    Event-based Vision for Early Prediction of Manipulation Actions

    Authors: Daniel Deniz, Cornelia Fermuller, Eduardo Ros, Manuel Rodriguez-Alvarez, Francisco Barranco

    Abstract: Neuromorphic visual sensors are artificial retinas that output sequences of asynchronous events when brightness changes occur in the scene. These sensors offer many advantages including very high temporal resolution, no motion blur and smart data compression ideal for real-time processing. In this study, we introduce an event-based dataset on fine-grained manipulation actions and perform an experi…

    Submitted 26 July, 2023; originally announced July 2023.

    Comments: 15 pages, 9 figures

  18. arXiv:2304.09413  [pdf, other]

    cs.RO

    Considerations for Minimizing Data Collection Biases for Eliciting Natural Behavior in Human-Robot Interaction

    Authors: Snehesh Shrestha, Ge Gao, Cornelia Fermuller, Yiannis Aloimonos

    Abstract: Many of us researchers take extra measures to control for known-unknowns. However, unknown-unknowns can, at best, be negligible, but otherwise, they could produce unreliable data that might have dire consequences in real-life downstream applications. Human-Robot Interaction standards informed by empirical data could save us time and effort and provide us with the path toward the robots of the futu…

    Submitted 19 April, 2023; originally announced April 2023.

  19. arXiv:2304.09412  [pdf, other]

    cs.HC cs.MM eess.SY

    hDesigner: Real-Time Haptic Feedback Pattern Designer

    Authors: Snehesh Shrestha, Ishan Tamrakar, Cornelia Fermuller, Yiannis Aloimonos

    Abstract: Haptic sensing can provide a new dimension to enhance people's musical and cinematic experiences. However, designing a haptic pattern is neither intuitive nor trivial. Imagined haptic patterns tend to be different from experienced ones. As a result, researchers use simple step-curve patterns to create haptic stimuli. To this end, we designed and developed an intuitive haptic pattern designer that…

    Submitted 19 April, 2023; originally announced April 2023.

  20. arXiv:2304.03631  [pdf, other]

    cs.CV

    Therbligs in Action: Video Understanding through Motion Primitives

    Authors: Eadom Dessalene, Michael Maynord, Cornelia Fermuller, Yiannis Aloimonos

    Abstract: In this paper we introduce a rule-based, compositional, and hierarchical modeling of action using Therbligs as our atoms. Introducing these atoms provides us with a consistent, expressive, contact-centered representation of action. Over the atoms we introduce a differentiable method of rule-based reasoning to regularize for logical consistency. Our approach is complementary to other approaches in…

    Submitted 6 April, 2023; originally announced April 2023.

    Comments: 8 pages

  21. arXiv:2301.00482  [pdf, other]

    cs.HC

    FEVA: Fast Event Video Annotation Tool

    Authors: Snehesh Shrestha, William Sentosatio, Huiashu Peng, Cornelia Fermuller, Yiannis Aloimonos

    Abstract: Video Annotation is a crucial process in computer science and social science alike. Many video annotation tools (VATs) offer a wide range of features for making annotation possible. We conducted an extensive survey of over 59 VATs and interviewed interdisciplinary researchers to evaluate the usability of VATs. Our findings suggest that most current VATs have overwhelming user interfaces, poor inte…

    Submitted 7 January, 2023; v1 submitted 1 January, 2023; originally announced January 2023.

  22. arXiv:2210.00715  [pdf, other]

    cs.CV cs.RO

    WorldGen: A Large Scale Generative Simulator

    Authors: Chahat Deep Singh, Riya Kumari, Cornelia Fermüller, Nitin J. Sanket, Yiannis Aloimonos

    Abstract: In the era of deep learning, data is the critical determining factor in the performance of neural network models. Generating large datasets suffers from various difficulties such as scalability, cost efficiency and photorealism. To avoid expensive and strenuous dataset collection and annotations, researchers have inclined towards computer-generated datasets. Although, a lack of photorealism and a…

    Submitted 3 October, 2022; originally announced October 2022.

    Journal ref: Under review in ICRA 2023

  23. arXiv:2205.15534  [pdf, other]

    cs.SC cs.AI cs.CV cs.LG

    Gluing Neural Networks Symbolically Through Hyperdimensional Computing

    Authors: Peter Sutor, Dehao Yuan, Douglas Summers-Stay, Cornelia Fermuller, Yiannis Aloimonos

    Abstract: Hyperdimensional Computing affords simple, yet powerful operations to create long Hyperdimensional Vectors (hypervectors) that can efficiently encode information, be used for learning, and are dynamic enough to be modified on the fly. In this paper, we explore the notion of using binary hypervectors to directly encode the final, classifying output signals of neural networks in order to fuse differ…

    Submitted 31 May, 2022; originally announced May 2022.

    Comments: 10 pages, 3 figures, 6 tables, accepted to IJCNN 2022 / IEEE WCCI 2022

  24. arXiv:2205.03467  [pdf, other]

    cs.CV cs.RO

    EVIMO2: An Event Camera Dataset for Motion Segmentation, Optical Flow, Structure from Motion, and Visual Inertial Odometry in Indoor Scenes with Monocular or Stereo Algorithms

    Authors: Levi Burner, Anton Mitrokhin, Cornelia Fermüller, Yiannis Aloimonos

    Abstract: A new event camera dataset, EVIMO2, is introduced that improves on the popular EVIMO dataset by providing more data, from better cameras, in more complex scenarios. As with its predecessor, EVIMO2 provides labels in the form of per-pixel ground truth depth and segmentation as well as camera and object poses. All sequences use data from physical cameras and many sequences feature multiple independe…

    Submitted 6 May, 2022; originally announced May 2022.

    Comments: 5 pages, 3 figures, 1 table

  25. arXiv:2203.12829  [pdf, other]

    cs.CV cs.MM

    AIMusicGuru: Music Assisted Human Pose Correction

    Authors: Snehesh Shrestha, Cornelia Fermüller, Tianyu Huang, Pyone Thant Win, Adam Zukerman, Chethan M. Parameshwara, Yiannis Aloimonos

    Abstract: Pose Estimation techniques rely on visual cues available through observations represented in the form of pixels. But the performance is bounded by the frame rate of the video and struggles from motion blur, occlusions, and temporal coherence. This issue is magnified when people are interacting with objects and instruments, for example playing the violin. Standard approaches for postprocessing use…

    Submitted 23 March, 2022; originally announced March 2022.

    Comments: 10 pages, 7 figures, under review

  26. arXiv:2203.11174  [pdf, other]

    cs.CV cs.RO

    DiffPoseNet: Direct Differentiable Camera Pose Estimation

    Authors: Chethan M. Parameshwara, Gokul Hari, Cornelia Fermüller, Nitin J. Sanket, Yiannis Aloimonos

    Abstract: Current deep neural network approaches for camera pose estimation rely on scene structure for 3D motion estimation, but this decreases the robustness and thereby makes cross-dataset generalization difficult. In contrast, classical approaches to structure from motion estimate 3D motion utilizing optical flow and then compute depth. Their accuracy, however, depends strongly on the quality of the opt…

    Submitted 21 March, 2022; originally announced March 2022.

    Comments: 10 pages, 5 figures, Accepted to CVPR 2022

  27. arXiv:2203.07530  [pdf, other]

    cs.RO cs.CV

    TTCDist: Fast Distance Estimation From an Active Monocular Camera Using Time-to-Contact

    Authors: Levi Burner, Nitin J. Sanket, Cornelia Fermüller, Yiannis Aloimonos

    Abstract: Distance estimation from vision is fundamental for a myriad of robotic applications such as navigation, manipulation, and planning. Inspired by the mammal's visual system, which gazes at specific objects, we develop two novel constraints relating time-to-contact, acceleration, and distance that we call the $τ$-constraint and $Φ$-constraint. They allow an active (moving) camera to estimate depth ef…

    Submitted 7 March, 2023; v1 submitted 14 March, 2022; originally announced March 2022.

    Comments: 19 pages, 24 figures, 1 table. To be published in ICRA 2023

  28. arXiv:2203.07290  [pdf, other]

    cs.RO

    GradTac: Spatio-Temporal Gradient Based Tactile Sensing

    Authors: Kanishka Ganguly, Pavan Mantripragada, Chethan M. Parameshwara, Cornelia Fermüller, Nitin J. Sanket, Yiannis Aloimonos

    Abstract: Tactile sensing for robotics is achieved through a variety of mechanisms, including magnetic, optical-tactile, and conductive fluid. Currently, the fluid-based sensors have struck the right balance of anthropomorphic sizes and shapes and accuracy of tactile response measurement. However, this design is plagued by a low Signal to Noise Ratio (SNR) due to the fluid based sensing mechanism "damping"…

    Submitted 14 March, 2022; originally announced March 2022.

    Comments: 12 pages, 12 figures, 1 table. Submitted to Frontiers in Robotics and AI under Multisensory Perception and Learning towards Dexterous Robot Manipulation and Interaction

  29. arXiv:2109.13859  [pdf, other]

    cs.CV cs.RO

    NudgeSeg: Zero-Shot Object Segmentation by Repeated Physical Interaction

    Authors: Chahat Deep Singh, Nitin J. Sanket, Chethan M. Parameshwara, Cornelia Fermüller, Yiannis Aloimonos

    Abstract: Recent advances in object segmentation have demonstrated that deep neural networks excel at object segmentation for specific classes in color and depth images. However, their performance is dictated by the number of classes and objects used for training, thereby hindering generalization to never seen objects or zero-shot samples. To exacerbate the problem further, object segmentation using image f…

    Submitted 22 September, 2021; originally announced September 2021.

    Comments: 8 Pages, 7 Figures, 3 Tables

    Journal ref: IEEE International Conference on Robots and Systems (IROS) 2021

  30. arXiv:2106.15045  [pdf, other]

    cs.CV cs.AI cs.RO

    EVPropNet: Detecting Drones By Finding Propellers For Mid-Air Landing And Following

    Authors: Nitin J. Sanket, Chahat Deep Singh, Chethan M. Parameshwara, Cornelia Fermüller, Guido C. H. E. de Croon, Yiannis Aloimonos

    Abstract: The rapid rise of accessibility of unmanned aerial vehicles or drones pose a threat to general security and confidentiality. Most of the commercially available or custom-built drones are multi-rotors and are comprised of multiple propellers. Since these propellers rotate at a high-speed, they are generally the fastest moving parts of an image and cannot be directly "seen" by a classical camera wit…

    Submitted 28 June, 2021; originally announced June 2021.

    Comments: 11 pages, 10 figures, 6 tables. Accepted in Robotics: Science and Systems (RSS) 2021

  31. arXiv:2105.06562  [pdf, other]

    cs.CV cs.NE cs.RO

    SpikeMS: Deep Spiking Neural Network for Motion Segmentation

    Authors: Chethan M. Parameshwara, Simin Li, Cornelia Fermüller, Nitin J. Sanket, Matthew S. Evanusa, Yiannis Aloimonos

    Abstract: Spiking Neural Networks (SNN) are the so-called third generation of neural networks which attempt to more closely match the functioning of the biological brain. They inherently encode temporal data, allowing for training with less energy usage and can be extremely energy efficient when coded on neuromorphic hardware. In addition, they are well suited for tasks involving event-based sensors, which…

    Submitted 13 May, 2021; originally announced May 2021.

    Comments: 7 pages, 6 figures, 3 tables, Under review IROS 2021

  32. Forecasting Action through Contact Representations from First Person Video

    Authors: Eadom Dessalene, Chinmaya Devaraj, Michael Maynord, Cornelia Fermuller, Yiannis Aloimonos

    Abstract: Human actions involving hand manipulations are structured according to the making and breaking of hand-object contact, and human visual understanding of action is reliant on anticipation of contact as is demonstrated by pioneering work in cognitive science. Taking inspiration from this, we introduce representations and models centered on contact, which we then use in action prediction and anticipa…

    Submitted 1 February, 2021; originally announced February 2021.

    Comments: 12 pages, 5 figures. In IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021

  33. arXiv:2011.03077  [pdf, other]

    cs.RO cs.CV

    MorphEyes: Variable Baseline Stereo For Quadrotor Navigation

    Authors: Nitin J. Sanket, Chahat Deep Singh, Varun Asthana, Cornelia Fermüller, Yiannis Aloimonos

    Abstract: Morphable design and depth-based visual control are two upcoming trends leading to advancements in the field of quadrotor autonomy. Stereo-cameras have struck the perfect balance of weight and accuracy of depth estimation but suffer from the problem of depth range being limited and dictated by the baseline chosen at design time. In this paper, we present a framework for quadrotor navigation based…

    Submitted 5 November, 2020; originally announced November 2020.

    Comments: 7 pages, 10 figures, 1 table. Under review in ICRA 2021

  34. arXiv:2011.00712  [pdf, other]

    cs.RO

    Grasping in the Dark: Zero-Shot Object Grasping Using Tactile Feedback

    Authors: Kanishka Ganguly, Behzad Sadrfaridpour, Pavan Mantripragada, Nitin J. Sanket, Cornelia Fermüller, Yiannis Aloimonos

    Abstract: Grasping and manipulating a wide variety of objects is a fundamental skill that would determine the success and wide spread adaptation of robots in homes. Several end-effector designs for robust manipulation have been proposed but they mostly work when provided with prior information about the objects or equipped with external sensors for estimating object shape or size. Such approaches are limite…

    Submitted 16 September, 2021; v1 submitted 1 November, 2020; originally announced November 2020.

    Comments: 6 pages, 1 page references, 8 figures, 2 tables. Under review

  35. arXiv:2010.14611  [pdf, other]

    cs.NE cs.LG

    Hybrid Backpropagation Parallel Reservoir Networks

    Authors: Matthew Evanusa, Snehesh Shrestha, Michelle Girvan, Cornelia Fermüller, Yiannis Aloimonos

    Abstract: In many real-world applications, fully-differentiable RNNs such as LSTMs and GRUs have been widely deployed to solve time series learning tasks. These networks train via Backpropagation Through Time, which can work well in practice but involves a biologically unrealistic unrolling of the network in time for gradient updates, are computationally expensive, and can be hard to tune. A second paradigm…

    Submitted 27 October, 2020; originally announced October 2020.

  36. arXiv:2010.06209  [pdf, other]

    cs.NE cs.LG

    Deep Reservoir Networks with Learned Hidden Reservoir Weights using Direct Feedback Alignment

    Authors: Matthew Evanusa, Cornelia Fermüller, Yiannis Aloimonos

    Abstract: Deep Reservoir Computing has emerged as a new paradigm for deep learning, which is based around the reservoir computing principle of maintaining random pools of neurons combined with hierarchical deep learning. The reservoir paradigm reflects and respects the high degree of recurrence in biological brains, and the role that neuronal dynamics play in learning. However, one issue hampering deep rese…

    Submitted 14 October, 2020; v1 submitted 13 October, 2020; originally announced October 2020.

  37. arXiv:2009.00581  [pdf, other]

    cs.NE cs.AI cs.CV

    A Deep 2-Dimensional Dynamical Spiking Neuronal Network for Temporal Encoding trained with STDP

    Authors: Matthew Evanusa, Cornelia Fermuller, Yiannis Aloimonos

    Abstract: The brain is known to be a highly complex, asynchronous dynamical system that is highly tailored to encode temporal information. However, recent deep learning approaches do not take advantage of this temporal coding. Spiking Neural Networks (SNNs) can be trained using biologically-realistic learning mechanisms, and can have neuronal activation rules that are biologically relevant. This type of net…

    Submitted 1 September, 2020; originally announced September 2020.

  38. arXiv:2006.06753  [pdf, other]

    cs.CV cs.RO

    PRGFlow: Benchmarking SWAP-Aware Unified Deep Visual Inertial Odometry

    Authors: Nitin J. Sanket, Chahat Deep Singh, Cornelia Fermüller, Yiannis Aloimonos

    Abstract: Odometry on aerial robots has to be of low latency and high robustness whilst also respecting the Size, Weight, Area and Power (SWAP) constraints as demanded by the size of the robot. A combination of visual sensors coupled with Inertial Measurement Units (IMUs) has proven to be the best combination to obtain robust and low latency odometry on resource-constrained aerial robots. Recently, deep lea…

    Submitted 11 June, 2020; originally announced June 2020.

    Comments: 16 pages, 13 figures, 10 tables. Under review T-RO

  39. arXiv:2006.06158  [pdf, other]

    cs.CV

    0-MMS: Zero-Shot Multi-Motion Segmentation With A Monocular Event Camera

    Authors: Chethan M. Parameshwara, Nitin J. Sanket, Chahat Deep Singh, Cornelia Fermüller, Yiannis Aloimonos

    Abstract: Segmentation of moving objects in dynamic scenes is a key process in scene understanding for navigation tasks. Classical cameras suffer from motion blur in such scenarios rendering them effete. On the contrary, event cameras, because of their high temporal resolution and lack of motion blur, are tailor-made for this problem. We present an approach for monocular multi-motion segmentation, which com…

    Submitted 6 November, 2020; v1 submitted 10 June, 2020; originally announced June 2020.

    Comments: 7 pages, 6 figures, 4 tables, Under review ICRA 2021

  40. arXiv:2006.03201  [pdf, other]

    cs.CV cs.AI cs.RO

    Egocentric Object Manipulation Graphs

    Authors: Eadom Dessalene, Michael Maynord, Chinmaya Devaraj, Cornelia Fermuller, Yiannis Aloimonos

    Abstract: We introduce Egocentric Object Manipulation Graphs (Ego-OMG) - a novel representation for activity modeling and anticipation of near future actions integrating three components: 1) semantic temporal structure of activities, 2) short-term dynamics, and 3) representations for appearance. Semantic temporal structure is modeled through a graph, embedded through a Graph Convolutional Network, whose sta…

    Submitted 4 June, 2020; originally announced June 2020.

  41. arXiv:2001.09373  [pdf, other]

    cs.LG cs.AI stat.ML

    Following Instructions by Imagining and Reaching Visual Goals

    Authors: John Kanu, Eadom Dessalene, Xiaomin Lin, Cornelia Fermuller, Yiannis Aloimonos

    Abstract: While traditional methods for instruction-following typically assume prior linguistic and perceptual knowledge, many recent works in reinforcement learning (RL) have proposed learning policies end-to-end, typically by training neural networks to map joint representations of observations and instructions directly to actions. In this work, we present a novel framework for learning to perform tempora…

    Submitted 25 January, 2020; originally announced January 2020.

  42. arXiv:1907.00635  [pdf, other]

    cs.IR

    Dermtrainer: A Decision Support System for Dermatological Diseases

    Authors: Gernot Salzer, Agata Ciabattoni, Christian Fermüller, Martin Haiduk, Harald Kittler, Arno Lukas, Rosa María Rodríguez Domínguez, Antonia Wesinger, Elisabeth Riedl

    Abstract: Dermtrainer is a medical decision support system that assists general practitioners in diagnosing skin diseases and serves as a training platform for dermatologists. Its key components are a comprehensive dermatological knowledge base, a clinical algorithm for diagnosing skin diseases, a reasoning component for deducing the most likely differential diagnoses for a patient, and a library of high-qu…

    Submitted 1 July, 2019; originally announced July 2019.

  43. arXiv:1906.11742  [pdf, ps, other]

    cs.LO

    A Game Model for Proofs with Costs

    Authors: Timo Lang, Carlos Olarte, Elaine Pimentel, Christian Fermuller

    Abstract: We look at substructural calculi from a game semantic point of view, guided by certain intuitions about resource conscious and, more specifically, cost conscious reasoning. To this aim, we start with a game, where player I defends a claim corresponding to a (single-conclusion) sequent, while player II tries to refute that claim. Branching rules for additive connectives are modeled by choices of II…

    Submitted 27 June, 2019; originally announced June 2019.

    Comments: To appear in TABLEAUX'19

  44. arXiv:1906.02919  [pdf, other]

    cs.RO cs.CV

    EVDodgeNet: Deep Dynamic Obstacle Dodging with Event Cameras

    Authors: Nitin J. Sanket, Chethan M. Parameshwara, Chahat Deep Singh, Ashwin V. Kuruttukulam, Cornelia Fermüller, Davide Scaramuzza, Yiannis Aloimonos

    Abstract: Dynamic obstacle avoidance on quadrotors requires low latency. A class of sensors that are particularly suitable for such scenarios are event cameras. In this paper, we present a deep learning-based solution for dodging multiple dynamic obstacles on a quadrotor with a single event camera and on-board computation. Our approach uses a series of shallow neural networks for estimating both the ego-…

    Submitted 1 March, 2020; v1 submitted 7 June, 2019; originally announced June 2019.

    Comments: 15 pages, 16 figures, Code and Video can be found at: https://prg.cs.umd.edu/EVDodgeNet

    Journal ref: IEEE International Conference on Robotics and Automation (ICRA), 2020

  45. arXiv:1905.11926  [pdf, other]

    cs.LG cs.CV cs.NE stat.ML

    Network Deconvolution

    Authors: Chengxi Ye, Matthew Evanusa, Hua He, Anton Mitrokhin, Tom Goldstein, James A. Yorke, Cornelia Fermüller, Yiannis Aloimonos

    Abstract: Convolution is a central operation in Convolutional Neural Networks (CNNs), which applies a kernel to overlapping regions shifted across the image. However, because of the strong correlations in real-world image data, convolutional kernels are in effect re-learning redundant data. In this work, we show that this redundancy has made neural network training challenging, and propose network deconvolu…

    Submitted 25 February, 2020; v1 submitted 28 May, 2019; originally announced May 2019.

    Comments: ICLR 2020

  46. arXiv:1903.08248  [pdf, other]

    cs.RO

    Computational Tactile Flow for Anthropomorphic Grippers

    Authors: Kanishka Ganguly, Behzad Sadrfaridpour, Cornelia Fermüller, Yiannis Aloimonos

    Abstract: Grasping objects requires tight integration between visual and tactile feedback. However, there is an inherent difference in the scale at which both these input modalities operate. It is thus necessary to be able to analyze tactile feedback in isolation in order to gain information about the surface the end-effector is operating on, such that more fine-grained features may be extracted from the su…

    Submitted 19 March, 2019; originally announced March 2019.

    Comments: 8 pages, 18 figures. Submitted to 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2019)

  47. arXiv:1903.07520  [pdf, other]

    cs.CV

    EV-IMO: Motion Segmentation Dataset and Learning Pipeline for Event Cameras

    Authors: Anton Mitrokhin, Chengxi Ye, Cornelia Fermuller, Yiannis Aloimonos, Tobi Delbruck

    Abstract: We present the first event-based learning approach for motion segmentation in indoor scenes and the first event-based dataset - EV-IMO - which includes accurate pixel-wise motion masks, egomotion and ground truth depth. Our approach is based on an efficient implementation of the SfM learning pipeline using a low parameter neural network architecture on event data. In addition to camera egomotion a…

    Submitted 12 January, 2020; v1 submitted 18 March, 2019; originally announced March 2019.

    Comments: 8 pages, 6 figures. Submitted to 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2019)

  48. Topology-Aware Non-Rigid Point Cloud Registration

    Authors: Konstantinos Zampogiannis, Cornelia Fermuller, Yiannis Aloimonos

    Abstract: In this paper, we introduce a non-rigid registration pipeline for pairs of unorganized point clouds that may be topologically different. Standard warp field estimation algorithms, even under robust, discontinuity-preserving regularization, tend to produce erratic motion estimates on boundaries associated with 'close-to-open' topology changes. We overcome this limitation by exploiting backward moti…

    Submitted 3 November, 2019; v1 submitted 16 November, 2018; originally announced November 2018.

  49. arXiv:1809.08625  [pdf, other]

    cs.CV cs.LG cs.RO

    Unsupervised Learning of Dense Optical Flow, Depth and Egomotion from Sparse Event Data

    Authors: Chengxi Ye, Anton Mitrokhin, Cornelia Fermüller, James A. Yorke, Yiannis Aloimonos

    Abstract: In this work we present a lightweight, unsupervised learning pipeline for dense depth, optical flow and egomotion estimation from sparse event output of the Dynamic Vision Sensor (DVS). To tackle this low-level vision task, we use a novel encoder-decoder neural network architecture - ECN. Our work is the first monocular pipeline that generates dense depth and optical flow from sparse ev…

    Submitted 25 February, 2019; v1 submitted 23 September, 2018; originally announced September 2018.

  50. arXiv:1807.04870  [pdf, other]

    cs.CV cs.RO

    Extracting Contact and Motion from Manipulation Videos

    Authors: Konstantinos Zampogiannis, Kanishka Ganguly, Cornelia Fermuller, Yiannis Aloimonos

    Abstract: When we physically interact with our environment using our hands, we touch objects and force them to move: contact and motion are defining properties of manipulation. In this paper, we present an active, bottom-up method for the detection of actor-object contacts and the extraction of moved objects and their motions in RGBD videos of manipulation actions. At the core of our approach lies non-rigid…

    Submitted 2 February, 2019; v1 submitted 12 July, 2018; originally announced July 2018.