Showing 1–25 of 25 results for author: Latapie, H

  1. arXiv:2407.13937  [pdf, other]

    cs.CV

    Boosting Online 3D Multi-Object Tracking through Camera-Radar Cross Check

    Authors: Sheng-Yao Kuan, Jen-Hao Cheng, Hsiang-Wei Huang, Wenhao Chai, Cheng-Yen Yang, Hugo Latapie, Gaowen Liu, Bing-Fei Wu, Jenq-Neng Hwang

    Abstract: In the domain of autonomous driving, the integration of multi-modal perception techniques based on data from diverse sensors has demonstrated substantial progress. Effectively surpassing the capabilities of state-of-the-art single-modality detectors through sensor fusion remains an active challenge. This work leverages the respective advantages of cameras in perspective view and radars in Bird's E…

    Submitted 18 July, 2024; originally announced July 2024.

    Comments: 2024 IEEE Intelligent Vehicles Symposium (IV)

  2. arXiv:2407.06167  [pdf, other]

    cs.CV cs.LG

    DεpS: Delayed ε-Shrinking for Faster Once-For-All Training

    Authors: Aditya Annavajjala, Alind Khare, Animesh Agrawal, Igor Fedorov, Hugo Latapie, Myungjin Lee, Alexey Tumanov

    Abstract: CNNs are increasingly deployed across different hardware, dynamic environments, and low-power embedded devices. This has led to the design and training of CNN architectures with the goal of maximizing accuracy subject to such variable deployment constraints. As the number of deployment scenarios grows, there is a need to find scalable solutions to design and train specialized CNNs. Once-for-all tr…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: Accepted to the 18th European Conference on Computer Vision (ECCV 2024)

  3. arXiv:2403.08108  [pdf, other]

    cs.CV

    TaskCLIP: Extend Large Vision-Language Model for Task Oriented Object Detection

    Authors: Hanning Chen, Wenjun Huang, Yang Ni, Sanggeon Yun, Yezi Liu, Fei Wen, Alvaro Velasquez, Hugo Latapie, Mohsen Imani

    Abstract: Task-oriented object detection aims to find objects suitable for accomplishing specific tasks. As a challenging task, it requires simultaneous visual data processing and reasoning under ambiguous semantics. Recent solutions are mainly all-in-one models. However, the object detection backbones are pre-trained without text supervision. Thus, to incorporate task requirements, their intricate models u…

    Submitted 6 September, 2024; v1 submitted 12 March, 2024; originally announced March 2024.

  4. arXiv:2403.05763  [pdf, other]

    cs.AR cs.AI cs.LG

    HDReason: Algorithm-Hardware Codesign for Hyperdimensional Knowledge Graph Reasoning

    Authors: Hanning Chen, Yang Ni, Ali Zakeri, Zhuowen Zou, Sanggeon Yun, Fei Wen, Behnam Khaleghi, Narayan Srinivasa, Hugo Latapie, Mohsen Imani

    Abstract: In recent times, a plethora of hardware accelerators have been put forth for graph learning applications such as vertex classification and graph classification. However, previous works have paid little attention to Knowledge Graph Completion (KGC), a task that is well-known for its significantly higher algorithm complexity. The state-of-the-art KGC solutions based on graph convolution neural netwo…

    Submitted 8 March, 2024; originally announced March 2024.

  5. arXiv:2402.14672  [pdf, other]

    cs.CL cs.AI

    Middleware for LLMs: Tools Are Instrumental for Language Agents in Complex Environments

    Authors: Yu Gu, Yiheng Shu, Hao Yu, Xiao Liu, Yuxiao Dong, Jie Tang, Jayanth Srinivasa, Hugo Latapie, Yu Su

    Abstract: The applications of large language models (LLMs) have expanded well beyond the confines of text processing, signaling a new era where LLMs are envisioned as generalist language agents capable of operating within complex real-world environments. These environments are often highly expansive, making it impossible for the LLM to process them within its short-term memory. Motivated by recent research…

    Submitted 22 February, 2024; originally announced February 2024.

    Comments: 16 pages, 8 figures, 4 tables

    ACM Class: I.2.7

  6. arXiv:2311.01623  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    VQPy: An Object-Oriented Approach to Modern Video Analytics

    Authors: Shan Yu, Zhenting Zhu, Yu Chen, Hanchen Xu, Pengzhan Zhao, Yang Wang, Arthi Padmanabhan, Hugo Latapie, Harry Xu

    Abstract: Video analytics is widely used in contemporary systems and services. At the forefront of video analytics are video queries that users develop to find objects of particular interest. Building upon the insight that video objects (e.g., human, animals, cars, etc.), the center of video analytics, are similar in spirit to objects modeled by traditional object-oriented languages, we propose to develop a…

    Submitted 3 June, 2024; v1 submitted 3 November, 2023; originally announced November 2023.

    Comments: MLSys'24

  7. arXiv:2311.01423  [pdf, other]

    cs.CV

    CenterRadarNet: Joint 3D Object Detection and Tracking Framework using 4D FMCW Radar

    Authors: Jen-Hao Cheng, Sheng-Yao Kuan, Hugo Latapie, Gaowen Liu, Jenq-Neng Hwang

    Abstract: Robust perception is a vital component for ensuring safe autonomous and assisted driving. Automotive radar (77 to 81 GHz), which offers weather-resilient sensing, provides a complementary capability to the vision- or LiDAR-based autonomous driving systems. Raw radio-frequency (RF) radar tensors contain rich spatiotemporal semantics besides 3D location information. The majority of previous methods…

    Submitted 4 November, 2023; v1 submitted 2 November, 2023; originally announced November 2023.

  8. arXiv:2307.10577  [pdf, other]

    cs.CV cs.AI

    Ethosight: A Reasoning-Guided Iterative Learning System for Nuanced Perception based on Joint-Embedding & Contextual Label Affinity

    Authors: Hugo Latapie, Shan Yu, Patrick Hammer, Kristinn R. Thorisson, Vahagn Petrosyan, Brandon Kynoch, Alind Khare, Payman Behnam, Alexey Tumanov, Aksheit Saxena, Anish Aralikatti, Hanning Chen, Mohsen Imani, Mike Archbold, Tangrui Li, Pei Wang, Justin Hart

    Abstract: Traditional computer vision models often necessitate extensive data acquisition, annotation, and validation. These models frequently struggle in real-world applications, resulting in high false positive and negative rates, and exhibit poor adaptability to new scenarios, often requiring costly retraining. To address these issues, we present Ethosight, a flexible and adaptable zero-shot video analyt…

    Submitted 20 August, 2023; v1 submitted 20 July, 2023; originally announced July 2023.

  9. arXiv:2307.02738  [pdf, other]

    cs.AI cs.CL cs.SC

    RecallM: An Adaptable Memory Mechanism with Temporal Understanding for Large Language Models

    Authors: Brandon Kynoch, Hugo Latapie, Dwane van der Sluis

    Abstract: Large Language Models (LLMs) have made extraordinary progress in the field of Artificial Intelligence and have demonstrated remarkable capabilities across a large variety of tasks and domains. However, as we venture closer to creating Artificial General Intelligence (AGI) systems, we recognize the need to supplement LLMs with long-term memory to overcome the context window limitation and more impo…

    Submitted 2 October, 2023; v1 submitted 5 July, 2023; originally announced July 2023.

    Comments: 8 pages, 7 figures, 1 table, Our code is publicly available online at: https://github.com/cisco-open/DeepVision/tree/main/recallm

  10. arXiv:2301.10879  [pdf, other]

    cs.LG cs.DC

    SuperFedNAS: Cost-Efficient Federated Neural Architecture Search for On-Device Inference

    Authors: Alind Khare, Animesh Agrawal, Aditya Annavajjala, Payman Behnam, Myungjin Lee, Hugo Latapie, Alexey Tumanov

    Abstract: Neural Architecture Search (NAS) for Federated Learning (FL) is an emerging field. It automates the design and training of Deep Neural Networks (DNNs) when data cannot be centralized due to privacy, communication costs, or regulatory restrictions. Recent federated NAS methods not only reduce manual effort but also help achieve higher accuracy than traditional FL methods like FedAvg. Despite the su…

    Submitted 11 July, 2024; v1 submitted 25 January, 2023; originally announced January 2023.

    Comments: Accepted at ECCV 2024

  11. arXiv:2301.07099  [pdf, other]

    cs.LG cs.AI

    Adaptive Deep Neural Network Inference Optimization with EENet

    Authors: Fatih Ilhan, Ka-Ho Chow, Sihao Hu, Tiansheng Huang, Selim Tekin, Wenqi Wei, Yanzhao Wu, Myungjin Lee, Ramana Kompella, Hugo Latapie, Gaowen Liu, Ling Liu

    Abstract: Well-trained deep neural networks (DNNs) treat all test samples equally during prediction. Adaptive DNN inference with early exiting leverages the observation that some test examples can be easier to predict than others. This paper presents EENet, a novel early-exiting scheduling framework for multi-exit DNN models. Instead of having every sample go through all DNN layers during prediction, EENet…

    Submitted 1 December, 2023; v1 submitted 14 January, 2023; originally announced January 2023.

  12. A Retrieve-and-Read Framework for Knowledge Graph Link Prediction

    Authors: Vardaan Pahuja, Boshi Wang, Hugo Latapie, Jayanth Srinivasa, Yu Su

    Abstract: Knowledge graph (KG) link prediction aims to infer new facts based on existing facts in the KG. Recent studies have shown that using the graph neighborhood of a node via graph neural networks (GNNs) provides more useful information compared to just using the query information. Conventional GNNs for KG link prediction follow the standard message-passing paradigm on the entire KG, which leads to sup…

    Submitted 22 October, 2023; v1 submitted 19 December, 2022; originally announced December 2022.

    Comments: Accepted to CIKM'23; Published version DOI: https://doi.org/10.1145/3583780.3614769; 12 pages, 4 figures

    Journal ref: CIKM (2023) 1992-2002

  13. arXiv:2208.03620  [pdf, other]

    cs.CV

    Learning Omnidirectional Flow in 360-degree Video via Siamese Representation

    Authors: Keshav Bhandari, Bin Duan, Gaowen Liu, Hugo Latapie, Ziliang Zong, Yan Yan

    Abstract: Optical flow estimation in omnidirectional videos faces two significant issues: the lack of benchmark datasets and the challenge of adapting perspective video-based methods to accommodate the omnidirectional nature. This paper proposes the first perceptually natural-synthetic omnidirectional benchmark dataset with a 360-degree field of view, FLOW360, with 40 different videos and 4,000 video frames…

    Submitted 6 August, 2022; originally announced August 2022.

    Comments: Accepted to ECCV22

  14. arXiv:2112.01603  [pdf]

    cs.AI cs.LG

    Neurosymbolic Systems of Perception & Cognition: The Role of Attention

    Authors: Hugo Latapie, Ozkan Kilic, Kristinn R. Thorisson, Pei Wang, Patrick Hammer

    Abstract: A cognitive architecture aimed at cumulative learning must provide the necessary information and control structures to allow agents to learn incrementally and autonomously from their experience. This involves managing an agent's goals as well as continuously relating sensory information to these in its perception-cognition information stack. The more varied the environment of a learning agent is,…

    Submitted 2 December, 2021; originally announced December 2021.

  15. arXiv:2107.03120  [pdf, other]

    cs.CV cs.MM

    Cross-View Exocentric to Egocentric Video Synthesis

    Authors: Gaowen Liu, Hao Tang, Hugo Latapie, Jason Corso, Yan Yan

    Abstract: Cross-view video synthesis task seeks to generate video sequences of one view from another dramatically different view. In this paper, we investigate the exocentric (third-person) view to egocentric (first-person) view video generation task. This is challenging because egocentric view sometimes is remarkably different from the exocentric view. Thus, transforming the appearances across the two diff…

    Submitted 7 July, 2021; originally announced July 2021.

    Comments: ACM MM 2021

  16. arXiv:2102.06112  [pdf, other]

    cs.AI

    A Metamodel and Framework for Artificial General Intelligence From Theory to Practice

    Authors: Hugo Latapie, Ozkan Kilic, Gaowen Liu, Yan Yan, Ramana Kompella, Pei Wang, Kristinn R. Thorisson, Adam Lawrence, Yuhong Sun, Jayanth Srinivasa

    Abstract: This paper introduces a new metamodel-based knowledge representation that significantly improves autonomous learning and adaptation. While interest in hybrid machine learning / symbolic AI systems leveraging, for example, reasoning and knowledge graphs, is gaining popularity, we find there remains a need for both a clear definition of knowledge and a metamodel to guide the creation and manipulatio…

    Submitted 11 February, 2021; originally announced February 2021.

    Comments: arXiv admin note: text overlap with arXiv:2008.12879

  17. arXiv:2102.03424  [pdf, other]

    cs.CV cs.SD eess.AS eess.IV

    Learning Audio-Visual Correlations from Variational Cross-Modal Generation

    Authors: Ye Zhu, Yu Wu, Hugo Latapie, Yi Yang, Yan Yan

    Abstract: People can easily imagine the potential sound while seeing an event. This natural synchronization between audio and visual signals reveals their intrinsic correlations. To this end, we propose to learn the audio-visual correlations from the perspective of cross-modal generation in a self-supervised manner, the learned correlations can be then readily applied in multiple downstream tasks such as th…

    Submitted 14 February, 2021; v1 submitted 5 February, 2021; originally announced February 2021.

    Comments: Accepted to ICASSP 2021

  18. arXiv:2010.08055  [pdf, other]

    cs.CV

    Egok360: A 360 Egocentric Kinetic Human Activity Video Dataset

    Authors: Keshav Bhandari, Mario A. DeLaGarza, Ziliang Zong, Hugo Latapie, Yan Yan

    Abstract: Recently, there has been a growing interest in wearable sensors which provides new research perspectives for 360° video analysis. However, the lack of 360° datasets in literature hinders the research in this field. To bridge this gap, in this paper we propose a novel Egocentric (first-person) 360° Kinetic human activity video dataset (EgoK360). The EgoK360 dataset contains annotations of human a…

    Submitted 15 October, 2020; originally announced October 2020.

    Comments: 5 pages, 5 figures, 1 table, 2020 IEEE International Conference on Image Processing (ICIP)

  19. arXiv:2008.12879  [pdf, ps, other]

    cs.AI

    A Metamodel and Framework for AGI

    Authors: Hugo Latapie, Ozkan Kilic

    Abstract: Can artificial intelligence systems exhibit superhuman performance, but in critical ways, lack the intelligence of even a single-celled organism? The answer is clearly 'yes' for narrow AI systems. Animals, plants, and even single-celled organisms learn to reliably avoid danger and move towards food. This is accomplished via a physical knowledge preserving metamodel that autonomously generates usef…

    Submitted 6 September, 2020; v1 submitted 28 August, 2020; originally announced August 2020.

  20. arXiv:2002.03219  [pdf, other]

    cs.CV cs.LG eess.IV

    Exocentric to Egocentric Image Generation via Parallel Generative Adversarial Network

    Authors: Gaowen Liu, Hao Tang, Hugo Latapie, Yan Yan

    Abstract: Cross-view image generation has been recently proposed to generate images of one view from another dramatically different view. In this paper, we investigate exocentric (third-person) view to egocentric (first-person) view image generation. This is a challenging task since egocentric view sometimes is remarkably different from exocentric view. Thus, transforming the appearances across the two view…

    Submitted 8 February, 2020; originally announced February 2020.

    Comments: It has been accepted by ICASSP 2020

  21. arXiv:1907.01826  [pdf, other]

    cs.CV

    Cascade Attention Guided Residue Learning GAN for Cross-Modal Translation

    Authors: Bin Duan, Wei Wang, Hao Tang, Hugo Latapie, Yan Yan

    Abstract: Since we were babies, we intuitively develop the ability to correlate the input from different cognitive sensors such as vision, audio, and text. However, in machine learning, this cross-modal learning is a nontrivial task because different modalities have no homogeneous properties. Previous works discover that there should be bridges among different modalities. From neurology and psychology persp…

    Submitted 10 December, 2021; v1 submitted 3 July, 2019; originally announced July 2019.

    Comments: 9 pages, 6 figures, update template

  22. arXiv:1807.10591  [pdf, other]

    cs.CV cs.LG

    Metric Embedding Autoencoders for Unsupervised Cross-Dataset Transfer Learning

    Authors: Alexey Potapov, Sergey Rodionov, Hugo Latapie, Enzo Fenoglio

    Abstract: Cross-dataset transfer learning is an important problem in person re-identification (Re-ID). Unfortunately, not too many deep transfer Re-ID models exist for realistic settings of practical Re-ID systems. We propose a purely deep transfer Re-ID model consisting of a deep convolutional neural network and an autoencoder. The latent code is divided into metric embedding and nuisance variables. We the…

    Submitted 18 July, 2018; originally announced July 2018.

    Comments: ICANN 2018 (The 27th International Conference on Artificial Neural Networks) proceeding

  23. Improving Deep Models of Person Re-identification for Cross-Dataset Usage

    Authors: Sergey Rodionov, Alexey Potapov, Hugo Latapie, Enzo Fenoglio, Maxim Peterson

    Abstract: Person re-identification (Re-ID) is the task of matching humans across cameras with non-overlapping views that has important applications in visual surveillance. Like other computer vision tasks, this task has gained much with the utilization of deep learning methods. However, existing solutions based on deep learning are usually trained and tested on samples taken from same datasets, while in pra…

    Submitted 23 July, 2018; originally announced July 2018.

    Comments: AIAI 2018 (14th International Conference on Artificial Intelligence Applications and Innovations) proceeding. The final publication is available at link.springer.com

  24. arXiv:1806.06946  [pdf]

    cs.IR cs.AI cs.CV cs.LG

    Semantic Image Retrieval by Uniting Deep Neural Networks and Cognitive Architectures

    Authors: Alexey Potapov, Innokentii Zhdanov, Oleg Scherbakov, Nikolai Skorobogatko, Hugo Latapie, Enzo Fenoglio

    Abstract: Image and video retrieval by their semantic content has been an important and challenging task for years, because it ultimately requires bridging the symbolic/subsymbolic gap. Recent successes in deep learning enabled detection of objects belonging to many classes greatly outperforming traditional computer vision techniques. However, deep learning solutions capable of executing retrieval queries a…

    Submitted 14 June, 2018; originally announced June 2018.

  25. Knowledge-Defined Networking

    Authors: Albert Mestres, Alberto Rodriguez-Natal, Josep Carner, Pere Barlet-Ros, Eduard Alarcón, Marc Solé, Victor Muntés, David Meyer, Sharon Barkai, Mike J Hibbett, Giovani Estrada, Khaldun Ma`ruf, Florin Coras, Vina Ermagan, Hugo Latapie, Chris Cassar, John Evans, Fabio Maino, Jean Walrand, Albert Cabellos

    Abstract: The research community has considered in the past the application of Artificial Intelligence (AI) techniques to control and operate networks. A notable example is the Knowledge Plane proposed by D.Clark et al. However, such techniques have not been extensively prototyped or deployed in the field yet. In this paper, we explore the reasons for the lack of adoption and posit that the rise of two rece…

    Submitted 23 June, 2016; v1 submitted 20 June, 2016; originally announced June 2016.

    Comments: 8 pages, 22 references, 6 figures and 1 table

    Journal ref: ACM SIGCOMM Computer Communication Review, Volume 47, Issue 3, July 2017