Showing 1–11 of 11 results for author: Iuzzolino, M

  1. arXiv:2307.12854

    cs.CV

    Multiscale Video Pretraining for Long-Term Activity Forecasting

    Authors: Reuben Tan, Matthias De Lange, Michael Iuzzolino, Bryan A. Plummer, Kate Saenko, Karl Ridgeway, Lorenzo Torresani

    Abstract: Long-term activity forecasting is an especially challenging research problem because it requires understanding the temporal relationships between observed actions, as well as the variability and complexity of human activities. Despite relying on strong supervision via expensive human annotations, state-of-the-art forecasting approaches often generalize poorly to unseen data. To alleviate this issu…

    Submitted 24 July, 2023; originally announced July 2023.

  2. arXiv:2307.05784

    cs.CV cs.AI

    EgoAdapt: A multi-stream evaluation study of adaptation to real-world egocentric user video

    Authors: Matthias De Lange, Hamid Eghbalzadeh, Reuben Tan, Michael Iuzzolino, Franziska Meier, Karl Ridgeway

    Abstract: In egocentric action recognition a single population model is typically trained and subsequently embodied on a head-mounted device, such as an augmented reality headset. While this model remains static for new users and environments, we introduce an adaptive paradigm of two phases, where after pretraining a population model, the model adapts on-device and online to the user's experience. This sett…

    Submitted 11 July, 2023; originally announced July 2023.

    Comments: Preprint

  3. arXiv:2304.09179

    cs.CV cs.AI

    Pretrained Language Models as Visual Planners for Human Assistance

    Authors: Dhruvesh Patel, Hamid Eghbalzadeh, Nitin Kamra, Michael Louis Iuzzolino, Unnat Jain, Ruta Desai

    Abstract: In our pursuit of advancing multi-modal AI assistants capable of guiding users to achieve complex multi-step goals, we propose the task of "Visual Planning for Assistance (VPA)". Given a succinct natural language goal, e.g., "make a shelf", and a video of the user's progress so far, the aim of VPA is to devise a plan, i.e., a sequence of actions such as "sand shelf", "paint shelf", etc. to realize…

    Submitted 26 August, 2023; v1 submitted 17 April, 2023; originally announced April 2023.

    Comments: Accepted at ICCV 2023

  4. arXiv:2302.05330

    cs.CV cs.AI cs.LG

    Action Dynamics Task Graphs for Learning Plannable Representations of Procedural Tasks

    Authors: Weichao Mao, Ruta Desai, Michael Louis Iuzzolino, Nitin Kamra

    Abstract: Given video demonstrations and paired narrations of an at-home procedural task such as changing a tire, we present an approach to extract the underlying task structure -- relevant actions and their temporal dependencies -- via action-centric task graphs. Learnt structured representations from our method, Action Dynamics Task Graphs (ADTG), can then be used for understanding such tasks in unseen vi…

    Submitted 11 January, 2023; originally announced February 2023.

    Comments: AAAI 2023 Workshop on User-Centric Artificial Intelligence for Assistance in At-Home Tasks

  5. arXiv:2109.05675

    cs.CV cs.LG stat.ML

    Online Unsupervised Learning of Visual Representations and Categories

    Authors: Mengye Ren, Tyler R. Scott, Michael L. Iuzzolino, Michael C. Mozer, Richard Zemel

    Abstract: Real world learning scenarios involve a nonstationary distribution of classes with sequential dependencies among the samples, in contrast to the standard machine learning formulation of drawing samples independently from a fixed, typically uniform distribution. Furthermore, real world interactions demand learning on-the-fly from few or no class labels. In this work, we propose an unsupervised mode…

    Submitted 28 May, 2022; v1 submitted 12 September, 2021; originally announced September 2021.

    Comments: Technical report, 32 pages

  6. arXiv:2102.09808

    cs.LG cs.CV

    Improving Anytime Prediction with Parallel Cascaded Networks and a Temporal-Difference Loss

    Authors: Michael L. Iuzzolino, Michael C. Mozer, Samy Bengio

    Abstract: Although deep feedforward neural networks share some characteristics with the primate visual system, a key distinction is their dynamics. Deep nets typically operate in serial stages wherein each layer completes its computation before processing begins in subsequent layers. In contrast, biological systems have cascaded dynamics: information propagates from neurons at all layers in parallel but tra…

    Submitted 2 November, 2021; v1 submitted 19 February, 2021; originally announced February 2021.

  7. arXiv:2007.04546

    cs.LG cs.CV stat.ML

    Wandering Within a World: Online Contextualized Few-Shot Learning

    Authors: Mengye Ren, Michael L. Iuzzolino, Michael C. Mozer, Richard S. Zemel

    Abstract: We aim to bridge the gap between typical human and machine-learning environments by extending the standard framework of few-shot learning to an online, continual setting. In this setting, episodes do not have separate training and testing phases, and instead models are evaluated online while learning novel classes. As in the real world, where the presence of spatiotemporal context helps us retriev…

    Submitted 22 April, 2021; v1 submitted 9 July, 2020; originally announced July 2020.

    Comments: ICLR 2021

  8. arXiv:2004.00762

    cs.LG cs.HC stat.ML

    In Automation We Trust: Investigating the Role of Uncertainty in Active Learning Systems

    Authors: Michael L. Iuzzolino, Tetsumichi Umada, Nisar R. Ahmed, Danielle A. Szafir

    Abstract: We investigate how different active learning (AL) query policies coupled with classification uncertainty visualizations affect analyst trust in automated classification systems. A current standard policy for AL is to query the oracle (e.g., the analyst) to refine labels for datapoints where the classifier has the highest uncertainty. This is an optimal policy for the automation system as it yields…

    Submitted 1 April, 2020; originally announced April 2020.

  9. arXiv:1911.08670

    cs.CV cs.LG

    MMTM: Multimodal Transfer Module for CNN Fusion

    Authors: Hamid Reza Vaezi Joze, Amirreza Shaban, Michael L. Iuzzolino, Kazuhito Koishida

    Abstract: In late fusion, each modality is processed in a separate unimodal Convolutional Neural Network (CNN) stream and the scores of each modality are fused at the end. Due to its simplicity late fusion is still the predominant approach in many state-of-the-art multimodal applications. In this paper, we present a simple neural network module for leveraging the knowledge from multiple modalities in convol…

    Submitted 30 March, 2020; v1 submitted 19 November, 2019; originally announced November 2019.

    Journal ref: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020

  10. arXiv:1906.03504

    cs.LG cs.NE stat.ML

    Convolutional Bipartite Attractor Networks

    Authors: Michael Iuzzolino, Yoram Singer, Michael C. Mozer

    Abstract: In human perception and cognition, a fundamental operation that brains perform is interpretation: constructing coherent neural states from noisy, incomplete, and intrinsically ambiguous evidence. The problem of interpretation is well matched to an early and often overlooked architecture, the attractor network---a recurrent neural net that performs constraint satisfaction, imputation of missing fea…

    Submitted 26 September, 2019; v1 submitted 8 June, 2019; originally announced June 2019.

  11. arXiv:1901.05599

    cs.LG cs.CV cs.RO stat.ML

    Virtual-to-Real-World Transfer Learning for Robots on Wilderness Trails

    Authors: Michael L. Iuzzolino, Michael E. Walker, Daniel Szafir

    Abstract: Robots hold promise in many scenarios involving outdoor use, such as search-and-rescue, wildlife management, and collecting data to improve environment, climate, and weather forecasting. However, autonomous navigation of outdoor trails remains a challenging problem. Recent work has sought to address this issue using deep learning. Although this approach has achieved state-of-the-art results, the d…

    Submitted 16 January, 2019; originally announced January 2019.

    Comments: IROS 2018

    Journal ref: 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (pp. 576-582)