Zum Hauptinhalt springen

Showing 1–13 of 13 results for author: Kolve, E

Searching in archive cs. Search in all archives.
.
  1. arXiv:2206.06994  [pdf, other

    cs.AI cs.CV cs.RO

    ProcTHOR: Large-Scale Embodied AI Using Procedural Generation

    Authors: Matt Deitke, Eli VanderBilt, Alvaro Herrasti, Luca Weihs, Jordi Salvador, Kiana Ehsani, Winson Han, Eric Kolve, Ali Farhadi, Aniruddha Kembhavi, Roozbeh Mottaghi

    Abstract: Massive datasets and high-capacity models have driven many recent advancements in computer vision and natural language understanding. This work presents a platform to enable similar success stories in Embodied AI. We propose ProcTHOR, a framework for procedural generation of Embodied AI environments. ProcTHOR enables us to sample arbitrarily large datasets of diverse, interactive, customizable, an… ▽ More

    Submitted 14 June, 2022; originally announced June 2022.

    Comments: ProcTHOR website: https://procthor.allenai.org

  2. arXiv:2202.02317  [pdf, other

    cs.CV cs.CL

    Webly Supervised Concept Expansion for General Purpose Vision Models

    Authors: Amita Kamath, Christopher Clark, Tanmay Gupta, Eric Kolve, Derek Hoiem, Aniruddha Kembhavi

    Abstract: General Purpose Vision (GPV) systems are models that are designed to solve a wide array of visual tasks without requiring architectural changes. Today, GPVs primarily learn both skills and concepts from large fully supervised datasets. Scaling GPVs to tens of thousands of concepts by acquiring data to learn each concept for every skill quickly becomes prohibitive. This work presents an effective a… ▽ More

    Submitted 20 July, 2022; v1 submitted 4 February, 2022; originally announced February 2022.

    Comments: ECCV 2022

  3. arXiv:2112.00800  [pdf, other

    cs.CL cs.AI

    Iconary: A Pictionary-Based Game for Testing Multimodal Communication with Drawings and Text

    Authors: Christopher Clark, Jordi Salvador, Dustin Schwenk, Derrick Bonafilia, Mark Yatskar, Eric Kolve, Alvaro Herrasti, Jonghyun Choi, Sachin Mehta, Sam Skjonsberg, Carissa Schoenick, Aaron Sarnat, Hannaneh Hajishirzi, Aniruddha Kembhavi, Oren Etzioni, Ali Farhadi

    Abstract: Communicating with humans is challenging for AIs because it requires a shared understanding of the world, complex semantics (e.g., metaphors or analogies), and at times multi-modal gestures (e.g., pointing with a finger, or an arrow in a diagram). We investigate these challenges in the context of Iconary, a collaborative game of drawing and guessing based on Pictionary, that poses a novel challeng… ▽ More

    Submitted 1 December, 2021; originally announced December 2021.

    Comments: In EMNLP 2021

  4. arXiv:2110.10067  [pdf, other

    cs.LG cs.AI

    CORA: Benchmarks, Baselines, and Metrics as a Platform for Continual Reinforcement Learning Agents

    Authors: Sam Powers, Eliot Xing, Eric Kolve, Roozbeh Mottaghi, Abhinav Gupta

    Abstract: Progress in continual reinforcement learning has been limited due to several barriers to entry: missing code, high compute requirements, and a lack of suitable benchmarks. In this work, we present CORA, a platform for Continual Reinforcement Learning Agents that provides benchmarks, baselines, and metrics in a single code package. The benchmarks we provide are designed to evaluate different aspect… ▽ More

    Submitted 31 December, 2022; v1 submitted 19 October, 2021; originally announced October 2021.

    Comments: Repository available at https://github.com/AGI-Labs/continual_rl

    Journal ref: Proceedings of The 1st Conference on Lifelong Learning Agents, PMLR 199:705-743, 2022

  5. arXiv:2104.11213  [pdf, other

    cs.CV cs.AI cs.LG cs.RO

    ManipulaTHOR: A Framework for Visual Object Manipulation

    Authors: Kiana Ehsani, Winson Han, Alvaro Herrasti, Eli VanderBilt, Luca Weihs, Eric Kolve, Aniruddha Kembhavi, Roozbeh Mottaghi

    Abstract: The domain of Embodied AI has recently witnessed substantial progress, particularly in navigating agents within their environments. These early successes have laid the building blocks for the community to tackle tasks that require agents to actively interact with objects in their environment. Object manipulation is an established research domain within the robotics community and poses several chal… ▽ More

    Submitted 22 April, 2021; originally announced April 2021.

    Comments: CVPR 2021 -- (Oral presentation)

  6. arXiv:2007.04979  [pdf, other

    cs.CV cs.AI cs.LG cs.MA

    A Cordial Sync: Going Beyond Marginal Policies for Multi-Agent Embodied Tasks

    Authors: Unnat Jain, Luca Weihs, Eric Kolve, Ali Farhadi, Svetlana Lazebnik, Aniruddha Kembhavi, Alexander Schwing

    Abstract: Autonomous agents must learn to collaborate. It is not scalable to develop a new centralized agent every time a task's difficulty outpaces a single agent's abilities. While multi-agent collaboration research has flourished in gridworld-like environments, relatively little work has considered visually rich domains. Addressing this, we introduce the novel task FurnMove in which agents work together… ▽ More

    Submitted 9 July, 2020; originally announced July 2020.

    Comments: Accepted to ECCV 2020 (spotlight); Project page: https://unnat.github.io/cordial-sync

  7. arXiv:2004.06799  [pdf, other

    cs.CV cs.RO

    RoboTHOR: An Open Simulation-to-Real Embodied AI Platform

    Authors: Matt Deitke, Winson Han, Alvaro Herrasti, Aniruddha Kembhavi, Eric Kolve, Roozbeh Mottaghi, Jordi Salvador, Dustin Schwenk, Eli VanderBilt, Matthew Wallingford, Luca Weihs, Mark Yatskar, Ali Farhadi

    Abstract: Visual recognition ecosystems (e.g. ImageNet, Pascal, COCO) have undeniably played a prevailing role in the evolution of modern computer vision. We argue that interactive and embodied visual AI has reached a stage of development similar to visual recognition prior to the advent of these ecosystems. Recently, various synthetic environments have been introduced to facilitate research in embodied AI.… ▽ More

    Submitted 14 April, 2020; originally announced April 2020.

    Comments: CVPR 2020

  8. arXiv:1912.08195  [pdf, other

    cs.CV cs.AI cs.LG

    Learning Generalizable Visual Representations via Interactive Gameplay

    Authors: Luca Weihs, Aniruddha Kembhavi, Kiana Ehsani, Sarah M Pratt, Winson Han, Alvaro Herrasti, Eric Kolve, Dustin Schwenk, Roozbeh Mottaghi, Ali Farhadi

    Abstract: A growing body of research suggests that embodied gameplay, prevalent not just in human cultures but across a variety of animal species including turtles and ravens, is critical in developing the neural flexibility for creative problem solving, decision making, and socialization. Comparatively little is known regarding the impact of embodied gameplay upon artificial agents. While recent work has p… ▽ More

    Submitted 25 February, 2021; v1 submitted 17 December, 2019; originally announced December 2019.

    Comments: Replaced with version accepted to ICLR'21

  9. arXiv:1904.05879  [pdf, other

    cs.CV cs.AI cs.MA

    Two Body Problem: Collaborative Visual Task Completion

    Authors: Unnat Jain, Luca Weihs, Eric Kolve, Mohammad Rastegari, Svetlana Lazebnik, Ali Farhadi, Alexander Schwing, Aniruddha Kembhavi

    Abstract: Collaboration is a necessary skill to perform tasks that are beyond one agent's capabilities. Addressed extensively in both conventional and modern AI, multi-agent collaboration has often been studied in the context of simple grid worlds. We argue that there are inherently visual aspects to collaboration which should be studied in visually rich environments. A key element in collaboration is commu… ▽ More

    Submitted 11 April, 2019; originally announced April 2019.

    Comments: Accepted to CVPR 2019

  10. arXiv:1712.05474  [pdf, other

    cs.CV cs.AI cs.LG

    AI2-THOR: An Interactive 3D Environment for Visual AI

    Authors: Eric Kolve, Roozbeh Mottaghi, Winson Han, Eli VanderBilt, Luca Weihs, Alvaro Herrasti, Matt Deitke, Kiana Ehsani, Daniel Gordon, Yuke Zhu, Aniruddha Kembhavi, Abhinav Gupta, Ali Farhadi

    Abstract: We introduce The House Of inteRactions (THOR), a framework for visual AI research, available at http://ai2thor.allenai.org. AI2-THOR consists of near photo-realistic 3D indoor scenes, where AI agents can navigate in the scenes and interact with objects to perform tasks. AI2-THOR enables research in many different domains including but not limited to deep reinforcement learning, imitation learning,… ▽ More

    Submitted 26 August, 2022; v1 submitted 14 December, 2017; originally announced December 2017.

  11. arXiv:1705.08080  [pdf, other

    cs.CV cs.LG cs.RO

    Visual Semantic Planning using Deep Successor Representations

    Authors: Yuke Zhu, Daniel Gordon, Eric Kolve, Dieter Fox, Li Fei-Fei, Abhinav Gupta, Roozbeh Mottaghi, Ali Farhadi

    Abstract: A crucial capability of real-world intelligent agents is their ability to plan a sequence of actions to achieve their goals in the visual world. In this work, we address the problem of visual semantic planning: the task of predicting a sequence of actions from visual observations that transform a dynamic environment from an initial state to a goal state. Doing so entails knowledge about objects an… ▽ More

    Submitted 15 August, 2017; v1 submitted 23 May, 2017; originally announced May 2017.

    Comments: ICCV 2017 camera ready

  12. arXiv:1609.05143  [pdf, other

    cs.CV

    Target-driven Visual Navigation in Indoor Scenes using Deep Reinforcement Learning

    Authors: Yuke Zhu, Roozbeh Mottaghi, Eric Kolve, Joseph J. Lim, Abhinav Gupta, Li Fei-Fei, Ali Farhadi

    Abstract: Two less addressed issues of deep reinforcement learning are (1) lack of generalization capability to new target goals, and (2) data inefficiency i.e., the model requires several (and often costly) episodes of trial and error to converge, which makes it impractical to be applied to real-world scenarios. In this paper, we address these two issues and apply our model to the task of target-driven vis… ▽ More

    Submitted 16 September, 2016; originally announced September 2016.

  13. arXiv:1603.07396  [pdf, other

    cs.CV cs.AI

    A Diagram Is Worth A Dozen Images

    Authors: Aniruddha Kembhavi, Mike Salvato, Eric Kolve, Minjoon Seo, Hannaneh Hajishirzi, Ali Farhadi

    Abstract: Diagrams are common tools for representing complex concepts, relationships and events, often when it would be difficult to portray the same information with natural images. Understanding natural images has been extensively studied in computer vision, while diagram understanding has received little attention. In this paper, we study the problem of diagram interpretation and reasoning, the challengi… ▽ More

    Submitted 23 March, 2016; originally announced March 2016.