Showing 1–4 of 4 results for author: Rawles, C

Searching in archive cs.
  1. arXiv:2406.03679

    cs.AI cs.LG

    On the Effects of Data Scale on Computer Control Agents

    Authors: Wei Li, William Bishop, Alice Li, Chris Rawles, Folawiyo Campbell-Ajala, Divya Tyamagundlu, Oriana Riva

    Abstract: Autonomous agents that control computer interfaces to accomplish human tasks are emerging. Leveraging LLMs to power such agents has been of special interest, but unless fine-tuned on human-collected task demonstrations, performance is still relatively low. In this work we study whether fine-tuning alone is a viable approach for building real-world computer control agents. In particular, we inves…

    Submitted 24 August, 2024; v1 submitted 5 June, 2024; originally announced June 2024.

  2. arXiv:2405.14573

    cs.AI cs.LG

    AndroidWorld: A Dynamic Benchmarking Environment for Autonomous Agents

    Authors: Christopher Rawles, Sarah Clinckemaillie, Yifan Chang, Jonathan Waltz, Gabrielle Lau, Marybeth Fair, Alice Li, William Bishop, Wei Li, Folawiyo Campbell-Ajala, Daniel Toyama, Robert Berry, Divya Tyamagundlu, Timothy Lillicrap, Oriana Riva

    Abstract: Autonomous agents that execute human tasks by controlling computers can enhance human productivity and application accessibility. However, progress in this field will be driven by realistic and reproducible benchmarks. We present AndroidWorld, a fully functional Android environment that provides reward signals for 116 programmatic tasks across 20 real-world Android apps. Unlike existing interactiv…

    Submitted 10 June, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

  3. arXiv:2405.11120

    cs.AI cs.LG

    Latent State Estimation Helps UI Agents to Reason

    Authors: William E Bishop, Alice Li, Christopher Rawles, Oriana Riva

    Abstract: A common problem for agents operating in real-world environments is that the response of an environment to their actions may be non-deterministic and observed through noise. This renders environmental state and progress towards completing a task latent. Despite recent impressive demonstrations of LLMs' reasoning abilities on various benchmarks, whether LLMs can build estimates of latent state and…

    Submitted 17 May, 2024; originally announced May 2024.

  4. arXiv:2307.10088

    cs.LG cs.CL cs.HC

    Android in the Wild: A Large-Scale Dataset for Android Device Control

    Authors: Christopher Rawles, Alice Li, Daniel Rodriguez, Oriana Riva, Timothy Lillicrap

    Abstract: There is a growing interest in device-control systems that can interpret human natural language instructions and execute them on a digital device by directly controlling its user interface. We present a dataset for device-control research, Android in the Wild (AITW), which is orders of magnitude larger than current datasets. The dataset contains human demonstrations of device interactions, includi…

    Submitted 27 October, 2023; v1 submitted 19 July, 2023; originally announced July 2023.