Skip to main content

Showing 1–1 of 1 results for author: Vikram, A

Searching in archive cs. Search in all archives.
.
  1. arXiv:2407.04620  [pdf, other

    cs.LG cs.AI cs.CL

    Learning to (Learn at Test Time): RNNs with Expressive Hidden States

    Authors: Yu Sun, Xinhao Li, Karan Dalal, Jiarui Xu, Arjun Vikram, Genghan Zhang, Yann Dubois, Xinlei Chen, Xiaolong Wang, Sanmi Koyejo, Tatsunori Hashimoto, Carlos Guestrin

    Abstract: Self-attention performs well in long context but has quadratic complexity. Existing RNN layers have linear complexity, but their performance in long context is limited by the expressive power of their hidden state. We propose a new class of sequence modeling layers with linear complexity and an expressive hidden state. The key idea is to make the hidden state a machine learning model itself, and t… ▽ More

    Submitted 5 July, 2024; originally announced July 2024.