Showing 1–2 of 2 results for author: Tong, W L

Searching in archive cs.
  1. arXiv:2405.15618  [pdf, other]

    cs.LG cs.NE

    MLPs Learn In-Context

    Authors: William L. Tong, Cengiz Pehlevan

    Abstract: In-context learning (ICL), the remarkable ability to solve a task from only input exemplars, has commonly been assumed to be a unique hallmark of Transformer models. In this study, we demonstrate that multi-layer perceptrons (MLPs) can also learn in-context. Moreover, we find that MLPs, and the closely related MLP-Mixer models, learn in-context competitively with Transformers given the same comput…

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: 29 pages, 9 figures, code available at https://github.com/wtong98/mlp-icl

  2. arXiv:2203.00573  [pdf, other]

    cs.LG cond-mat.dis-nn stat.ML

    Contrasting random and learned features in deep Bayesian linear regression

    Authors: Jacob A. Zavatone-Veth, William L. Tong, Cengiz Pehlevan

    Abstract: Understanding how feature learning affects generalization is among the foremost goals of modern deep learning theory. Here, we study how the ability to learn representations affects the generalization performance of a simple class of models: deep Bayesian linear neural networks trained on unstructured Gaussian data. By comparing deep random feature models to deep networks in which all layers are t…

    Submitted 16 June, 2022; v1 submitted 1 March, 2022; originally announced March 2022.

    Comments: 35 pages, 7 figures. v2: minor typos corrected and references added; published in PRE

    Journal ref: Physical Review E 105, 064118 (2022)