Pay Attention to MLPs

Liu, Hanxiao; Dai, Zihang; So, David R.; Le, Quoc V.

arXiv:2105.08050 (cs)

[Submitted on 17 May 2021 (v1), last revised 1 Jun 2021 (this version, v2)]

Abstract: Transformers have become one of the most important architectural innovations in deep learning and have enabled many breakthroughs over the past few years. Here we propose a simple network architecture, gMLP, based on MLPs with gating, and show that it can perform as well as Transformers in key language and vision applications. Our comparisons show that self-attention is not critical for Vision Transformers, as gMLP can achieve the same accuracy. For BERT, our model achieves parity with Transformers on pretraining perplexity and is better on some downstream NLP tasks. On finetuning tasks where gMLP performs worse, making the gMLP model substantially larger can close the gap with Transformers. In general, our experiments show that gMLP can scale as well as Transformers over increased data and compute.
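The abstract does not spell out what "MLPs with gating" means in practice. As a rough illustration only, the sketch below follows one common reading of the paper's spatial gating unit: the channels of each token are split into two halves, one half is mixed across token positions by a learned linear map, and the result multiplicatively gates the other half. The function name `spatial_gating_unit` and the argument names are hypothetical, and the sketch leaves out the normalization and channel projections that a full gMLP block would also contain.

```python
def spatial_gating_unit(z, w, b):
    """Simplified gating sketch (assumed form, not the authors' code).

    z: list of n_tokens rows, each with d_channels floats (d_channels even)
    w: n_tokens x n_tokens matrix that mixes information across positions
    b: list of n_tokens biases, one per output position
    Returns an n_tokens x (d_channels // 2) list of lists.
    """
    n = len(z)
    half = len(z[0]) // 2
    # Split every token's channels into a "content" half u and a "gate" half v.
    u = [row[:half] for row in z]
    v = [row[half:] for row in z]
    # Linear projection along the token axis: gate[i][c] = sum_j w[i][j] * v[j][c] + b[i]
    gate = [
        [sum(w[i][j] * v[j][c] for j in range(n)) + b[i] for c in range(half)]
        for i in range(n)
    ]
    # Elementwise gating: each content value is scaled by its gate value.
    return [[u[i][c] * gate[i][c] for c in range(half)] for i in range(n)]


if __name__ == "__main__":
    tokens = [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]]
    zeros = [[0.0, 0.0], [0.0, 0.0]]
    ones = [1.0, 1.0]
    # With zero mixing weights and unit biases the unit passes u through unchanged.
    print(spatial_gating_unit(tokens, zeros, ones))
```

With zero mixing weights and unit biases every gate value equals 1, so the unit simply returns the first half of each token's channels. Nonzero weights are the only place where one token can influence another in this sketch, which is the role self-attention plays in a Transformer.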

Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
Cite as: arXiv:2105.08050 [cs.LG] (or arXiv:2105.08050v2 [cs.LG] for this version)
DOI: https://doi.org/10.48550/arXiv.2105.08050

Submission history

From: Hanxiao Liu
[v1] Mon, 17 May 2021 17:55:04 UTC (642 KB)
[v2] Tue, 1 Jun 2021 20:24:06 UTC (582 KB)
