Showing 1–1 of 1 results for author: Jaegle, D

  1. arXiv:2301.09595 [pdf, other]

    cs.CV

    Zorro: the masked multimodal transformer

    Authors: Adrià Recasens, Jason Lin, João Carreira, Drew Jaegle, Luyu Wang, Jean-Baptiste Alayrac, Pauline Luc, Antoine Miech, Lucas Smaira, Ross Hemsley, Andrew Zisserman

    Abstract: Attention-based models are appealing for multimodal processing because inputs from multiple modalities can be concatenated and fed to a single backbone network, thus requiring very little fusion engineering. The resulting representations are, however, fully entangled throughout the network, which may not always be desirable: in learning, contrastive audio-visual self-supervised learning requires in…

    Submitted 22 February, 2023; v1 submitted 23 January, 2023; originally announced January 2023.