Showing 1–1 of 1 results for author: Sirnam, S

  1. arXiv:2308.13077

    cs.CV

    Preserving Modality Structure Improves Multi-Modal Learning

    Authors: Swetha Sirnam, Mamshad Nayeem Rizve, Nina Shvetsova, Hilde Kuehne, Mubarak Shah

    Abstract: Self-supervised learning on large-scale multi-modal datasets allows learning semantically meaningful embeddings in a joint multi-modal representation space without relying on human annotations. These joint embeddings enable zero-shot cross-modal tasks like retrieval and classification. However, these methods often struggle to generalize well on out-of-domain data as they ignore the semantic struct…

    Submitted 24 August, 2023; originally announced August 2023.

    Comments: Accepted at ICCV 2023