Showing 1–2 of 2 results for author: Ovsianas, A

Search v0.5.6 released 2020-02-24

arXiv:2210.16365 [pdf, other]

cs.LG

Elastic Weight Consolidation Improves the Robustness of Self-Supervised Learning Methods under Transfer

Authors: Andrius Ovsianas, Jason Ramapuram, Dan Busbridge, Eeshan Gunesh Dhekane, Russ Webb

Abstract: Self-supervised representation learning (SSL) methods provide an effective label-free initial condition for fine-tuning downstream tasks. However, in numerous realistic scenarios, the downstream task might be biased with respect to the target label distribution. This in turn moves the learned fine-tuned model posterior away from the initial (label) bias-free self-supervised model posterior. In this work, we re-interpret SSL fine-tuning under the lens of Bayesian continual learning and consider regularization through the Elastic Weight Consolidation (EWC) framework. We demonstrate that self-regularization against an initial SSL backbone improves worst sub-group performance in Waterbirds by 5% and Celeb-A by 2% when using the ViT-B/16 architecture. Furthermore, to help simplify the use of EWC with SSL, we pre-compute and publicly release the Fisher Information Matrix (FIM), evaluated with 10,000 ImageNet-1K variates evaluated on large modern SSL architectures including ViT-B/16 and ResNet50 trained with DINO.

Submitted 28 October, 2022; originally announced October 2022.

Comments: NeurIPS 2022 Workshop: Self-Supervised Learning - Theory and Practice

arXiv:2111.10510 [pdf, other]

stat.ML cs.LG

Bayesian Learning via Neural Schrödinger-Föllmer Flows

Authors: Francisco Vargas, Andrius Ovsianas, David Fernandes, Mark Girolami, Neil D. Lawrence, Nikolas Nüsken

Abstract: In this work we explore a new framework for approximate Bayesian inference in large datasets based on stochastic control (i.e. Schrödinger bridges). We advocate stochastic control as a finite time and low variance alternative to popular steady-state methods such as stochastic gradient Langevin dynamics (SGLD). Furthermore, we discuss and adapt the existing theoretical guarantees of this framework and establish connections to already existing VI routines in SDE-based models.

Submitted 25 October, 2022; v1 submitted 19 November, 2021; originally announced November 2021.
