Efficiently Training Low-Curvature Neural Networks

Srinivas, Suraj; Matoba, Kyle; Lakkaraju, Himabindu; Fleuret, Francois

arXiv:2206.07144 (cs)

[Submitted on 14 Jun 2022 (v1), last revised 10 Jan 2023 (this version, v3)]

Abstract: The highly non-linear nature of deep neural networks makes them susceptible to adversarial examples and gives them unstable gradients, which hinders interpretability. However, existing methods for addressing these issues, such as adversarial training, are expensive and often sacrifice predictive accuracy.
In this work, we consider curvature, a mathematical quantity that encodes the degree of non-linearity. Using this, we demonstrate low-curvature neural networks (LCNNs) that obtain drastically lower curvature than standard models while exhibiting similar predictive performance, which leads to improved robustness and stable gradients, with only a marginally increased training time. To achieve this, we minimize a data-independent upper bound on the curvature of a neural network, which decomposes overall curvature in terms of the curvatures and slopes of its constituent layers. To efficiently minimize this bound, we introduce two novel architectural components: first, a non-linearity called centered-softplus that is a stable variant of the softplus non-linearity, and second, a Lipschitz-constrained batch normalization layer.
Our experiments show that LCNNs have lower curvature, more stable gradients, and increased off-the-shelf adversarial robustness compared to their standard high-curvature counterparts, all without affecting predictive performance. Our approach is easy to use and can be readily incorporated into existing neural network models.
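
As a rough illustration of the centered-softplus idea named in the abstract, the sketch below assumes the form softplus(x; beta) - log(2)/beta, that is, a softplus shifted so that it passes through the origin. The abstract does not state the formula, so the function names, the beta parameter, and the definition itself are assumptions rather than the authors' implementation. The finite-difference helper checks a standard property of softplus: its second derivative peaks at x = 0 with value beta/4, so a smaller beta gives a flatter, lower-curvature activation.

```python
import math


def softplus(x: float, beta: float = 1.0) -> float:
    """Softplus (1/beta) * log(1 + exp(beta * x)), written to avoid overflow."""
    z = beta * x
    # log(1 + e^z) == max(z, 0) + log1p(e^{-|z|}) for every real z
    return (max(z, 0.0) + math.log1p(math.exp(-abs(z)))) / beta


def centered_softplus(x: float, beta: float = 1.0) -> float:
    """Assumed definition: softplus shifted down by log(2)/beta so f(0) = 0."""
    return softplus(x, beta) - math.log(2.0) / beta


def second_derivative(f, x: float, h: float = 1e-3) -> float:
    """Central finite-difference estimate of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)


if __name__ == "__main__":
    for beta in (0.5, 1.0, 4.0):
        at_zero = centered_softplus(0.0, beta)
        curv = second_derivative(lambda t: centered_softplus(t, beta), 0.0)
        print(f"beta={beta}: f(0)={at_zero:.3e}, f''(0)~{curv:.4f}, beta/4={beta / 4}")
```

The constant shift leaves every derivative of the function unchanged, so under this assumed definition the centering affects only the output offset, not the curvature.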

Comments:	NeurIPS 2022
Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:2206.07144 [cs.LG]
	(or arXiv:2206.07144v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2206.07144

Submission history

From: Suraj Srinivas
[v1] Tue, 14 Jun 2022 20:09:04 UTC (85 KB)
[v2] Thu, 15 Dec 2022 22:18:44 UTC (87 KB)
[v3] Tue, 10 Jan 2023 15:59:31 UTC (87 KB)
