The University of Southampton
University of Southampton Institutional Repository

Implicit bias of adversarial training for deep neural networks

Lyu, Bochen
2d571283-73b1-4741-9798-15f9d144f7a6
Zhu, Zhanxing
e55e7385-8ba2-4a85-8bae-e00defb7d7f0

Lyu, Bochen and Zhu, Zhanxing (2022) Implicit bias of adversarial training for deep neural networks. 10th International Conference on Learning Representations, ICLR 2022, Virtual, Online. 25 - 29 Apr 2022.

Record type: Conference or Workshop Item (Paper)

Abstract

We provide a theoretical understanding of the implicit bias imposed by adversarial training for homogeneous deep neural networks without any explicit regularization. In particular, for deep linear networks adversarially trained by gradient descent on a linearly separable dataset, we prove that the direction of the product of weight matrices converges to the direction of the max-margin solution of the original dataset. Furthermore, we generalize this result to adversarial training for non-linear homogeneous deep neural networks, without assuming linear separability of the dataset. We show that, when the neural network is adversarially trained with ℓ2 or ℓ∞ FGSM, FGM and PGD perturbations, the direction of the limit point of the normalized parameters of the network along the trajectory of the gradient flow converges to a KKT point of a constrained optimization problem that aims to maximize the margin for adversarial examples. Our results theoretically justify the longstanding conjecture that adversarial training modifies the decision boundary by utilizing adversarial examples to improve robustness, and potentially provide insights for designing new robust training strategies.
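
To make the notions of perturbation and adversarial margin in the abstract concrete, the following is a minimal sketch under simplifying assumptions: a single linear classifier f(x) = w·x with logistic loss, rather than the deep networks studied in the paper. The function name `l2_fgm_perturbation` and the variables `w`, `x`, `y` and `eps` are hypothetical and chosen for illustration only. In this simplified setting, a one-step ℓ2 FGM perturbation lowers the margin of an example by exactly `eps` times the norm of `w`.

```python
import numpy as np

def l2_fgm_perturbation(w, x, y, eps):
    """One-step l2 FGM perturbation for a linear classifier f(x) = w @ x.

    Illustrative assumption: logistic loss log(1 + exp(-y * f(x))).
    """
    margin = y * (w @ x)
    # Gradient of the logistic loss with respect to the input x.
    grad_x = -y * w / (1.0 + np.exp(margin))
    # FGM moves the input a distance eps along the normalized loss gradient.
    return eps * grad_x / np.linalg.norm(grad_x)

# Hypothetical toy values, not taken from the paper.
w = np.array([3.0, 4.0])
x = np.array([1.0, 2.0])
y = 1.0
eps = 0.5

delta = l2_fgm_perturbation(w, x, y, eps)
clean_margin = y * (w @ x)
adv_margin = y * (w @ (x + delta))

print("perturbation:", delta)
print("clean margin:", clean_margin)
print("adversarial margin:", adv_margin)
```

For a linear model the perturbation direction does not depend on the loss value, so the adversarial margin equals the clean margin minus `eps * np.linalg.norm(w)`. For non-linear networks no such closed form is available in general.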

This record has no associated files available for download.

More information

Published date: 25 April 2022
Additional Information: Funding Information: This project is supported by the Beijing Nova Program (No. 202072) from the Beijing Municipal Science Technology Commission.
Venue - Dates: 10th International Conference on Learning Representations, ICLR 2022, Virtual, Online, 2022-04-25 - 2022-04-29

Identifiers

Local EPrints ID: 486052
URI: http://eprints.soton.ac.uk/id/eprint/486052
PURE UUID: 32a94e35-7781-4df8-a4ad-06cf15877060

Catalogue record

Date deposited: 08 Jan 2024 17:34
Last modified: 17 Mar 2024 13:42

Contributors

Author: Bochen Lyu
Author: Zhanxing Zhu
