Implicit bias of adversarial training for deep neural networks

Lyu, Bochen and Zhu, Zhanxing (2022) Implicit bias of adversarial training for deep neural networks. 10th International Conference on Learning Representations, ICLR 2022, Virtual, Online. 25 - 29 Apr 2022.

Record type: Conference or Workshop Item (Paper)

Abstract

We provide theoretical understandings of the implicit bias imposed by adversarial training for homogeneous deep neural networks without any explicit regularization. In particular, for deep linear networks adversarially trained by gradient descent on a linearly separable dataset, we prove that the direction of the product of weight matrices converges to the direction of the max-margin solution of the original dataset. Furthermore, we generalize this result to the case of adversarial training for non-linear homogeneous deep neural networks without the linear separability of the dataset. We show that, when the neural network is adversarially trained with ℓ2 or ℓ∞ FGSM, FGM and PGD perturbations, the direction of the limit point of normalized parameters of the network along the trajectory of the gradient flow converges to a KKT point of a constrained optimization problem that aims to maximize the margin for adversarial examples. Our results theoretically justify the longstanding conjecture that adversarial training modifies the decision boundary by utilizing adversarial examples to improve robustness, and potentially provide insights for designing new robust training strategies.
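
As a rough illustration of the "margin for adversarial examples" mentioned in the abstract, the sketch below computes single-step ℓ2 FGM and ℓ∞ FGSM perturbations for a plain linear score f(x) = w·x with a label y in {-1, +1}. It is not taken from the paper. The function names, the toy numbers, and the restriction to a single linear layer are assumptions made for illustration, and w is assumed to be non-zero. For any loss that decreases with the margin y·(w·x), the loss gradient with respect to x points along -y·w, which gives the closed forms used here.

```python
import math


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def sign(a):
    """Sign of a scalar: -1, 0 or +1."""
    return (a > 0) - (a < 0)


def fgm_l2(w, y, eps):
    """l2 FGM step for a linear score: eps times the normalized loss gradient.

    Assumes w is non-zero and the loss decreases with the margin y * (w . x),
    so the gradient with respect to x is proportional to -y * w.
    """
    norm = math.sqrt(dot(w, w))
    return [-eps * y * wi / norm for wi in w]


def fgsm_linf(w, y, eps):
    """l-infinity FGSM step for a linear score: eps times the gradient sign."""
    return [-eps * y * sign(wi) for wi in w]


def margin(w, x, y):
    """Signed margin y * (w . x); positive means correctly classified."""
    return y * dot(w, x)


# Hypothetical toy values, chosen only so the arithmetic is easy to follow.
w = [3.0, -4.0]
x = [2.0, 1.0]
y = 1
eps = 0.1

x_l2 = [xi + di for xi, di in zip(x, fgm_l2(w, y, eps))]
x_linf = [xi + di for xi, di in zip(x, fgsm_linf(w, y, eps))]

clean_margin = margin(w, x, y)          # y * w.x
adv_margin_l2 = margin(w, x_l2, y)      # y * w.x - eps * ||w||_2
adv_margin_linf = margin(w, x_linf, y)  # y * w.x - eps * ||w||_1
```

In this linear special case each perturbation lowers the margin by eps times a norm of w (the ℓ2 norm for FGM, the ℓ1 norm for FGSM), so maximizing the margin on adversarial examples differs from maximizing it on clean data only through that norm-dependent term. The deep and non-linear settings treated in the paper do not reduce to this closed form.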

More information

Published date: 25 April 2022

Additional Information: Funding Information: This project is supported by the Beijing Nova Program (No. 202072) from the Beijing Municipal Science & Technology Commission.

Venue - Dates: 10th International Conference on Learning Representations, ICLR 2022, Virtual, Online, 2022-04-25 - 2022-04-29