University of Southampton Institutional Repository

You only propagate once: accelerating adversarial training via maximal principle


Zhang, Dinghuai, Zhang, Tianyuan, Lu, Yiping, Zhu, Zhanxing and Dong, Bin (2019) You only propagate once: accelerating adversarial training via maximal principle. Wallach, Hanna M., Larochelle, Hugo, Beygelzimer, Alina, d'Alché-Buc, Florence and Fox, Emily B. (eds.) In Proceedings of the 33rd International Conference on Neural Information Processing Systems. Curran Associates, Inc. pp. 227-238. (doi:10.5555/3454287.3454308).

Record type: Conference or Workshop Item (Paper)

Abstract

Deep learning achieves state-of-the-art results in many tasks in computer vision and natural language processing. However, recent works have shown that deep networks can be vulnerable to adversarial perturbations, which raises serious concerns about the robustness of deep networks. Adversarial training, typically formulated as a robust optimization problem, is an effective way of improving the robustness of deep networks. A major drawback of existing adversarial training algorithms is the computational overhead of generating adversarial examples, typically far greater than that of the network training itself. This leads to a prohibitive overall computational cost for adversarial training. In this paper, we show that adversarial training can be cast as a discrete-time differential game. By analyzing the Pontryagin's Maximum Principle (PMP) of the problem, we observe that the adversary update is only coupled with the parameters of the first layer of the network. This inspires us to restrict most of the forward and backward propagation within the first layer of the network during adversary updates. This effectively reduces the total number of full forward and backward propagations to only one for each group of adversary updates. We therefore refer to this algorithm as YOPO (You Only Propagate Once). Numerical experiments demonstrate that YOPO can achieve comparable defense accuracy using approximately 1/5 to 1/4 of the GPU time of the projected gradient descent (PGD) algorithm.
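The abstract's core idea, reusing one full backward pass for a whole group of adversary updates, can be sketched on a toy model. The sketch below is illustrative only and is not the paper's implementation: the "network" is two matrices (`W1` as the first layer, `W2` standing in for everything above it), and the frozen vector `p` plays the role of the PMP adjoint state that, in the paper, decouples the adversary update from all layers except the first.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "network": first layer W1, all remaining layers collapsed into W2.
# loss(eta) = 0.5 * || W2 @ (W1 @ (x + eta)) - y ||^2
W1 = rng.normal(size=(8, 4))
W2 = rng.normal(size=(2, 8))
x = rng.normal(size=4)
y = rng.normal(size=2)
EPS, LR = 0.1, 0.05  # l_inf perturbation budget and adversary step size

def loss(eta):
    r = W2 @ (W1 @ (x + eta)) - y
    return 0.5 * float(r @ r)

def grad_at_first_layer_output(eta):
    # "Full" backprop through every layer above the first; returns dL/dz
    z = W1 @ (x + eta)
    return W2.T @ (W2 @ z - y)

def pgd(steps):
    # PGD: one full forward/backward pass per adversary update
    eta = np.zeros(4)
    for _ in range(steps):
        g = W1.T @ grad_at_first_layer_output(eta)
        eta = np.clip(eta + LR * np.sign(g), -EPS, EPS)
    return eta

def yopo(m, n):
    # YOPO-m-n: m full passes; within each, freeze p and run n adversary
    # updates that propagate only through the first layer
    eta = np.zeros(4)
    for _ in range(m):
        p = grad_at_first_layer_output(eta)  # one full backprop
        for _ in range(n):
            g = W1.T @ p                     # first-layer-only propagation
            eta = np.clip(eta + LR * np.sign(g), -EPS, EPS)
    return eta

eta_pgd, eta_yopo = pgd(9), yopo(3, 3)  # 9 full passes vs. only 3
```

Both adversaries take the same number of sign-gradient steps inside the l_inf ball, but the YOPO-style loop performs a full backward pass only once per group of n updates, which is the source of the reported speedup.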

This record has no associated files available for download.

More information

Published date: 8 December 2019
Venue - Dates: 33rd International Conference on Neural Information Processing Systems, Vancouver, Canada, 2019-12-08 - 2019-12-14

Identifiers

Local EPrints ID: 486000
URI: http://eprints.soton.ac.uk/id/eprint/486000
PURE UUID: ddc15265-9f44-4d73-93af-d4dc2bdf76bf

Catalogue record

Date deposited: 05 Jan 2024 17:43
Last modified: 17 Mar 2024 06:41


Contributors

Author: Dinghuai Zhang
Author: Tianyuan Zhang
Author: Yiping Lu
Author: Zhanxing Zhu
Author: Bin Dong
Editor: Hanna M. Wallach
Editor: Hugo Larochelle
Editor: Alina Beygelzimer
Editor: Florence d'Alché-Buc
Editor: Emily B. Fox


