The University of Southampton
University of Southampton Institutional Repository

Positive-negative momentum: manipulating stochastic gradient noise to improve generalization


Xie, Zeke, Yuan, Li, Zhu, Zhanxing and Sugiyama, Masashi (2021) Positive-negative momentum: manipulating stochastic gradient noise to improve generalization. In Meila, Marina and Zhang, Tong (eds.) Proceedings of the 38th International Conference on Machine Learning. (Proceedings of Machine Learning Research, 139) 38th International Conference on Machine Learning (18/07/21 - 24/07/21) PMLR, pp. 11448-11458.

Record type: Book Section

Abstract

It is well known that stochastic gradient noise (SGN) acts as an implicit regularizer in deep learning and is essential for both the optimization and generalization of deep networks. Some works have attempted to simulate SGN artificially by injecting random noise to improve deep learning. However, simple injected random noise does not work as well as SGN, which is anisotropic and parameter-dependent. To simulate SGN at low computational cost and without changing the learning rate or batch size, we propose the Positive-Negative Momentum (PNM) approach, a powerful alternative to conventional Momentum in classic optimizers. PNM maintains two approximately independent momentum terms, so the magnitude of SGN can be controlled explicitly by adjusting the momentum difference. We theoretically prove the convergence guarantee and the generalization advantage of PNM over Stochastic Gradient Descent (SGD). By incorporating PNM into two conventional optimizers, SGD with Momentum and Adam, our extensive experiments empirically verify the significant advantage of the PNM-based variants over the corresponding conventional Momentum-based optimizers. Code: https://github.com/zeke-xie/Positive-Negative-Momentum.
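
The idea of keeping two momentum terms and stepping along their weighted difference can be illustrated with a small sketch. This record does not give the update rule, so the form below is an assumption for illustration only: the two buffers are refreshed on alternating steps, the step combines them with weights `(1 + beta0)` and `-beta0`, and a normalisation constant keeps the step scale comparable. The names `pnm_sgd`, `beta0` and `beta1` are hypothetical; the linked repository holds the authors' implementation.

```python
import math


def pnm_sgd(grad_fn, theta, lr=0.1, beta1=0.9, beta0=1.0, steps=500):
    """Toy positive-negative momentum loop for a scalar parameter.

    Assumed update (not taken from this record):
        m_t       = beta1**2 * m_{t-2} + (1 - beta1**2) * g_t
        theta_t+1 = theta_t - lr / norm * ((1 + beta0) * m_t - beta0 * m_{t-1})
    Each buffer only sees gradients from every other step, so the two
    buffers are built from disjoint sets of minibatch gradients.
    """
    m_curr, m_prev = 0.0, 0.0
    # Assumed normalisation so the combined step keeps a comparable scale.
    norm = math.sqrt((1.0 + beta0) ** 2 + beta0 ** 2)
    for _ in range(steps):
        g = grad_fn(theta)
        # Swap roles: the buffer last written two steps ago is updated now.
        m_curr, m_prev = m_prev, m_curr
        m_curr = beta1 ** 2 * m_curr + (1.0 - beta1 ** 2) * g
        # Positive weight on the fresh buffer, negative weight on the other.
        theta -= lr / norm * ((1.0 + beta0) * m_curr - beta0 * m_prev)
    return theta


# Noise-free quadratic f(theta) = 0.5 * theta**2, whose gradient is theta.
final_theta = pnm_sgd(lambda t: t, theta=5.0)
```

In this sketch a larger `beta0` widens the gap between the positive and negative weights, which is the knob the abstract describes as adjusting the momentum difference; setting `beta0=0` falls back to a single momentum buffer per step.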

This record has no associated files available for download.

More information

Published date: 2021
Venue - Dates: 38th International Conference on Machine Learning, virtual, 2021-07-18 - 2021-07-24

Identifiers

Local EPrints ID: 486060
URI: http://eprints.soton.ac.uk/id/eprint/486060
PURE UUID: 742fd338-5d1f-4fd8-b001-f1cb74dd5265

Catalogue record

Date deposited: 08 Jan 2024 17:38
Last modified: 17 Mar 2024 06:41

Contributors

Author: Zeke Xie
Author: Li Yuan
Author: Zhanxing Zhu
Author: Masashi Sugiyama
Editor: Marina Meila
Editor: Tong Zhang
