Ding, X., Chen, H., Zhang, X. et al. (3 more authors) (2023) Re-parameterizing your optimizers rather than architectures. In: Proceedings of the Eleventh International Conference on Learning Representations (ICLR 2023). The Eleventh International Conference on Learning Representations (ICLR), 01-05 May 2023, Kigali, Rwanda. OpenReview.
Abstract
The well-designed structures in neural networks reflect the prior knowledge incorporated into the models. However, though different models have various priors, we are used to training them with model-agnostic optimizers such as SGD. In this paper, we propose to incorporate model-specific prior knowledge into optimizers by modifying the gradients according to a set of model-specific hyper-parameters. Such a methodology is referred to as Gradient Re-parameterization, and the optimizers are named RepOptimizers. For the extreme simplicity of model structure, we focus on a VGG-style plain model and showcase that such a simple model trained with a RepOptimizer, which is referred to as RepOpt-VGG, performs on par with or better than the recent well-designed models. From a practical perspective, RepOpt-VGG is a favorable base model because of its simple structure, high inference speed and training efficiency. Compared to Structural Re-parameterization, which adds priors into models via constructing extra training-time structures, RepOptimizers require no extra forward/backward computations and solve the problem of quantization. We hope to spark further research beyond the realms of model structure design. Code and models are available at https://github.com/DingXiaoH/RepOptimizers.
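The idea of "modifying the gradients according to a set of model-specific hyper-parameters" can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `sgd_step` and `rep_sgd_step` and the multiplier values in `grad_mults` are hypothetical, chosen only to show how a per-parameter gradient multiplier changes a plain SGD update. How the multipliers are derived from the model structure is described in the paper and is not reproduced here.

```python
from typing import List


def sgd_step(params: List[float], grads: List[float], lr: float) -> List[float]:
    """Plain, model-agnostic SGD update: p <- p - lr * g."""
    return [p - lr * g for p, g in zip(params, grads)]


def rep_sgd_step(
    params: List[float],
    grads: List[float],
    grad_mults: List[float],
    lr: float,
) -> List[float]:
    """SGD update with each gradient scaled by a fixed multiplier.

    `grad_mults` stands in for the model-specific hyper-parameters
    (an assumption of this sketch, not the paper's exact formulation).
    """
    if not (len(params) == len(grads) == len(grad_mults)):
        raise ValueError("params, grads and grad_mults must have equal length")
    return [p - lr * m * g for p, g, m in zip(params, grads, grad_mults)]


# Illustrative values only.
params = [1.0, 1.0]
grads = [0.5, 0.5]
grad_mults = [1.0, 2.0]  # hypothetical multipliers

plain = sgd_step(params, grads, lr=0.1)
rep = rep_sgd_step(params, grads, grad_mults, lr=0.1)
print(plain, rep)
```

With every multiplier set to 1.0 the re-parameterized step reduces to the plain SGD step, which is why the approach adds no extra forward or backward computation: only the update rule changes.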
Metadata
| Item Type: | Proceedings Paper |
|---|---|
| Authors/Creators: | |
| Copyright, Publisher and Additional Information: | © 2023 The Author(s). |
| Keywords: | Deep Learning; Model Architecture; Optimizer; Re-parameterization |
| Dates: | |
| Institution: | The University of Sheffield |
| Academic Units: | The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield) |
| Depositing User: | Symplectic Sheffield |
| Date Deposited: | 22 Feb 2023 17:06 |
| Last Modified: | 28 Sep 2023 15:14 |
| Published Version: | https://openreview.net/forum?id=B92TMCG_7rp |
| Status: | Published |
| Publisher: | OpenReview |
| Refereed: | Yes |
| Related URLs: | |
| Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:196679 |
Download
Filename: 1626_re_parameterizing_your_optimiz-ICLR2023.pdf
