EvoGrad: Efficient Gradient-Based Hyperparameter Optimization
Published: Neural Information Processing Systems (NeurIPS)
Abstract
Gradient-based meta-learning and hyperparameter optimization have seen tremendous progress recently, enabling practical end-to-end training of neural networks together with their hyperparameters. Nevertheless, existing approaches are relatively expensive, as they need to compute second-order derivatives of the loss with respect to model parameters and hyperparameters. This cost prevents these methods from scaling to large network architectures and large numbers of hyperparameters. We present EvoGrad, a new approach to hyper-gradient calculation that is inspired by evolutionary methods but retains the efficacy of gradient-based methods. Crucially, our approach avoids the calculation of higher-order gradients, leading to significant improvements in memory and time efficiency. In practice, EvoGrad enables various existing meta-learning frameworks to scale to larger CNN architectures than was previously practical.
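To make the key idea concrete, the following is a minimal sketch of an evolution-inspired, first-order hypergradient estimate in PyTorch. The function name evograd_hypergrad and its arguments (K, sigma, inner_loss, val_loss) are illustrative assumptions rather than the paper's exact API; the point it demonstrates is that the validation loss can be differentiated with respect to the hyperparameters without ever forming second-order derivatives.

```python
# Hedged sketch of an evolution-inspired hypergradient estimate (PyTorch).
# All names below are illustrative assumptions, not the paper's exact API.
import torch

def evograd_hypergrad(theta, hparams, inner_loss, val_loss, K=2, sigma=1e-3):
    """Estimate d(val_loss)/d(hparams) using only first-order gradients.

    theta:      list of model parameter tensors
    hparams:    list of hyperparameter tensors with requires_grad=True
    inner_loss: callable(params, hparams) -> training loss (scalar tensor)
    val_loss:   callable(params) -> validation loss (scalar tensor)
    """
    # 1) Sample K randomly perturbed copies of the current model parameters.
    candidates = [[p + sigma * torch.randn_like(p) for p in theta]
                  for _ in range(K)]

    # 2) Evaluate the hyperparameter-dependent training loss of each candidate.
    losses = torch.stack([inner_loss(c, hparams) for c in candidates])

    # 3) Softmax weights favour candidates with lower training loss; the
    #    dependence on hparams enters only through these first-order losses.
    w = torch.softmax(-losses, dim=0)

    # 4) Combine the candidates into an updated parameter estimate and
    #    evaluate it on validation data.
    theta_new = [sum(w[k] * c[i] for k, c in enumerate(candidates))
                 for i in range(len(theta))]
    meta_loss = val_loss(theta_new)

    # 5) Backpropagate to the hyperparameters: no second-order terms in
    #    the model parameters are ever computed.
    return torch.autograd.grad(meta_loss, hparams)
```

In this sketch the model parameters appear only through randomly perturbed copies and a softmax-weighted combination, so backpropagation through the validation loss touches the hyperparameters via first-order quantities alone, which is the source of the memory and time savings claimed above.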