class brainpy.optim.RMSProp(lr, train_vars=None, weight_decay=None, epsilon=1e-06, rho=0.9, name=None)

Optimizer that implements the RMSprop algorithm.

RMSprop [5] and Adadelta were developed independently around the same time, both stemming from the need to resolve Adagrad’s radically diminishing learning rates.

The gist of RMSprop is to:

  • Maintain a moving (discounted) average of the square of gradients

  • Divide the gradient by the root of this average

\[\begin{split}c_t &= \rho c_{t-1} + (1-\rho) g^2 \\ p_t &= \frac{\eta}{\sqrt{c_t + \epsilon}} \, g\end{split}\]
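The two steps above can be sketched in plain NumPy (a minimal illustration of the update rule, not BrainPy's actual implementation; the function name and defaults here are chosen for the example):

```python
import numpy as np

def rmsprop_update(param, grad, cache, lr=0.1, rho=0.9, epsilon=1e-6):
    """One RMSprop step: keep a discounted average of squared gradients
    (c_t), then scale the gradient by the root of that average (p_t)."""
    cache = rho * cache + (1 - rho) * grad**2          # c_t
    param = param - lr * grad / np.sqrt(cache + epsilon)  # apply p_t
    return param, cache

# Toy usage: minimize f(x) = x^2 starting from x = 5.
x, c = 5.0, 0.0
for _ in range(500):
    g = 2.0 * x          # gradient of x^2
    x, c = rmsprop_update(x, g, c)
```

Because the gradient is divided by the root of its own running magnitude, the effective step size stays close to `lr` regardless of the raw gradient scale, which is what prevents the Adagrad-style decay.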

The centered version additionally maintains a moving average of the gradients, and uses that average to estimate the variance.


lr (float, Scheduler) – Learning rate.