brainpy.optim.Adagrad#

class brainpy.optim.Adagrad(lr, train_vars=None, weight_decay=None, epsilon=1e-06, name=None)[source]#

Optimizer that implements the Adagrad algorithm.

Adagrad [3] is an optimizer with parameter-specific learning rates, which are adapted relative to how frequently a parameter gets updated during training: the more updates a parameter receives, the smaller its updates become.

\[\theta_{t+1} = \theta_{t} - \dfrac{\eta}{\sqrt{G_{t} + \epsilon}} \odot g_{t}\]

where \(G_{t}\) is the sum of the squares of the past gradients and \(g_{t}\) is the gradient at step \(t\).

One of Adagrad’s main benefits is that it eliminates the need to manually tune the learning rate. Most implementations use a default value of 0.01 and leave it at that. Adagrad’s main weakness is its accumulation of the squared gradients in the denominator: Since every added term is positive, the accumulated sum keeps growing during training. This in turn causes the learning rate to shrink and eventually become infinitesimally small, at which point the algorithm is no longer able to acquire additional knowledge.
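To make this behavior concrete, here is a minimal NumPy sketch of the update rule above. It illustrates the math only, not BrainPy's actual implementation, and all names are illustrative:

import numpy as np

def adagrad_step(theta, g, G, lr=0.01, epsilon=1e-6):
    # Accumulate the squared gradients; G only ever grows.
    G = G + g ** 2
    # Scale the step by the per-parameter factor lr / sqrt(G + eps).
    theta = theta - lr * g / np.sqrt(G + epsilon)
    return theta, G

# Toy problem: minimize f(theta) = theta**2, whose gradient is 2 * theta.
theta, G = np.array([1.0]), np.zeros(1)
for _ in range(100):
    theta, G = adagrad_step(theta, 2.0 * theta, G)

Because \(G_{t}\) accumulates monotonically, the effective learning rate \(\eta / \sqrt{G_{t} + \epsilon}\) shrinks over the course of training, which is exactly the weakness described above.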

Parameters:

lr (float, Scheduler) – learning rate.

train_vars (dict of Variable, optional) – the trainable variables to optimize.

weight_decay (float, optional) – the weight decay rate.

epsilon (float) – a small constant added for numerical stability; defaults to 1e-06.

name (str, optional) – the name of the optimizer instance.

References

[3] Duchi, J., Hazan, E., & Singer, Y. (2011). Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research, 12, 2121–2159.

__init__(lr, train_vars=None, weight_decay=None, epsilon=1e-06, name=None)[source]#
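A hedged usage sketch follows. Based on the signature and method list documented here, it assumes that train_vars accepts a dict of brainpy.math.TrainVar objects and that update takes a gradient dict with matching keys; the dummy gradient is purely illustrative:

import brainpy as bp
import brainpy.math as bm

# A single trainable variable stands in for a full model.
w = bm.TrainVar(bm.ones(3))

# Construct the optimizer with the documented signature.
opt = bp.optim.Adagrad(lr=0.01, train_vars={'w': w})

# In practice the gradients would come from an autograd call such as
# brainpy.math.grad; a dummy gradient of ones is used here (assumption).
grads = {'w': bm.ones(3)}
opt.update(grads)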

Methods

__init__(lr[, train_vars, weight_decay, ...])

check_grads(grads)

cpu()

Move all variables into the CPU device.

cuda()

Move all variables into the GPU device.

load_state_dict(state_dict[, warn, compatible])

Copy parameters and buffers from state_dict into this module and its descendants.

load_states(filename[, verbose])

Load the model states.

nodes([method, level, include_self])

Collect all children nodes.

register_implicit_nodes(*nodes[, node_cls])

register_implicit_vars(*variables[, var_cls])

register_train_vars([train_vars])

register_vars([train_vars])

save_states(filename[, variables])

Save the model states.

state_dict()

Returns a dictionary containing the whole state of the module; see the round-trip sketch after this method list.

to(device)

Moves all variables into the given device.

tpu()

Move all variables into the TPU device.

train_vars([method, level, include_self])

The shortcut for retrieving all trainable variables.

tree_flatten()

Flattens the object as a PyTree.

tree_unflatten(aux, dynamic_values)

Unflatten the data to construct an object of this class.

unique_name([name, type_])

Get the unique name for this object.

update(grads)

Update the trainable variables with the given gradients.

vars([method, level, include_self, ...])

Collect all variables in this node and the children nodes.
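As a hedged illustration of how state_dict() and load_state_dict() pair up for checkpointing, assuming the opt instance from the sketch above:

# Capture the optimizer's whole state; for Adagrad this includes the
# accumulated squared-gradient caches.
state = opt.state_dict()

# ... train further, or rebuild an identically configured optimizer ...

# Restore the captured state so training resumes where it left off.
opt.load_state_dict(state)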

Attributes

name

Name of the model.