brainpy.nn.nodes.ANN.GRU#
- class brainpy.nn.nodes.ANN.GRU(num_unit, wi_initializer=Orthogonal(scale=1.0, axis=-1, seed=None), wh_initializer=Orthogonal(scale=1.0, axis=-1, seed=None), bias_initializer=ZeroInit, state_initializer=ZeroInit, **kwargs)[source]#
Gated Recurrent Unit.
The implementation is based on (Chung et al., 2014) [1] with biases.
Given \(x_t\) and the previous state \(h_{t-1}\) the core computes
\[\begin{split}\begin{array}{ll} z_t &= \sigma(W_{iz} x_t + W_{hz} h_{t-1} + b_z) \\ r_t &= \sigma(W_{ir} x_t + W_{hr} h_{t-1} + b_r) \\ a_t &= \tanh(W_{ia} x_t + W_{ha} (r_t \odot h_{t-1}) + b_a) \\ h_t &= (1 - z_t) \odot h_{t-1} + z_t \odot a_t \end{array}\end{split}\]
where \(z_t\) and \(r_t\) are the update and reset gates, respectively.
The output is equal to the new hidden state, \(h_t\).
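To make the equations above concrete, here is a minimal single-step sketch in plain NumPy. This illustrates the math only, not BrainPy's internal implementation; the parameter names (`Wiz`, `Whz`, `bz`, etc.) are hypothetical and mirror the symbols in the formulas.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h, params):
    """One GRU update following the equations above.

    params holds the input weights W_i*, hidden weights W_h*, and
    biases b_* for the update (z), reset (r), and candidate (a) paths.
    """
    Wiz, Whz, bz = params['Wiz'], params['Whz'], params['bz']
    Wir, Whr, br = params['Wir'], params['Whr'], params['br']
    Wia, Wha, ba = params['Wia'], params['Wha'], params['ba']

    z = sigmoid(x @ Wiz + h @ Whz + bz)        # update gate
    r = sigmoid(x @ Wir + h @ Whr + br)        # reset gate
    a = np.tanh(x @ Wia + (r * h) @ Wha + ba)  # candidate state
    return (1 - z) * h + z * a                 # new hidden state = output

# Tiny example: 3 input features, 4 hidden units.
rng = np.random.default_rng(0)
p = {k: rng.normal(size=(3, 4)) for k in ('Wiz', 'Wir', 'Wia')}
p.update({k: rng.normal(size=(4, 4)) for k in ('Whz', 'Whr', 'Wha')})
p.update({k: np.zeros(4) for k in ('bz', 'br', 'ba')})
h = gru_step(rng.normal(size=3), np.zeros(4), p)
```

Note that the reset gate \(r_t\) multiplies the previous state *before* the hidden weights are applied, and that the candidate \(a_t\) is blended with \(h_{t-1}\) via the update gate.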
Warning: Backwards compatibility of GRU weights is currently unsupported.
- Parameters
num_unit (int) – The number of hidden units in the node.
state_initializer (callable, Initializer, bm.ndarray, jax.numpy.ndarray) – The state initializer.
wi_initializer (callable, Initializer, bm.ndarray, jax.numpy.ndarray) – The input weight initializer.
wh_initializer (callable, Initializer, bm.ndarray, jax.numpy.ndarray) – The hidden weight initializer.
bias_initializer (optional, callable, Initializer, bm.ndarray, jax.numpy.ndarray) – The bias weight initializer.
activation (str, callable) – The activation function. It can be a string or a callable function. See brainpy.math.activations for more details.
trainable (bool) – Whether the node is trainable.
References
[1] Chung, J., Gulcehre, C., Cho, K. and Bengio, Y., 2014. Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv preprint arXiv:1412.3555.
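The default `wi_initializer` and `wh_initializer` are `Orthogonal(scale=1.0, axis=-1)`. A QR-based orthogonal draw can be sketched in NumPy as follows; this is a minimal stand-in for illustration, not BrainPy's `Orthogonal` class itself:

```python
import numpy as np

def orthogonal(shape, scale=1.0, seed=None):
    """Sample a scaled matrix with orthonormal rows/columns.

    Minimal sketch of an Orthogonal(scale, axis=-1)-style initializer:
    draw a Gaussian matrix, orthogonalize it via QR, then scale.
    """
    rng = np.random.default_rng(seed)
    n_rows, n_cols = shape
    a = rng.normal(size=(max(n_rows, n_cols), min(n_rows, n_cols)))
    q, r = np.linalg.qr(a)
    # Fix column signs so the decomposition is deterministic given the draw.
    q *= np.sign(np.diag(r))
    if n_rows < n_cols:
        q = q.T
    return scale * q

W = orthogonal((4, 4), seed=0)
```

Orthogonal initialization keeps the singular values of the recurrent weights at `scale`, which helps the hidden state neither explode nor vanish across time steps.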
- __init__(num_unit, wi_initializer=Orthogonal(scale=1.0, axis=-1, seed=None), wh_initializer=Orthogonal(scale=1.0, axis=-1, seed=None), bias_initializer=ZeroInit, state_initializer=ZeroInit, **kwargs)[source]#
Methods
__init__(num_unit[, wi_initializer, ...])
copy([name, shallow]) – Returns a copy of the Node.
feedback(ff_output, **shared_kwargs) – The feedback computation function of a node.
forward(ff[, fb]) – The feedforward computation function of a node.
init_fb_conn() – Initialize the feedback connections.
init_fb_output([num_batch]) – Set the initial node feedback state.
init_ff_conn() – Initialize the feedforward connections.
init_state([num_batch]) – Set the initial node state.
initialize([num_batch]) – Initialize the node.
load_states(filename[, verbose]) – Load the model states.
nodes([method, level, include_self]) – Collect all children nodes.
offline_fit(targets, ffs[, fbs]) – Offline training interface.
online_fit(target, ff[, fb]) – Online training fitting interface.
online_init() – Online training initialization interface.
register_implicit_nodes(nodes)
register_implicit_vars(variables)
save_states(filename[, variables]) – Save the model states.
set_fb_output(state) – Safely set the feedback state of the node.
set_feedback_shapes(fb_shapes)
set_feedforward_shapes(feedforward_shapes)
set_output_shape(shape)
set_state(state) – Safely set the state of the node.
train_vars([method, level, include_self]) – The shortcut for retrieving all trainable variables.
unique_name([name, type_]) – Get the unique name for this object.
vars([method, level, include_self]) – Collect all variables in this node and the children nodes.
Attributes
data_pass – Offline fitting method.
fb_output
feedback_shapes – Feedback data size.
feedforward_shapes – Input data size.
is_feedback_input_supported
is_feedback_supported
is_initialized
name
output_shape – Output data size.
state – Node current internal state.
state_trainable – Returns if the Node state can be trained.
train_state
trainable – Returns if the Node can be trained.