brainpy.dyn.layers.GRU

class brainpy.dyn.layers.GRU(num_in, num_out, Wi_initializer=Orthogonal(scale=1.0, axis=-1, seed=9781403), Wh_initializer=Orthogonal(scale=1.0, axis=-1, seed=3621171), b_initializer=ZeroInit, state_initializer=ZeroInit, activation='tanh', mode=TrainingMode, train_state=False, name=None)

Gated Recurrent Unit.

The implementation is based on (Chung et al., 2014) [1] with biases.

Given the input \(x_t\) and the previous state \(h_{t-1}\), the core computes

\[\begin{split}\begin{array}{ll}
z_t &= \sigma(W_{iz} x_t + W_{hz} h_{t-1} + b_z) \\
r_t &= \sigma(W_{ir} x_t + W_{hr} h_{t-1} + b_r) \\
a_t &= \tanh(W_{ia} x_t + W_{ha} (r_t \odot h_{t-1}) + b_a) \\
h_t &= (1 - z_t) \odot h_{t-1} + z_t \odot a_t
\end{array}\end{split}\]

where \(z_t\) is the update gate and \(r_t\) is the reset gate.

The output is equal to the new hidden state, \(h_t\).
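A minimal usage sketch of one update step (the layer sizes, the batch size, the random input, and the empty shared-argument dict passed to update are illustrative assumptions, not part of this page)::

    import brainpy as bp
    import brainpy.math as bm

    # Build a GRU with 10 input features and 20 hidden units.
    gru = bp.dyn.layers.GRU(num_in=10, num_out=20)

    # Allocate the hidden state h_{t-1} for a batch of 8 samples.
    gru.reset_state(batch_size=8)

    # One time step of input, shape (batch, num_in).
    x = bm.random.randn(8, 10)

    # update(sha, x) returns the new hidden state h_t, shape (batch, num_out).
    h = gru.update(dict(), x)
    print(h.shape)  # (8, 20)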

Warning: Backwards compatibility of GRU weights is currently unsupported.

Parameters
• num_in (int) – The number of input units.

  • num_out (int) – The number of hidden units in the node.

  • state_initializer (callable, Initializer, bm.ndarray, jax.numpy.ndarray) – The state initializer.

• Wi_initializer (callable, Initializer, bm.ndarray, jax.numpy.ndarray) – The input weight initializer (see the sketch after this list).

  • Wh_initializer (callable, Initializer, bm.ndarray, jax.numpy.ndarray) – The hidden weight initializer.

  • b_initializer (optional, callable, Initializer, bm.ndarray, jax.numpy.ndarray) – The bias weight initializer.

  • activation (str, callable) – The activation function. It can be a string or a callable function. See brainpy.math.activations for more details.

• train_state (bool) – Whether to set the initial state as trainable.
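A sketch of overriding the default initializers and activation, assuming bp.init exposes the Orthogonal and ZeroInit initializers named in the signature and that 'relu' is among the activations in brainpy.math.activations::

    import brainpy as bp

    # Hypothetical configuration: orthogonal weights, zero biases,
    # and a ReLU nonlinearity instead of the default tanh.
    gru = bp.dyn.layers.GRU(
        num_in=10,
        num_out=20,
        Wi_initializer=bp.init.Orthogonal(scale=1.0),
        Wh_initializer=bp.init.Orthogonal(scale=1.0),
        b_initializer=bp.init.ZeroInit(),
        activation='relu',
    )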

References

[1] Chung, J., Gulcehre, C., Cho, K. and Bengio, Y., 2014. Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv preprint arXiv:1412.3555.

__init__(num_in, num_out, Wi_initializer=Orthogonal(scale=1.0, axis=-1, seed=9781403), Wh_initializer=Orthogonal(scale=1.0, axis=-1, seed=3621171), b_initializer=ZeroInit, state_initializer=ZeroInit, activation='tanh', mode=TrainingMode, train_state=False, name=None)

Methods

• __init__(num_in, num_out[, Wi_initializer, ...])

  • clear_input()

  • get_delay_data(identifier, delay_step, *indices) – Get delay data according to the provided delay steps.

  • load_states(filename[, verbose]) – Load the model states.

  • nodes([method, level, include_self]) – Collect all children nodes.

  • offline_fit(target, fit_record)

  • offline_init()

  • online_fit(target, fit_record)

  • online_init()

  • register_delay(identifier, delay_step, ...) – Register a delay variable.

  • register_implicit_nodes(*nodes, **named_nodes)

  • register_implicit_vars(*variables, ...)

  • reset([batch_size]) – Reset all variables in the model.

  • reset_local_delays([nodes]) – Reset local delay variables.

  • reset_state([batch_size]) – Reset the states in the model.

  • save_states(filename[, variables]) – Save the model states.

  • train_vars([method, level, include_self]) – The shortcut for retrieving all trainable variables (see the sketch after this list).

  • unique_name([name, type_]) – Get the unique name for this object.

  • update(sha, x) – The function to specify the updating rule.

  • update_local_delays([nodes]) – Update local delay variables.

  • vars([method, level, include_self]) – Collect all variables in this node and the children nodes.
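For example, the trainable weights can be inspected through train_vars(), which returns a dict-like collector (the exact variable names and shapes depend on BrainPy's internal weight layout and are not specified on this page)::

    import brainpy as bp

    gru = bp.dyn.layers.GRU(num_in=2, num_out=3)

    # Iterate over the collected trainable variables.
    for key, var in gru.train_vars().items():
        print(key, var.shape)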

Attributes

• global_delay_data

  • mode – Mode of the model, which is useful to control the multiple behaviors of the model.

  • name – Name of the model.
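A short sketch of reading these attributes (the printed values are assumptions that depend on the constructor defaults)::

    import brainpy as bp

    gru = bp.dyn.layers.GRU(num_in=2, num_out=3)
    print(gru.mode)  # the computing mode set at construction, e.g. a TrainingMode instance
    print(gru.name)  # a unique name, auto-generated when name=None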