GRUCell
- class brainpy.dyn.GRUCell(num_in, num_out, Wi_initializer=Orthogonal(scale=1.0, axis=-1), Wh_initializer=Orthogonal(scale=1.0, axis=-1), b_initializer=ZeroInit, state_initializer=ZeroInit, activation='tanh', mode=None, train_state=False, name=None)
Gated Recurrent Unit.
The implementation follows Chung et al. (2014) [1], with bias terms added.
Given the input \(x_t\) and the previous state \(h_{t-1}\), the core computes
\[\begin{split}\begin{array}{ll} z_t &= \sigma(W_{iz} x_t + W_{hz} h_{t-1} + b_z) \\ r_t &= \sigma(W_{ir} x_t + W_{hr} h_{t-1} + b_r) \\ a_t &= \tanh(W_{ia} x_t + W_{ha} (r_t \odot h_{t-1}) + b_a) \\ h_t &= (1 - z_t) \odot h_{t-1} + z_t \odot a_t \end{array}\end{split}\]
where \(z_t\) is the update gate, \(r_t\) is the reset gate, \(a_t\) is the candidate state, \(\sigma\) is the logistic sigmoid, and \(\odot\) denotes element-wise multiplication.
The output is equal to the new hidden state, \(h_t\).
Warning: Backwards compatibility of GRU weights is currently unsupported.
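As a rough illustration of the update equations above (not BrainPy's implementation), a single GRU step can be sketched in plain NumPy. The names `gru_step` and `init_params`, the dictionary keys, the small random weights, and the row-vector convention `x @ W` with weights of shape `(num_in, num_out)` are all assumptions made for this sketch.

```python
import numpy as np


def sigmoid(x):
    """Logistic sigmoid, applied element-wise."""
    return 1.0 / (1.0 + np.exp(-x))


def init_params(num_in, num_out, rng):
    """Hypothetical parameter layout: one input weight, hidden weight and bias per gate."""
    p = {}
    for g in "zra":  # z: update gate, r: reset gate, a: candidate state
        p[f"W_i{g}"] = 0.1 * rng.standard_normal((num_in, num_out))
        p[f"W_h{g}"] = 0.1 * rng.standard_normal((num_out, num_out))
        p[f"b_{g}"] = np.zeros(num_out)
    return p


def gru_step(x_t, h_prev, p):
    """One GRU update following the four equations term by term."""
    z = sigmoid(x_t @ p["W_iz"] + h_prev @ p["W_hz"] + p["b_z"])
    r = sigmoid(x_t @ p["W_ir"] + h_prev @ p["W_hr"] + p["b_r"])
    a = np.tanh(x_t @ p["W_ia"] + (r * h_prev) @ p["W_ha"] + p["b_a"])
    return (1.0 - z) * h_prev + z * a


num_in, num_out = 3, 4
rng = np.random.default_rng(0)
params = init_params(num_in, num_out, rng)

# Run a short random input sequence, starting from a zero state.
h = np.zeros(num_out)
for x_t in rng.standard_normal((5, num_in)):
    h = gru_step(x_t, h, params)
```

Because \(h_t\) is an element-wise convex combination of \(h_{t-1}\) and a `tanh` output, a state that starts at zero stays within \([-1, 1]\) in this sketch.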
- Parameters:
  - num_in (int) – The dimension of the input vector.
  - num_out (int) – The number of hidden units in the node.
  - state_initializer (Union[TypeVar(ArrayType, Array, Variable, TrainVar, Array, ndarray), Callable, Initializer]) – The state initializer.
  - Wi_initializer (Union[TypeVar(ArrayType, Array, Variable, TrainVar, Array, ndarray), Callable, Initializer]) – The input weight initializer.
  - Wh_initializer (Union[TypeVar(ArrayType, Array, Variable, TrainVar, Array, ndarray), Callable, Initializer]) – The hidden weight initializer.
  - b_initializer (Union[TypeVar(ArrayType, Array, Variable, TrainVar, Array, ndarray), Callable, Initializer]) – The bias weight initializer.
  - activation (str) – The activation function. It can be a string or a callable function. See brainpy.math.activations for more details.
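The signature lists `Orthogonal(scale=1.0, axis=-1)` as the default for both weight initializers. As a generic illustration of what an orthogonal initializer typically produces, here is a QR-based sketch in NumPy. It is not BrainPy's `Orthogonal` class: the function name `orthogonal_init` and its `seed` argument are hypothetical, and the handling of `axis` is not modelled.

```python
import numpy as np


def orthogonal_init(shape, scale=1.0, seed=0):
    """Return a (rows, cols) matrix whose shorter dimension is orthonormal, times `scale`."""
    rng = np.random.default_rng(seed)
    rows, cols = shape
    n, m = max(rows, cols), min(rows, cols)
    a = rng.standard_normal((n, m))
    q, r = np.linalg.qr(a)        # q has shape (n, m) with orthonormal columns
    q = q * np.sign(np.diag(r))   # fix the sign ambiguity of the QR factorization
    if rows < cols:
        q = q.T                   # make the rows orthonormal instead
    return scale * q


# Example shapes: a hidden-to-hidden matrix and an input-to-hidden matrix.
W_h = orthogonal_init((4, 4))
W_i = orthogonal_init((3, 4))
```

With `scale=1.0`, a square matrix built this way preserves the norm of the vector it multiplies, which is one common motivation for using orthogonal recurrent weights.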
References