Conv2dLSTMCell
- class brainpy.dyn.Conv2dLSTMCell(input_shape, in_channels, out_channels, kernel_size, stride=1, padding='SAME', lhs_dilation=1, rhs_dilation=1, groups=1, w_initializer=XavierNormal(scale=1.0, mode=fan_avg, in_axis=-2, out_axis=-1, distribution=truncated_normal), b_initializer=ZeroInit, state_initializer=ZeroInit, train_state=False, name=None, mode=None)
2-D convolutional LSTM.
The implementation is based on :cite:`xingjian2015convolutional`. Given \(x_t\) and the previous state \((h_{t-1}, c_{t-1})\), the core computes
\[\begin{split}\begin{array}{ll}
i_t = \sigma(W_{ii} * x_t + W_{hi} * h_{t-1} + b_i) \\
f_t = \sigma(W_{if} * x_t + W_{hf} * h_{t-1} + b_f) \\
g_t = \tanh(W_{ig} * x_t + W_{hg} * h_{t-1} + b_g) \\
o_t = \sigma(W_{io} * x_t + W_{ho} * h_{t-1} + b_o) \\
c_t = f_t c_{t-1} + i_t g_t \\
h_t = o_t \tanh(c_t)
\end{array}\end{split}\]
where \(*\) denotes the convolution operator; \(i_t\), \(f_t\), and \(o_t\) are the input, forget, and output gate activations, and \(g_t\) is a vector of cell updates.
The output is equal to the new hidden state, \(h_t\).
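For concreteness, here is a minimal one-step sketch of these equations in plain JAX. The names (`conv2d`, `conv_lstm_step`) and the fused channels-last weight layout are illustrative assumptions, not BrainPy's internals; the four gates are computed by a single convolution over the input and another over the hidden state, then split.

```python
import jax
import jax.numpy as jnp

def conv2d(x, w):
    # x: (N, H, W, C_in); w: (KH, KW, C_in, C_out); stride 1, 'SAME' padding.
    return jax.lax.conv_general_dilated(
        x, w, window_strides=(1, 1), padding='SAME',
        dimension_numbers=('NHWC', 'HWIO', 'NHWC'))

def conv_lstm_step(x, h, c, Wx, Wh, b):
    # Wx: (KH, KW, C_in, 4*C_out); Wh: (KH, KW, C_out, 4*C_out); b: (4*C_out,)
    gates = conv2d(x, Wx) + conv2d(h, Wh) + b        # fused i, f, g, o gates
    i, f, g, o = jnp.split(gates, 4, axis=-1)
    c_new = jax.nn.sigmoid(f) * c + jax.nn.sigmoid(i) * jnp.tanh(g)
    h_new = jax.nn.sigmoid(o) * jnp.tanh(c_new)      # output = new hidden state
    return h_new, c_new
```

Fusing the four gate convolutions into one per operand computes the same quantities as the four separate convolutions written above; it is a common implementation choice, not necessarily the one this class uses internally.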
Notes
- Forget gate initialization:
Following :cite:`jozefowicz2015empirical`, 1.0 is added to \(b_f\) after initialization in order to reduce the scale of forgetting at the beginning of training; a sketch follows below.
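A minimal sketch of that trick, assuming the fused \((i, f, g, o)\) bias layout from the example above (the `out_channels` value is illustrative):

```python
import jax.numpy as jnp

out_channels = 16                                  # illustrative value
b = jnp.zeros(4 * out_channels)                    # ZeroInit over fused (i, f, g, o)
b = b.at[out_channels:2 * out_channels].add(1.0)   # shift the forget-gate slice b_f by +1.0
```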