brainpy.layers.Conv1dLSTMCell

class brainpy.layers.Conv1dLSTMCell(input_shape, in_channels, out_channels, kernel_size, stride=1, padding='SAME', lhs_dilation=1, rhs_dilation=1, groups=1, w_initializer=XavierNormal(scale=1.0, mode=fan_avg, in_axis=-2, out_axis=-1, distribution=truncated_normal), b_initializer=ZeroInit, state_initializer=ZeroInit, train_state=False, name=None, mode=None)

1-D convolutional LSTM.

The implementation is based on :cite:`xingjian2015convolutional`. Given \(x_t\) and the previous state \((h_{t-1}, c_{t-1})\), the core computes

\[\begin{split}\begin{array}{ll}
i_t = \sigma(W_{ii} * x_t + W_{hi} * h_{t-1} + b_i) \\
f_t = \sigma(W_{if} * x_t + W_{hf} * h_{t-1} + b_f) \\
g_t = \tanh(W_{ig} * x_t + W_{hg} * h_{t-1} + b_g) \\
o_t = \sigma(W_{io} * x_t + W_{ho} * h_{t-1} + b_o) \\
c_t = f_t c_{t-1} + i_t g_t \\
h_t = o_t \tanh(c_t)
\end{array}\end{split}\]

where \(*\) denotes the convolution operator; \(i_t\), \(f_t\), and \(o_t\) are the input, forget, and output gate activations, \(g_t\) is a vector of cell updates, and the products in the last two equations are taken elementwise.

The output is equal to the new hidden state, \(h_t\).
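To make the gate equations concrete, here is a minimal JAX sketch of a single step. It assumes a channels-last (NWC) layout, 'SAME' padding, and the four gates packed along the output-channel axis; it illustrates the math above and is not BrainPy's internal implementation:

    import jax
    import jax.numpy as jnp

    def conv1d_lstm_step(w_x, w_h, b, x, h, c):
        # One step of the ConvLSTM equations above (illustrative sketch).
        # x, h, c: (batch, length, channels); w_x: (kernel, in, 4*out);
        # w_h: (kernel, out, 4*out); b: (4*out,).
        def conv(v, w):
            return jax.lax.conv_general_dilated(
                v, w, window_strides=(1,), padding='SAME',
                dimension_numbers=('NWC', 'WIO', 'NWC'))
        gates = conv(x, w_x) + conv(h, w_h) + b
        i, f, g, o = jnp.split(gates, 4, axis=-1)
        c_new = jax.nn.sigmoid(f) * c + jax.nn.sigmoid(i) * jnp.tanh(g)
        h_new = jax.nn.sigmoid(o) * jnp.tanh(c_new)  # h_t is also the output
        return h_new, c_new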

Notes

Forget gate initialization:

Following :cite:`jozefowicz2015empirical`, we add 1.0 to \(b_f\) after initialization to reduce the scale of forgetting at the beginning of training.
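A quick numerical check of why this helps (plain JAX, not library code): with \(b_f = 0\) the forget gate starts at \(\sigma(0) = 0.5\), so half of the cell state decays every step, whereas \(b_f = 1\) opens the gate to \(\sigma(1) \approx 0.73\):

    import jax.numpy as jnp
    from jax.nn import sigmoid

    # Initial forget-gate activation without and with the +1.0 shift
    print(sigmoid(jnp.array([0.0, 1.0])))   # ~[0.5, 0.731]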

__init__(input_shape, in_channels, out_channels, kernel_size, stride=1, padding='SAME', lhs_dilation=1, rhs_dilation=1, groups=1, w_initializer=XavierNormal(scale=1.0, mode=fan_avg, in_axis=-2, out_axis=-1, distribution=truncated_normal), b_initializer=ZeroInit, state_initializer=ZeroInit, train_state=False, name=None, mode=None)

Constructs a 1-D convolutional LSTM.

Parameters:
  • input_shape (Tuple[int, ...]) – Shape of the inputs excluding batch size.

  • in_channels (int) – Number of input channels.

  • out_channels (int) – Number of output channels.

  • kernel_size (Union[int, Sequence[int]]) – Sequence of kernel sizes (of length 1), or an int. An int will be expanded to define the kernel size in all spatial dimensions.

  • name (Optional[str]) – Name of the module.
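A minimal usage sketch follows. The shapes, the channels-last layout, and the reading of input_shape as the spatial length alone (since in_channels is a separate argument) are assumptions for illustration:

    import brainpy as bp
    import brainpy.math as bm

    cell = bp.layers.Conv1dLSTMCell(input_shape=(100,),  # assumed: spatial shape only
                                    in_channels=3,
                                    out_channels=16,
                                    kernel_size=5)
    cell.reset_state(batch_size=8)     # allocate (h, c) for a batch of 8
    x = bm.random.randn(8, 100, 3)     # one time step of input (assumed layout)
    h = cell.update(x)                 # h_t, which is also the layer's output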

Methods

__init__(input_shape, in_channels, ...[, ...])

Constructs a 1-D convolutional LSTM.

clear_input()

cpu()

Move all variables to the CPU device.

cuda()

Move all variables to the GPU device.

get_delay_data(identifier, delay_step, *indices)

Get delay data according to the provided delay steps.

load_state_dict(state_dict[, warn, compatible])

Copy parameters and buffers from state_dict into this module and its descendants.

load_states(filename[, verbose])

Load the model states.

nodes([method, level, include_self])

Collect all child nodes.

register_delay(identifier, delay_step, ...)

Register a delay variable.

register_implicit_nodes(*nodes[, node_cls])

register_implicit_vars(*variables[, var_cls])

reset(*args, **kwargs)

Reset all variables in the model.

reset_local_delays([nodes])

Reset local delay variables.

reset_state([batch_size])

Reset the states in the model.

save_states(filename[, variables])

Save the model states.

state_dict()

Returns a dictionary containing the whole state of the module.

to(device)

Move all variables to the given device.

tpu()

Move all variables to the TPU device.

train_vars([method, level, include_self])

The shortcut for retrieving all trainable variables.

tree_flatten()

Flattens the object as a PyTree.

tree_unflatten(aux, dynamic_values)

Unflatten the data to construct an object of this class.

unique_name([name, type_])

Get the unique name for this object.

update(x)

The function to specify the updating rule.

update_local_delays([nodes])

Update local delay variables.

vars([method, level, include_self, ...])

Collect all variables in this node and the children nodes.
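As a brief sketch of the state-handling methods above, reusing the cell from the usage sketch earlier (the filename and on-disk format are illustrative; check the formats your BrainPy version supports):

    sd = cell.state_dict()            # snapshot every variable into a dict
    cell.load_state_dict(sd)          # restore the snapshot

    cell.save_states('conv_lstm.h5')  # persist states to disk (format assumed)
    cell.load_states('conv_lstm.h5')  # load them back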

Attributes

global_delay_data

Global delay data, which stores the delay variables and corresponding delay targets.

mode

Mode of the model, used to control the model's behaviors.

name

Name of the model.