# brainpy.dyn.layers.LSTM

class brainpy.dyn.layers.LSTM(num_in, num_out, Wi_initializer=XavierNormal(scale=1.0, mode=fan_avg, in_axis=-2, out_axis=-1, distribution=truncated_normal, seed=3417531), Wh_initializer=XavierNormal(scale=1.0, mode=fan_avg, in_axis=-2, out_axis=-1, distribution=truncated_normal, seed=3917797), b_initializer=ZeroInit, state_initializer=ZeroInit, activation='tanh', mode=TrainingMode, train_state=False, name=None)

Long short-term memory (LSTM) RNN core.

The implementation is based on Zaremba et al. (2014) [1]. Given the input $$x_t$$ and the previous state $$(h_{t-1}, c_{t-1})$$, the core computes

$\begin{split}\begin{array}{ll} i_t = \sigma(W_{ii} x_t + W_{hi} h_{t-1} + b_i) \\ f_t = \sigma(W_{if} x_t + W_{hf} h_{t-1} + b_f) \\ g_t = \tanh(W_{ig} x_t + W_{hg} h_{t-1} + b_g) \\ o_t = \sigma(W_{io} x_t + W_{ho} h_{t-1} + b_o) \\ c_t = f_t c_{t-1} + i_t g_t \\ h_t = o_t \tanh(c_t) \end{array}\end{split}$

where $$i_t$$, $$f_t$$, $$o_t$$ are input, forget and output gate activations, and $$g_t$$ is a vector of cell updates.

The output is equal to the new hidden state, $$h_t$$.
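The update equations above can be sketched in plain NumPy. This is a minimal illustration, not BrainPy's actual implementation: the function name `lstm_step`, the stacking of the four gates along the last axis in the order (i, f, g, o), and the sizes used below are all assumptions made for the example.

```python
import numpy as np


def sigmoid(z):
    """Logistic function used by the input, forget and output gates."""
    return 1.0 / (1.0 + np.exp(-z))


def lstm_step(x_t, h_prev, c_prev, W_i, W_h, b):
    """One LSTM step.

    Assumed shapes (hypothetical layout, gates ordered i, f, g, o):
      x_t: (batch, num_in), h_prev and c_prev: (batch, num_out),
      W_i: (num_in, 4 * num_out), W_h: (num_out, 4 * num_out), b: (4 * num_out,)
    """
    z = x_t @ W_i + h_prev @ W_h + b
    z_i, z_f, z_g, z_o = np.split(z, 4, axis=-1)
    i_t = sigmoid(z_i)               # input gate
    f_t = sigmoid(z_f)               # forget gate
    g_t = np.tanh(z_g)               # candidate cell update
    o_t = sigmoid(z_o)               # output gate
    c_t = f_t * c_prev + i_t * g_t   # new cell state
    h_t = o_t * np.tanh(c_t)         # new hidden state, also the output
    return h_t, c_t


# Hypothetical sizes and random weights, for illustration only.
rng = np.random.default_rng(0)
batch, num_in, num_out = 2, 3, 4
W_i = 0.1 * rng.standard_normal((num_in, 4 * num_out))
W_h = 0.1 * rng.standard_normal((num_out, 4 * num_out))
b = np.zeros(4 * num_out)

# Zero initial state, then five time steps of random input.
h = np.zeros((batch, num_out))
c = np.zeros((batch, num_out))
for x_t in rng.standard_normal((5, batch, num_in)):
    h, c = lstm_step(x_t, h, c, W_i, W_h, b)
```

Because $$h_t = o_t \tanh(c_t)$$ with $$o_t \in (0, 1)$$, every entry of `h` stays strictly inside $$(-1, 1)$$, while `c` is not bounded in the same way.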

Notes

Forget gate initialization: Following Jozefowicz et al. (2015) [2], we add 1.0 to $$b_f$$ after initialization in order to reduce the amount of forgetting at the beginning of training.
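The effect of this offset can be illustrated with a small NumPy sketch. It assumes zero input and zero hidden state, so the forget-gate pre-activation reduces to the bias alone. The variable names and the hidden size are hypothetical and do not mirror BrainPy's internals.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


num_out = 4                    # hypothetical hidden size
b_f = np.zeros(num_out)        # forget-gate bias from a zero initializer
b_f_shifted = b_f + 1.0        # the +1.0 offset described in the note

# With zero input and zero hidden state, f_t = sigmoid(b_f).
f_plain = sigmoid(b_f)             # 0.5: half of c_{t-1} is kept per step
f_shifted = sigmoid(b_f_shifted)   # about 0.73: more of c_{t-1} is kept
```

A larger initial $$f_t$$ means that $$c_t = f_t c_{t-1} + i_t g_t$$ retains more of the previous cell state early in training.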

Parameters
• num_in (int) – The number of input units.

• num_out (int) – The number of hidden units in the node.

• state_initializer (callable, Initializer, bm.ndarray, jax.numpy.ndarray) – The state initializer.

• Wi_initializer (callable, Initializer, bm.ndarray, jax.numpy.ndarray) – The input weight initializer.

• Wh_initializer (callable, Initializer, bm.ndarray, jax.numpy.ndarray) – The hidden weight initializer.

• b_initializer (optional, callable, Initializer, bm.ndarray, jax.numpy.ndarray) – The bias weight initializer.

• activation (str, callable) – The activation function. It can be a string or a callable function. See brainpy.math.activations for more details.

• trainable (bool) – Whether the node is trainable.

References

[1] Zaremba, Wojciech, Ilya Sutskever, and Oriol Vinyals. “Recurrent neural network regularization.” arXiv preprint arXiv:1409.2329 (2014).

[2] Jozefowicz, Rafal, Wojciech Zaremba, and Ilya Sutskever. “An empirical exploration of recurrent network architectures.” In International conference on machine learning, pp. 2342-2350. PMLR, 2015.

__init__(num_in, num_out, Wi_initializer=XavierNormal(scale=1.0, mode=fan_avg, in_axis=-2, out_axis=-1, distribution=truncated_normal, seed=3417531), Wh_initializer=XavierNormal(scale=1.0, mode=fan_avg, in_axis=-2, out_axis=-1, distribution=truncated_normal, seed=3917797), b_initializer=ZeroInit, state_initializer=ZeroInit, activation='tanh', mode=TrainingMode, train_state=False, name=None)

Methods

• __init__(num_in, num_out[, Wi_initializer, ...])

• clear_input()

• get_delay_data(identifier, delay_step, *indices) – Get delay data according to the provided delay steps.

• load_states(filename[, verbose]) – Load the model states.

• nodes([method, level, include_self]) – Collect all children nodes.

• offline_fit(target, fit_record)

• offline_init()

• online_fit(target, fit_record)

• online_init()

• register_delay(identifier, delay_step, ...) – Register delay variable.

• register_implicit_nodes(*nodes, **named_nodes)

• register_implicit_vars(*variables, ...)

• reset([batch_size]) – Reset all variables in the model.

• reset_local_delays([nodes]) – Reset local delay variables.

• reset_state([batch_size]) – Reset the states in the model.

• save_states(filename[, variables]) – Save the model states.

• train_vars([method, level, include_self]) – The shortcut for retrieving all trainable variables.

• unique_name([name, type_]) – Get the unique name for this object.

• update(sha, x) – The function to specify the updating rule.

• update_local_delays([nodes]) – Update local delay variables.

• vars([method, level, include_self]) – Collect all variables in this node and its children nodes.

Attributes

• c – Memory cell.

• global_delay_data

• h – Hidden state.

• mode – Mode of the model, which is useful to control the multiple behaviors of the model.

• name – Name of the model.