brainpy.nn.nodes.ANN.LSTM
- class brainpy.nn.nodes.ANN.LSTM(num_unit, wi_initializer=XavierNormal(scale=1.0, mode=fan_avg, in_axis=-2, out_axis=-1, distribution=truncated_normal, seed=None), wh_initializer=XavierNormal(scale=1.0, mode=fan_avg, in_axis=-2, out_axis=-1, distribution=truncated_normal, seed=None), bias_initializer=ZeroInit, state_initializer=ZeroInit, **kwargs)
Long short-term memory (LSTM) RNN core.
The implementation is based on Zaremba et al. (2014) [1]. Given \(x_t\) and the previous state \((h_{t-1}, c_{t-1})\), the core computes
\[\begin{aligned} i_t &= \sigma(W_{ii} x_t + W_{hi} h_{t-1} + b_i) \\ f_t &= \sigma(W_{if} x_t + W_{hf} h_{t-1} + b_f) \\ g_t &= \tanh(W_{ig} x_t + W_{hg} h_{t-1} + b_g) \\ o_t &= \sigma(W_{io} x_t + W_{ho} h_{t-1} + b_o) \\ c_t &= f_t c_{t-1} + i_t g_t \\ h_t &= o_t \tanh(c_t) \end{aligned}\]
where \(i_t\), \(f_t\), and \(o_t\) are the input, forget, and output gate activations, and \(g_t\) is a vector of cell updates.
The output of the node is the new hidden state, \(h_t\).
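The update above can be sketched in plain NumPy to make the data flow concrete. This is a minimal illustration of the equations, not BrainPy's implementation: the function name `lstm_step`, the stacked weight layout, and the gate ordering are assumptions chosen for brevity, and the products in the last two equations are taken elementwise.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W_i, W_h, b):
    """One LSTM update for a batch of inputs (hypothetical helper).

    x_t: (batch, num_input); h_prev, c_prev: (batch, num_unit)
    W_i: (num_input, 4 * num_unit); W_h: (num_unit, 4 * num_unit)
    b:   (4 * num_unit,)
    The four gate blocks are stacked in the order i, g, f, o,
    which is an arbitrary choice made for this sketch.
    """
    z = x_t @ W_i + h_prev @ W_h + b
    i, g, f, o = np.split(z, 4, axis=-1)
    i_t = sigmoid(i)                   # input gate
    f_t = sigmoid(f)                   # forget gate
    g_t = np.tanh(g)                   # candidate cell update
    o_t = sigmoid(o)                   # output gate
    c_t = f_t * c_prev + i_t * g_t     # new memory cell (elementwise)
    h_t = o_t * np.tanh(c_t)           # new hidden state, also the output
    return h_t, c_t

# Unroll the step over a short random sequence, starting from zero states.
rng = np.random.default_rng(0)
batch, num_input, num_unit = 2, 3, 4
W_i = rng.normal(scale=0.1, size=(num_input, 4 * num_unit))
W_h = rng.normal(scale=0.1, size=(num_unit, 4 * num_unit))
b = np.zeros(4 * num_unit)
h = np.zeros((batch, num_unit))
c = np.zeros((batch, num_unit))
xs = rng.normal(size=(5, batch, num_input))
for x_t in xs:
    h, c = lstm_step(x_t, h, c, W_i, W_h, b)
```

Because \(o_t\) lies in (0, 1) and \(\tanh(c_t)\) lies in (-1, 1), every entry of the output `h` stays strictly inside (-1, 1), while the memory cell `c` is not bounded in the same way.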
Notes
Forget gate initialization: Following Jozefowicz et al. (2015) [2], 1.0 is added to \(b_f\) after initialization to reduce the amount of forgetting at the beginning of training.
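The effect of this shift can be checked numerically. The sketch below assumes a zero-initialized bias and ignores the weight terms, so the forget gate reduces to \(\sigma(b_f)\). The variable names are illustrative and not part of the BrainPy API.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

num_unit = 4
b_f = np.zeros(num_unit)       # bias as produced by a zero initializer
b_f_shifted = b_f + 1.0        # the shift described in the note

# With no contribution from the weights, f_t = sigmoid(b_f).
f_plain = sigmoid(b_f)             # 0.5 for every unit
f_shifted = sigmoid(b_f_shifted)   # about 0.73 for every unit

# Fraction of an initial cell value that survives 10 steps
# when no new input is written into the cell.
retained_plain = f_plain ** 10
retained_shifted = f_shifted ** 10
```

A forget gate closer to 1 means \(c_{t-1}\) is carried forward with less decay, so at the start of training the shifted bias keeps noticeably more of the cell state over many steps than the unshifted one.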
- Parameters
num_unit (int) – The number of hidden units in the node.
state_initializer (callable, Initializer, bm.ndarray, jax.numpy.ndarray) – The state initializer.
wi_initializer (callable, Initializer, bm.ndarray, jax.numpy.ndarray) – The input weight initializer.
wh_initializer (callable, Initializer, bm.ndarray, jax.numpy.ndarray) – The hidden weight initializer.
bias_initializer (optional, callable, Initializer, bm.ndarray, jax.numpy.ndarray) – The bias weight initializer.
activation (str, callable) – The activation function, given either as a string or as a callable. See brainpy.math.activations for more details.
trainable (bool) – Whether the node is trainable.
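To give a feel for the default initializer settings in the signature (`scale=1.0`, `mode=fan_avg`, `in_axis=-2`, `out_axis=-1`, `distribution=truncated_normal`), here is a rough NumPy sketch of a Xavier-style truncated normal draw. It is an assumption-laden illustration, not the `XavierNormal` implementation: the helper name, the two-standard-deviation cutoff, and the stacked `4 * num_unit` weight shapes are all hypothetical, and no variance correction for the truncation is applied.

```python
import numpy as np

def xavier_truncated_normal(shape, scale=1.0, seed=None):
    """Draw weights with standard deviation sqrt(scale / fan_avg) (hypothetical helper)."""
    fan_in, fan_out = shape[-2], shape[-1]    # in_axis=-2, out_axis=-1
    fan_avg = (fan_in + fan_out) / 2.0        # mode=fan_avg
    std = np.sqrt(scale / fan_avg)
    rng = np.random.default_rng(seed)
    w = rng.normal(scale=std, size=shape)
    # Redraw any sample farther than two standard deviations from zero.
    outside = np.abs(w) > 2.0 * std
    while outside.any():
        w[outside] = rng.normal(scale=std, size=int(outside.sum()))
        outside = np.abs(w) > 2.0 * std
    return w

num_input, num_unit = 3, 4
w_i = xavier_truncated_normal((num_input, 4 * num_unit), seed=0)  # input weights
w_h = xavier_truncated_normal((num_unit, 4 * num_unit), seed=1)   # hidden weights
bias = np.zeros(4 * num_unit)                                     # zero-initialized bias
```

Scaling by the average of fan-in and fan-out keeps the magnitude of the pre-activations roughly independent of the layer width, which is the usual motivation for this family of initializers.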
References
- [1] Zaremba, Wojciech, Ilya Sutskever, and Oriol Vinyals. “Recurrent neural network regularization.” arXiv preprint arXiv:1409.2329 (2014).
- [2] Jozefowicz, Rafal, Wojciech Zaremba, and Ilya Sutskever. “An empirical exploration of recurrent network architectures.” In International Conference on Machine Learning, pp. 2342-2350. PMLR, 2015.
- __init__(num_unit, wi_initializer=XavierNormal(scale=1.0, mode=fan_avg, in_axis=-2, out_axis=-1, distribution=truncated_normal, seed=None), wh_initializer=XavierNormal(scale=1.0, mode=fan_avg, in_axis=-2, out_axis=-1, distribution=truncated_normal, seed=None), bias_initializer=ZeroInit, state_initializer=ZeroInit, **kwargs)
Methods
__init__(num_unit[, wi_initializer, ...])
copy([name, shallow]) – Returns a copy of the Node.
feedback(ff_output, **shared_kwargs) – The feedback computation function of a node.
forward(ff[, fb]) – The feedforward computation function of a node.
init_fb_conn() – Initialize the feedback connections.
init_fb_output([num_batch]) – Set the initial node feedback state.
init_ff_conn() – Initialize the feedforward connections.
init_state([num_batch]) – Set the initial node state.
initialize([num_batch]) – Initialize the node.
load_states(filename[, verbose]) – Load the model states.
nodes([method, level, include_self]) – Collect all children nodes.
offline_fit(targets, ffs[, fbs]) – Offline training interface.
online_fit(target, ff[, fb]) – Online training fitting interface.
online_init() – Online training initialization interface.
register_implicit_nodes(nodes)
register_implicit_vars(variables)
save_states(filename[, variables]) – Save the model states.
set_fb_output(state) – Safely set the feedback state of the node.
set_feedback_shapes(fb_shapes)
set_feedforward_shapes(feedforward_shapes)
set_output_shape(shape)
set_state(state) – Safely set the state of the node.
train_vars([method, level, include_self]) – The shortcut for retrieving all trainable variables.
unique_name([name, type_]) – Get the unique name for this object.
vars([method, level, include_self]) – Collect all variables in this node and the children nodes.
Attributes
c – Memory cell.
data_pass – Offline fitting method.
fb_output
feedback_shapes – Output data size.
feedforward_shapes – Input data size.
h – Hidden state.
is_feedback_input_supported
is_feedback_supported
is_initialized
name
output_shape – Output data size.
state – Node current internal state.
state_trainable – Returns if the Node can be trained.
train_state
trainable – Returns if the Node can be trained.