Building Training Models#
In this section, we are going to talk about how to build models for training.
import brainpy as bp
import brainpy.math as bm
bm.set_platform('cpu')
bp.__version__
'2.4.0'
Use built-in models#
The brainpy.DynamicalSystem models provided in BrainPy can be used for model training.
mode settings#
Some built-in models have implemented the training interface. Users can instantiate these models with the parameter mode=bm.TrainingMode() to customize them for training.
For example, brainpy.neurons.LIF is a model commonly used in computational simulation, but it can also be used in training.
# Instantiate a LIF model for simulation
lif = bp.neurons.LIF(1)
lif.mode
NonBatchingMode
# Instantiate a LIF model for training.
# In this mode, the model implements variables and functions
# compatible with BrainPy's training interface.
lif = bp.neurons.LIF(1, mode=bm.TrainingMode())
lif.mode
TrainingMode(batch_size=1)
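In training mode, the model's state variables also carry a leading batch dimension. A quick check on the instance created above (the exact shape is an assumption based on the default batch size of 1):
# V is the membrane potential variable of the trainable LIF model;
# in TrainingMode it is allocated with a leading batch axis.
lif.V.shape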
But some built-in models do not support training.
try:
    bp.layers.NVAR(1, 1, mode=bm.TrainingMode())
except Exception as e:
    print(type(e), e)
<class 'NotImplementedError'> NVAR does not support TrainingMode(batch_size=1). We only support BatchingMode, NonBatchingMode.
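Such a layer can still be instantiated in batching (or non-batching) mode, for example:
# NVAR has no trainable parameters, so it is created in batching mode instead.
nvar = bp.layers.NVAR(1, 1, mode=bm.batching_mode)
nvar.mode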
The mode can also be used to control the weight types. Let's take a dense layer as another example. For a non-trainable dense layer, the weights and bias are Array instances.
l = bp.layers.Dense(3, 4, mode=bm.batching_mode)
l.W
Array(value=DeviceArray([[-0.31531182, -0.07892124, -0.7207848 , -0.79600596],
[ 0.43365675, -0.7257636 , -0.42986184, 0.2427496 ],
[-0.6706509 , 1.0398958 , 0.20784897, 0.53136575]], dtype=float32),
dtype=float32)
l = bp.layers.Dense(3, 4, mode=bm.training_mode)
l.W
TrainVar(value=DeviceArray([[-0.78135514, -0.08054283, 0.35119462, 0.1645825 ],
[ 0.09323493, 0.36790657, -0.47392672, -0.7648337 ],
[-0.9817612 , -0.5418812 , 0.5456801 , -1.2071232 ]], dtype=float32),
dtype=float32)
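Because the weights of a trainable layer are TrainVar instances, they can be gathered for gradient-based optimization. A quick check on the layer created above (train_vars() collects all TrainVar instances of a BrainPy object):
# Collect the trainable variables (the weight matrix and the bias) of the Dense layer.
l.train_vars().keys()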
Moreover, for some recurrent models, e.g., LSTM or GRU, the state can be set to be trainable or not by the train_state argument. When setting train_state=True for the recurrent instance, a new attribute .state2train will be created.
rnn = bp.layers.RNNCell(1, 3, train_state=True, mode=bm.training_mode)
rnn.state2train
TrainVar(value=DeviceArray([0., 0., 0.]), dtype=float32)
Note the difference between .state2train and the original .state: .state2train has no batch axis, and when the node.reset_state() function is called, all values in .state will be filled with .state2train.
rnn.reset_state(batch_size=5)
rnn.state
Variable(value=DeviceArray([[0., 0., 0.],
[0., 0., 0.],
[0., 0., 0.],
[0., 0., 0.],
[0., 0., 0.]]),
dtype=float32)
Naming a node#
For convenience, you can name a layer by specifying the name keyword argument:
bp.layers.Dense(128, 100, name='hidden_layer')
Dense(name=hidden_layer, num_in=128, num_out=100, mode=NonBatchingMode)
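Names must be unique within a running context; reusing an existing name raises an error (a small sketch of this behaviour):
try:
    # 'hidden_layer' has already been used by the layer created above
    bp.layers.Dense(128, 100, name='hidden_layer')
except Exception as e:
    print(type(e), e)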
Initializing parameters#
Many models have parameters. We can set the parameters of a model with the following methods.
Arrays
If an array is provided, it is used unchanged as the parameter variable. For example:
l = bp.layers.Dense(10, 50, W_initializer=bm.random.normal(0, 0.01, size=(10, 50)))
l.W.shape
(10, 50)
Callable function
If a callable function (which receives a shape argument) is provided, the callable will be called with the desired shape to generate suitable initial parameter values. The variable is then initialized with those values. For example:
def init(shape):
    return bm.random.random(shape)
l = bp.layers.Dense(20, 30, W_initializer=init)
l.W.shape
(20, 30)
Instance of brainpy.init.Initializer
If a brainpy.init.Initializer instance is provided, the initial parameter values will be generated with the desired shape by using the Initializer instance. For example:
l = bp.layers.Dense(20, 30, W_initializer=bp.init.Normal(0.01))
l.W.shape
(20, 30)
The weight matrix \(W\) of this dense layer will be initialized using samples from a normal distribution with standard deviation 0.01 (see brainpy.init for more information).
None parameter
Some types of parameter variables can also be set to None at initialization (e.g. biases). In that case, the parameter variable will be omitted. For example, creating a dense layer without biases is done as follows:
l = bp.layers.Dense(20, 100, b_initializer=None)
print(l.b)
None
Customize your models#
Customizing your training models is simple. You just need to subclass brainpy.DynamicalSystem, and implement its update() and reset_state() functions.
Here, we demonstrate the model customization using two examples. The first is a recurrent layer.
class RecurrentLayer(bp.DynamicalSystemNS):
    def __init__(self, num_in, num_out):
        super(RecurrentLayer, self).__init__()
        bp.check.is_subclass(self.mode, (bm.TrainingMode, bm.BatchingMode))

        # define parameters
        self.num_in = num_in
        self.num_out = num_out

        # define variables
        self.state = bm.Variable(bm.zeros((1, num_out)), batch_axis=0)

        # define weights
        self.win = bm.TrainVar(bm.random.normal(0., 1. / num_in ** 0.5, size=(num_in, num_out)))
        self.wrec = bm.TrainVar(bm.random.normal(0., 1. / num_out ** 0.5, size=(num_out, num_out)))

    def reset_state(self, batch_size):
        # this function defines how to reset the model states
        self.state.value = bm.zeros((batch_size, self.num_out))

    def update(self, x):
        # this function defines how the model updates its state and produces its output
        out = bm.dot(x, self.win) + bm.dot(self.state, self.wrec)
        self.state.value = bm.tanh(out)
        return self.state.value
This simple example illustrates many features essential for a training model. The reset_state() function defines how to reset the model states and will be called at the first time step; the update() function defines how the model states evolve and will be called at every time step.
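As a quick sanity check, the layer defined above can be used as follows (the batch size and input values here are arbitrary):
with bm.training_environment():
    layer = RecurrentLayer(10, 20)

layer.reset_state(batch_size=32)    # called once, before the first time step
x = bm.random.rand(32, 10)          # input at a single time step
layer.update(x).shape               # called at every time step; expect (32, 20)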
Another example is the dropout layer, which is useful for demonstrating how to define a model with multiple behaviours.
class Dropout(bp.DynamicalSystemNS):
    def __init__(self, prob: float, seed: int = None, name: str = None):
        super(Dropout, self).__init__(name=name)
        bp.check.is_subclass(self.mode, (bm.TrainingMode, bm.BatchingMode, bm.NonBatchingMode))
        self.prob = prob
        self.rng = bm.random.RandomState(seed=seed)

    def update(self, x):
        if bp.share.load('fit'):
            keep_mask = self.rng.bernoulli(self.prob, x.shape)
            return bm.where(keep_mask, x / self.prob, 0.)
        else:
            return x
Here, the model produces different outputs according to the value of the shared parameter fit.
You can define your own shared parameters, and then provide them when calling the trainer objects (see the following section).
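As an illustration, the built-in fit phase can be toggled by hand through the shared context (a sketch; bp.share.save() is used here to set the shared fit flag, which trainer objects normally do for you):
drop = Dropout(prob=0.5)
x = bm.ones(4)

bp.share.save(fit=True)    # training phase: dropout is applied
print(drop.update(x))

bp.share.save(fit=False)   # evaluation phase: the input passes through unchanged
print(drop.update(x))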
Examples of training models#
In the following, we illustrate several examples of building trainable neural network models.
Artificial neural networks#
BrainPy provides neural network layers which can be used to define artificial neural networks.
Here, let’s define a deep RNN model.
class DeepRNN(bp.DynamicalSystemNS):
    def __init__(self, num_in, num_recs, num_out):
        super(DeepRNN, self).__init__()
        self.l1 = bp.layers.LSTMCell(num_in, num_recs[0])
        self.d1 = bp.layers.Dropout(0.2)
        self.l2 = bp.layers.LSTMCell(num_recs[0], num_recs[1])
        self.d2 = bp.layers.Dropout(0.2)
        self.l3 = bp.layers.LSTMCell(num_recs[1], num_recs[2])
        self.d3 = bp.layers.Dropout(0.2)
        self.l4 = bp.layers.LSTMCell(num_recs[2], num_recs[3])
        self.d4 = bp.layers.Dropout(0.2)
        self.lout = bp.layers.Dense(num_recs[3], num_out)

    def update(self, x):
        x = x >> self.l1 >> self.d1
        x = x >> self.l2 >> self.d2
        x = x >> self.l3 >> self.d3
        x = x >> self.l4 >> self.d4
        return self.lout(x)
with bm.training_environment():
    model = DeepRNN(100, [200, 200, 200, 100], 10)
Note that a difference of model building in BrainPy from PyTorch lies in the shared parameters (i.e., parameters shared across all models, like the time t, the running index i, and the model running phase fit). They are not passed as explicit arguments of the update() function; instead, they are retrieved from the shared context (see bp.share in the Dropout example above), so the arguments of update() can all be customized by users. The details of the model definition specification can be seen in ????
Moreover, it is worth noting that this model only defines the one-step update rule, i.e., how the model state evolves according to the input x at a single time step.
Reservoir computing models#
In this example, we use the built-in models provided in BrainPy to define a reservoir computing model called next-generation reservoir computing (NGRC).
class NGRC(bp.DynamicalSystemNS):
    def __init__(self, num_in, num_out):
        super(NGRC, self).__init__(mode=bm.batching_mode)
        self.r = bp.layers.NVAR(num_in, delay=4, order=2, stride=5, mode=bm.batching_mode)
        self.o = bp.layers.Dense(self.r.num_out, num_out, mode=bm.training_mode)

    def update(self, x):
        return x >> self.r >> self.o
In the above model, brainpy.layers.NVAR is a nonlinear vector autoregression machine, which does not support training. Therefore, we set its mode to batching mode. In contrast, brainpy.layers.Dense has trainable weights for model training.
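For instance, the model can be instantiated as follows (the input and output sizes are arbitrary):
model = NGRC(num_in=3, num_out=3)
# the NVAR reservoir expands the input into delayed polynomial features,
# and only the Dense readout carries trainable weights
print(model.r.num_out)
print(model.o.W.shape)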
Spiking Neural Networks#
Building trainable spiking neural networks in BrainPy is also a piece of cake. We provide commonly used spiking models for traditional dynamics simulation, and most of them can be used for training too.
In the following, we provide an implementation of spiking neural networks in (Neftci, Mostafa, & Zenke, 2019) for surrogate gradient learning.
class SNN(bp.Network):
    def __init__(self, num_in, num_rec, num_out):
        super(SNN, self).__init__()

        # neuron groups
        self.i = bp.neurons.InputGroup(num_in)
        self.r = bp.neurons.LIF(num_rec, tau=10, V_reset=0, V_rest=0, V_th=1.)
        self.o = bp.neurons.LeakyIntegrator(num_out, tau=5)

        # synapse: i->r
        self.i2r = bp.synapses.Exponential(self.i, self.r, bp.conn.All2All(),
                                           output=bp.synouts.CUBA(), tau=10.,
                                           g_max=bp.init.KaimingNormal(scale=20.))
        # synapse: r->o
        self.r2o = bp.synapses.Exponential(self.r, self.o, bp.conn.All2All(),
                                           output=bp.synouts.CUBA(), tau=10.,
                                           g_max=bp.init.KaimingNormal(scale=20.))

    def update(self, tdi, spike):
        self.i2r(tdi, spike)
        self.r2o(tdi)
        self.r(tdi)
        self.o(tdi)
        return self.o.V.value
with bm.training_environment():
    snn = SNN(10, 100, 2)
Note that, because the network is built inside bm.training_environment(), the mode of all models is specified as brainpy.math.TrainingMode.
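A quick check of two of the sub-models confirms this:
# Both the recurrent LIF population and the input-to-recurrent synapse
# were created inside the training environment, so they share the training mode.
print(snn.r.mode, snn.i2r.mode)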