Building Training Models#

In this section, we are going to talk about how to build models for training.

import brainpy as bp
import brainpy.math as bm

bm.set_platform('cpu')

bp.__version__
'2.4.0'

Use built-in models#

The models provided in BrainPy, which are all subclasses of brainpy.DynamicalSystem, can be used for model training.

Mode settings#

Some built-in models have implemented the training interface. Users can instantiate these models with the parameter mode=bm.TrainingMode() (or mode=bm.training_mode) to customize them for training.

For example, brainpy.neurons.LIF is a model commonly used in computational simulation, but it can also be used in training.

# Instantiate a LIF model for simulation

lif = bp.neurons.LIF(1)
lif.mode
NonBatchingMode
# Instantiate a LIF model for training.
# In this mode, the model implements variables and functions
# compatible with BrainPy's training interface.

lif = bp.neurons.LIF(1, mode=bm.TrainingMode())
lif.mode
TrainingMode(batch_size=1)

However, some built-in models do not support training.

try:
    bp.layers.NVAR(1, 1, mode=bm.TrainingMode())
except Exception as e:
    print(type(e), e)
<class 'NotImplementedError'> NVAR does not support TrainingMode(batch_size=1). We only support BatchingMode, NonBatchingMode. 

The mode can also be used to control the weight types. Let’s take a dense layer as another example: for a non-trainable dense layer, the weights and bias are Array instances, while in the training mode they become TrainVar instances.

l = bp.layers.Dense(3, 4, mode=bm.batching_mode)

l.W
Array(value=DeviceArray([[-0.31531182, -0.07892124, -0.7207848 , -0.79600596],
                         [ 0.43365675, -0.7257636 , -0.42986184,  0.2427496 ],
                         [-0.6706509 ,  1.0398958 ,  0.20784897,  0.53136575]],            dtype=float32),
      dtype=float32)
l = bp.layers.Dense(3, 4, mode=bm.training_mode)

l.W
TrainVar(value=DeviceArray([[-0.78135514, -0.08054283,  0.35119462,  0.1645825 ],
                            [ 0.09323493,  0.36790657, -0.47392672, -0.7648337 ],
                            [-0.9817612 , -0.5418812 ,  0.5456801 , -1.2071232 ]],            dtype=float32),
         dtype=float32)

Moreover, for some recurrent models, e.g., LSTM or GRU, the state can be set as trainable or not trainable via the train_state argument. When setting train_state=True for a recurrent instance, a new attribute .state2train will be created.

rnn = bp.dyn.RNNCell(1, 3, train_state=True, mode=bm.training_mode)

rnn.state2train
TrainVar(value=DeviceArray([0., 0., 0.]), dtype=float32)

Note the difference between the .state2train and the original .state:

  1. .state2train has no batch axis.

  2. When calling the node.reset() function, all values in .state are filled with the values of .state2train (see the sketch after the following example).

rnn.reset(batch_size=5)
rnn.state
Variable(value=DeviceArray([[0., 0., 0.],
                            [0., 0., 0.],
                            [0., 0., 0.],
                            [0., 0., 0.],
                            [0., 0., 0.]]),
         dtype=float32)
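
Below is a small sketch of point 2: after modifying .state2train, calling reset() fills every batch row of .state with the trainable initial state (rnn refers to the instance created above).

rnn.state2train.value = bm.ones(3)   # set a new (trainable) initial state
rnn.reset(batch_size=2)              # broadcast it into the batched .state
rnn.state                            # each of the 2 rows now equals [1., 1., 1.]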

Naming a node#

For convenience, you can name a layer by specifying the name keyword argument:

bp.layers.Dense(128, 100, name='hidden_layer')
Dense(name=hidden_layer, num_in=128, num_out=100, mode=NonBatchingMode)

Initializing parameters#

Many models have their own parameters. We can set the parameters of a model in the following ways.

  • Arrays

If an array is provided, this is used unchanged as the parameter variable. For example:

l = bp.layers.Dense(10, 50, W_initializer=bm.random.normal(0, 0.01, size=(10, 50)))

l.W.shape
(10, 50)
  • Callable function

If a callable function (which receives a shape argument) is provided, the callable will be called with the desired shape to generate suitable initial parameter values. The variable is then initialized with those values. For example:

def init(shape):
    return bm.random.random(shape)

l = bp.layers.Dense(20, 30, W_initializer=init)

l.W.shape
(20, 30)
  • Instance of brainpy.init.Initializer

If a brainpy.init.Initializer instance is provided, the initial parameter values will be generated with the desired shape by using the Initializer instance. For example:

l = bp.layers.Dense(20, 30, W_initializer=bp.init.Normal(0.01))

l.W.shape
(20, 30)

The weight matrix \(W\) of this dense layer will be initialized using samples from a normal distribution with standard deviation 0.01 (see brainpy.init for more information).

  • None parameter

Some types of parameter variables can also be set to None at initialization (e.g. biases). In that case, the parameter variable will be omitted. For example, creating a dense layer without biases is done as follows:

l = bp.layers.Dense(20, 100, b_initializer=None)

print(l.b)
None

Customize your models#

Customizing your training models is simple. You just need to subclass brainpy.DynamicalSystem, and implement its update() and reset_state() functions.

Here, we demonstrate the model customization using two examples. The first is a recurrent layer.

class RecurrentLayer(bp.DynamicalSystemNS):
    def __init__(self, num_in, num_out):
        super(RecurrentLayer, self).__init__()

        bp.check.is_subclass(self.mode, (bm.TrainingMode, bm.BatchingMode))

        # define parameters
        self.num_in = num_in
        self.num_out = num_out

        # define variables
        self.state = bm.Variable(bm.zeros((1, num_out)), batch_axis=0)

        # define weights
        self.win = bm.TrainVar(bm.random.normal(0., 1./num_in ** 0.5, size=(num_in, num_out)))
        self.wrec = bm.TrainVar(bm.random.normal(0., 1./num_out ** 0.5, size=(num_out, num_out)))

    def reset_state(self, batch_size):
        # this function defines how to reset the mode states
        self.state.value = bm.zeros((batch_size, self.num_out))

    def update(self, x):
        # this function defines how the model updates its state and produces its output
        out = bm.dot(x, self.win) + bm.dot(self.state, self.wrec)
        self.state.value = bm.tanh(out)
        return self.state.value

This simple example illustrates many features essential for a training model. The reset_state() function defines how to reset the model states; it is called at the first time step. The update() function defines how the model states evolve; it is called at every time step.
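
To see these two functions in action, here is a minimal usage sketch (the layer sizes, batch size, and sequence length are illustrative assumptions): build the layer in a training environment, reset its state once for the batch, and then call update() at every time step.

with bm.training_environment():
    layer = RecurrentLayer(num_in=10, num_out=20)

layer.reset_state(batch_size=8)        # called once, before the first time step
xs = bm.random.rand(5, 8, 10)          # (num_steps, batch, num_in)
for i in range(xs.shape[0]):           # update() is called at every time step
    out = layer.update(xs[i])          # out has shape (8, 20)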

Another example is the dropout layer, which can be useful to demonstrate how to define a model with multiple behaviours.

class Dropout(bp.DynamicalSystemNS):
  def __init__(self, prob: float, seed: int = None, name: str = None):
    super(Dropout, self).__init__(name=name)

    bp.check.is_subclass(self.mode, (bm.TrainingMode, bm.BatchingMode, bm.NonBatchingMode))
    self.prob = prob
    self.rng = bm.random.RandomState(seed=seed)

  def update(self, x):
    if bp.share.load('fit'):
      keep_mask = self.rng.bernoulli(self.prob, x.shape)
      return bm.where(keep_mask, x / self.prob, 0.)
    else:
      return x

Here, the model produces different outputs depending on the value of the shared parameter fit.

You can also define your own shared parameters and then provide their values when calling the trainer objects (see the following section).
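
As a minimal sketch (assuming the bp.share interface used in the update() function above), the fit flag can be set manually before calling the layer:

layer = Dropout(prob=0.8)
x = bm.random.rand(4, 10)

bp.share.save(fit=True)      # fitting phase: the dropout mask is applied
y_train = layer.update(x)

bp.share.save(fit=False)     # evaluation phase: the input passes through unchanged
y_eval = layer.update(x)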

Examples of training models#

In the following, we illustrate several examples to build a trainable neural network model.

Artificial neural networks#

BrainPy provides neural network layers which can be useful to define artificial neural networks.

Here, let’s define a deep RNN model.

class DeepRNN(bp.DynamicalSystemNS):
    def __init__(self, num_in, num_recs, num_out):
        super(DeepRNN, self).__init__()

        self.l1 = bp.layers.LSTMCell(num_in, num_recs[0])
        self.d1 = bp.layers.Dropout(0.2)
        self.l2 = bp.layers.LSTMCell(num_recs[0], num_recs[1])
        self.d2 = bp.layers.Dropout(0.2)
        self.l3 = bp.layers.LSTMCell(num_recs[1], num_recs[2])
        self.d3 = bp.layers.Dropout(0.2)
        self.l4 = bp.layers.LSTMCell(num_recs[2], num_recs[3])
        self.d4 = bp.layers.Dropout(0.2)
        self.lout = bp.layers.Dense(num_recs[3], num_out)

    def update(self, x):
        x = x >> self.l1 >> self.d1
        x = x >> self.l2 >> self.d2
        x = x >> self.l3 >> self.d3
        x = x >> self.l4 >> self.d4
        return self.lout(x)

with bm.training_environment():
    model = DeepRNN(100, [200, 200, 200, 100], 10)

Note that, different from model building in PyTorch, BrainPy models rely on a set of shared parameters (i.e., parameters shared across all models, such as the time t, the running index i, and the model running phase fit). With brainpy.DynamicalSystemNS, these shared parameters are not passed explicitly to the update() function; they are accessed through brainpy.share, as in the Dropout example above. All other arguments of update() can be customized by users. The details of the model definition specification can be seen in ????

Moreover, it is worth noting that this model only defines the one-step update rule, i.e., how the model evolves within a single time step according to the input x.
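
The following is a minimal usage sketch (the batch size, sequence length, and explicit setting of the shared fit flag are illustrative assumptions) of driving this one-step update rule over a whole sequence:

bp.share.save(fit=True)              # run the dropout layers in the fitting phase
model.reset(batch_size=16)           # initialize all LSTM states for this batch
xs = bm.random.rand(50, 16, 100)     # (num_steps, batch, num_in)
outs = [model(xs[i]) for i in range(xs.shape[0])]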

Reservoir computing models#

In this example, we define a reservoir computing model, the next-generation reservoir computing (NG-RC) model, by using the built-in models provided in BrainPy.

class NGRC(bp.DynamicalSystemNS):
  def __init__(self, num_in, num_out):
    super(NGRC, self).__init__(mode=bm.batching_mode)
    self.r = bp.layers.NVAR(num_in, delay=4, order=2, stride=5, mode=bm.batching_mode)
    self.o = bp.layers.Dense(self.r.num_out, num_out, mode=bm.training_mode)

  def update(self, x):
    return x >> self.r >> self.o

In the above model, brainpy.layers.NVAR is a nonlinear vector autoregression machine, which does not support the training mode. Therefore, we define its mode as the batching mode. In contrast, brainpy.layers.Dense has trainable weights for model training, so its mode is set to the training mode.
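
A minimal usage sketch (the input size, output size, and sequence length are illustrative assumptions): since the NVAR layer keeps a delayed history of its inputs, the model states must be reset before feeding a sequence.

model = NGRC(num_in=3, num_out=1)
model.reset(batch_size=1)            # initialize the delayed input history of NVAR
xs = bm.random.rand(20, 1, 3)        # (num_steps, batch, num_in)
outs = [model(xs[i]) for i in range(xs.shape[0])]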

Spiking Neural Networks#

Building trainable spiking neural networks in BrainPy is also straightforward. BrainPy provides commonly used spiking models for traditional dynamics simulation, and most of them can also be used for training.

In the following, we provide an implementation of spiking neural networks in (Neftci, Mostafa, & Zenke, 2019) for surrogate gradient learning.

class SNN(bp.Network):
  def __init__(self, num_in, num_rec, num_out):
    super(SNN, self).__init__()

    # neuron groups
    self.i = bp.neurons.InputGroup(num_in)
    self.r = bp.neurons.LIF(num_rec, tau=10, V_reset=0, V_rest=0, V_th=1.)
    self.o = bp.neurons.LeakyIntegrator(num_out, tau=5)

    # synapse: i->r
    self.i2r = bp.synapses.Exponential(self.i, self.r, bp.conn.All2All(),
                                       output=bp.synouts.CUBA(), tau=10.,
                                       g_max=bp.init.KaimingNormal(scale=20.))
    # synapse: r->o
    self.r2o = bp.synapses.Exponential(self.r, self.o, bp.conn.All2All(),
                                       output=bp.synouts.CUBA(), tau=10.,
                                       g_max=bp.init.KaimingNormal(scale=20.))

  def update(self, tdi, spike):
    self.i2r(tdi, spike)
    self.r2o(tdi)
    self.r(tdi)
    self.o(tdi)
    return self.o.V.value

with bm.training_environment():
    snn = SNN(10, 100, 2)

Note that, within the training environment, the mode of all models is specified as brainpy.math.TrainingMode.
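
As a minimal sketch (the batch size and input statistics are illustrative assumptions), the network can be advanced by a single time step by packing the shared arguments into a dict, matching the update(tdi, spike) signature above:

snn.reset(batch_size=16)                                  # initialize all model states
spikes = (bm.random.rand(16, 10) < 0.2).astype(float)     # binary input spikes
tdi = dict(t=0., dt=bm.get_dt(), i=0)                     # shared arguments for this step
out = snn.update(tdi, spikes)                             # membrane potentials of the readout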