Build Artificial Neural Networks

Artificial neural networks in BrainPy are used to build dynamical systems. Here we only discuss how to build a neural network and how to train it.

The brainpy.simulation.layers module provides various classes representing the layers of a neural network. All of them are subclasses of the brainpy.simulation.layers.Module base class.

import brainpy as bp
bp.set_platform('cpu')

import brainpy.simulation.layers as nn
import brainpy.math.jax as bm
bp.math.use_backend('jax')

Creating a layer

A layer can be created as an instance of a brainpy.layers.Module subclass. For example, a dense layer can be created as follows:

l = nn.Dense(num_hidden=100, num_input=128) 
type(l)
brainpy.simulation.layers.dense.Dense

This creates a dense layer with 100 hidden units, connected to an input layer with 128 dimensions.
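Layer instances are callable on a batch of inputs (this is how they are composed in the MLP example below). As a minimal sketch, assuming a random input batch of shape (32, 128):

x = bm.random.normal(0., 1., size=(32, 128))  # a batch of 32 inputs with 128 features
y = l(x)                                      # forward pass through the dense layer
y.shape                                       # expected: (32, 100)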

Creating a network

Chaining layer instances together allows you to specify your desired network structure.

This can be done by inheriting from brainpy.layers.Module:

class MLP(nn.Module):
    def __init__(self, n_in, n_l1, n_l2, n_out):
        super(MLP, self).__init__()
        
        self.l1 = nn.Dense(num_hidden=n_l1, num_input=n_in)
        self.l2 = nn.Dense(num_hidden=n_l2, num_input=n_l1)
        self.l3 = nn.Dense(num_hidden=n_out, num_input=n_l2)
        
    def update(self, x):
        x = bm.relu(self.l1(x))
        x = bm.relu(self.l2(x))
        x = self.l3(x)
        return x

mlp1 = MLP(10, 50, 100, 2)

Or by using brainpy.layers.Sequential:

mlp2 = nn.Sequential(
    l1=nn.Dense(num_hidden=50, num_input=10),
    r1=nn.Activation('relu'), 
    l2=nn.Dense(num_hidden=100, num_input=50),
    r2=nn.Activation('relu'), 
    l3=nn.Dense(num_hidden=2, num_input=100),
)
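
Both definitions describe the same three-layer perceptron. As a quick sanity check (a sketch; the batch size of 16 is arbitrary), a model instance can be called directly on a batch of inputs, just as it is during training below:

x = bm.random.normal(0., 1., size=(16, 10))  # batch of 16 samples with 10 input features
mlp1(x).shape                                # expected: (16, 2)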

Naming a layer

For convenience, you can name a layer by specifying the name keyword argument:

l_hidden = nn.Dense(num_hidden=50, num_input=10, name='hidden_layer')
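
The given name can then be used to identify the layer, for example in the variable collections gathered from a network. A minimal sketch, assuming the name is exposed as the layer's name attribute:

l_hidden.name  # expected: 'hidden_layer'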

Initializing parameters

Many types of layers, such as brainpy.layers.Dense, have trainable parameters. These are referred to by short names that match the conventions used in modern deep learning literature. For example, a weight matrix will usually be called w, and a bias vector will usually be b.

When creating a layer with trainable parameters, a TrainVar will be created for each of them and initialized automatically. You can optionally specify your own initialization strategy by using keyword arguments that match the parameter variable names. For example:

l = nn.Dense(num_hidden=50, num_input=10, w=bp.initialize.Normal(0.01))

The weight matrix w of this dense layer will be initialized using samples from a normal distribution with standard deviation 0.01 (see brainpy.initialize for more information).
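Since trainable parameters are stored as TrainVar instances, they can be inspected after the layer is built. The following sketch assumes the weight is stored on the layer under the attribute w, matching the parameter name used in the keyword argument:

type(l.w)   # expected: a TrainVar
l.w.shape   # expected: (10, 50), i.e. (num_input, num_hidden)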

There are several ways to manually initialize parameters:

  • Tensors

If a tensor variable instance is provided, this is used unchanged as the parameter variable. For example:

w = bm.random.normal(0, 0.01, size=(10, 50))
nn.Dense(num_hidden=50, num_input=10, w=w)
<brainpy.simulation.layers.dense.Dense at 0x23cff9bb910>

  • Callable

If a callable is provided (e.g. a function or a brainpy.initialize.Initializer instance), the callable will be called with the desired shape to generate suitable initial parameter values. The variable is then initialized with those values. For example:

nn.Dense(num_hidden=50, num_input=10, w=bp.initialize.Normal(0.01))
<brainpy.simulation.layers.dense.Dense at 0x23cff9bf2b0>

Or, using a custom initialization function:

def init_w(shape):
    return bm.random.normal(0, 0.01, shape)

nn.Dense(num_hidden=50, num_input=10, w=init_w)
<brainpy.simulation.layers.dense.Dense at 0x23cff9ac670>

Some types of parameter variables can also be set to None at initialization (e.g. biases). In that case, the parameter variable will be omitted. For example, creating a dense layer without biases is done as follows:

nn.Dense(num_hidden=50, num_input=10, b=None)
<brainpy.simulation.layers.dense.Dense at 0x23cff99fa30>
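
To check which parameters a layer actually holds, you can collect its trainable variables with train_vars(), the same method used when setting up the optimizer below. In this sketch, a layer created with b=None is expected to expose only its weight; the exact keys in the returned collection depend on the layer's automatically generated name:

l_no_bias = nn.Dense(num_hidden=50, num_input=10, b=None)
l_no_bias.train_vars().keys()  # expected to contain only the weight variable, no bias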

Setting up training

Here, we show an example of training the MLP defined above to classify MNIST images.

import numpy as np
import tensorflow as tf

# Data
(X_train, Y_train), (X_test, Y_test) = tf.keras.datasets.mnist.load_data()
num_train, num_test = X_train.shape[0], X_test.shape[0]
num_dim = bp.tools.size2num(X_train.shape[1:])
X_train = np.asarray(X_train.reshape((num_train, num_dim)) / 255.0, dtype=bm.float_)
X_test = np.asarray(X_test.reshape((num_test, num_dim)) / 255.0, dtype=bm.float_)
Y_train = np.asarray(Y_train.flatten(), dtype=bm.int_)
Y_test = np.asarray(Y_test.flatten(), dtype=bm.int_)

# Model and optimizer
model = MLP(n_in=num_dim, n_l1=256, n_l2=128, n_out=10)
opt = bm.optimizers.Momentum(lr=1e-3, train_vars=model.train_vars())

# Gradient function: returns gradients of the trainable variables and the loss value
gv = bm.grad(lambda X, Y: bm.losses.cross_entropy_loss(model(X), Y),
             dyn_vars=model.vars(),
             grad_vars=model.train_vars(),
             return_value=True)

# Jitted training step: compute gradients and update the parameters
@bm.jit
@bm.function(nodes=(model, opt))
def train(x, y):
    grads, loss = gv(x, y)
    opt.update(grads=grads)
    return loss

# Jitted prediction function
predict = bm.jit(lambda X: bm.softmax(model(X)), dyn_vars=model.vars())

# Training
num_batch = 128
for epoch in range(30):
  # Train
  loss = []
  sel = np.arange(len(X_train))
  np.random.shuffle(sel)
  for it in range(0, X_train.shape[0], num_batch):
    l = train(X_train[sel[it:it + num_batch]], Y_train[sel[it:it + num_batch]])
    loss.append(l)

  # Eval
  test_predictions = predict(X_test).argmax(1)
  accuracy = np.array(test_predictions).flatten() == Y_test
  print(f'Epoch {epoch + 1:4d}  Train Loss {np.mean(loss):.3f}  Test Accuracy {100 * np.mean(accuracy):.3f}')
Epoch    1  Train Loss 1.212  Test Accuracy 86.410
Epoch    2  Train Loss 0.467  Test Accuracy 89.810
Epoch    3  Train Loss 0.367  Test Accuracy 90.670
Epoch    4  Train Loss 0.325  Test Accuracy 91.470
Epoch    5  Train Loss 0.298  Test Accuracy 92.220
Epoch    6  Train Loss 0.278  Test Accuracy 92.890
Epoch    7  Train Loss 0.261  Test Accuracy 93.140
Epoch    8  Train Loss 0.248  Test Accuracy 93.530
Epoch    9  Train Loss 0.235  Test Accuracy 93.810
Epoch   10  Train Loss 0.224  Test Accuracy 94.010
Epoch   11  Train Loss 0.214  Test Accuracy 94.100
Epoch   12  Train Loss 0.205  Test Accuracy 94.350
Epoch   13  Train Loss 0.196  Test Accuracy 94.540
Epoch   14  Train Loss 0.189  Test Accuracy 94.680
Epoch   15  Train Loss 0.182  Test Accuracy 94.910
Epoch   16  Train Loss 0.175  Test Accuracy 95.070
Epoch   17  Train Loss 0.169  Test Accuracy 95.190
Epoch   18  Train Loss 0.163  Test Accuracy 95.280
Epoch   19  Train Loss 0.157  Test Accuracy 95.410
Epoch   20  Train Loss 0.153  Test Accuracy 95.570
Epoch   21  Train Loss 0.148  Test Accuracy 95.760
Epoch   22  Train Loss 0.143  Test Accuracy 95.760
Epoch   23  Train Loss 0.139  Test Accuracy 95.930
Epoch   24  Train Loss 0.135  Test Accuracy 95.910
Epoch   25  Train Loss 0.131  Test Accuracy 96.150
Epoch   26  Train Loss 0.128  Test Accuracy 96.110
Epoch   27  Train Loss 0.124  Test Accuracy 96.310
Epoch   28  Train Loss 0.121  Test Accuracy 96.330
Epoch   29  Train Loss 0.118  Test Accuracy 96.410
Epoch   30  Train Loss 0.115  Test Accuracy 96.410
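
After training, the jitted predict function can be reused for inference on new data. A minimal sketch (the choice of the first test image is arbitrary):

probs = predict(X_test[:1])  # class probabilities for the first test image
probs.argmax(1)              # predicted class index
Y_test[0]                    # true label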