Build Artificial Neural Networks
Artificial neural networks in BrainPy are used to build dynamical systems. Here we focus only on how to build a neural network and how to train it.
The brainpy.simulation.layers module provides various classes representing the layers of a neural network. All of them are subclasses of the brainpy.simulation.layers.Module base class. We first import BrainPy and switch to the 'jax' backend:
import brainpy as bp
bp.set_platform('cpu')
import brainpy.simulation.layers as nn
import brainpy.math.jax as bm
bp.math.use_backend('jax')
Creating a layer
A layer can be created as an instance of a brainpy.simulation.layers.Module subclass. For example, a dense layer can be created as follows:
l = nn.Dense(num_hidden=100, num_input=128)
type(l)
brainpy.simulation.layers.dense.Dense
This creates a dense layer with 100 hidden units, connected to an input of dimension 128.
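A layer instance is callable on a batch of inputs (the network examples below rely on this as well). A minimal sketch with made-up random data:
x = bm.random.normal(0., 1., size=(32, 128))  # a batch of 32 inputs, each of dimension 128
y = l(x)  # forward pass; y should have shape (32, 100)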
Creating a network
Chaining layer instances together allows you to specify your desired network structure. This can be done by inheriting from brainpy.simulation.layers.Module:
class MLP(nn.Module):
    def __init__(self, n_in, n_l1, n_l2, n_out):
        super(MLP, self).__init__()
        self.l1 = nn.Dense(num_hidden=n_l1, num_input=n_in)
        self.l2 = nn.Dense(num_hidden=n_l2, num_input=n_l1)
        self.l3 = nn.Dense(num_hidden=n_out, num_input=n_l2)

    def update(self, x):
        x = bm.relu(self.l1(x))
        x = bm.relu(self.l2(x))
        x = self.l3(x)
        return x

mlp1 = MLP(10, 50, 100, 2)
Or by using brainpy.simulation.layers.Sequential:
mlp2 = nn.Sequential(
    l1=nn.Dense(num_hidden=50, num_input=10),
    r1=nn.Activation('relu'),
    l2=nn.Dense(num_hidden=100, num_input=50),
    r2=nn.Activation('relu'),
    l3=nn.Dense(num_hidden=2, num_input=100),
)
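Both definitions describe the same architecture: a batch of 10-dimensional inputs is mapped to 2-dimensional outputs. A quick shape check with made-up random inputs (assuming Sequential instances are callable in the same way as Module subclasses):
x = bm.random.normal(0., 1., size=(16, 10))
print(mlp1(x).shape)  # (16, 2)
print(mlp2(x).shape)  # (16, 2)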
Naming a layer
For convenience, you can name a layer by specifying the name keyword argument:
l_hidden = nn.Dense(num_hidden=50, num_input=10, name='hidden_layer')
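Assuming the layer stores the given name in a name attribute (an assumption here; check the brainpy.simulation.layers.Module API), it can be retrieved later:
print(l_hidden.name)  # 'hidden_layer'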
Initializing parameters
Many types of layers, such as brainpy.simulation.layers.Dense, have trainable parameters. These are referred to by short names that match the conventions used in the modern deep learning literature. For example, a weight matrix will usually be called w, and a bias vector will usually be b.
When creating a layer with trainable parameters, a TrainVar will be created and initialized automatically for each of them. You can optionally specify your own initialization strategy by using keyword arguments that match the parameter variable names. For example:
l = nn.Dense(num_hidden=50, num_input=10, w=bp.initialize.Normal(0.01))
The weight matrix w of this dense layer will be initialized using samples from a normal distribution with standard deviation 0.01 (see brainpy.initialize for more information).
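The automatically created TrainVar instances can be collected with the layer's train_vars() method, which is also how the optimizer receives them in the training example below. The exact key names in the returned collection are an implementation detail:
print(l.train_vars().keys())  # expected to contain entries for the 'w' and 'b' TrainVars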
There are several ways to manually initialize parameters:
Tensors
If a tensor variable instance is provided, this is used unchanged as the parameter variable. For example:
w = bm.random.normal(0, 0.01, size=(10, 50))
nn.Dense(num_hidden=50, num_input=10, w=w)
<brainpy.simulation.layers.dense.Dense at 0x23cff9bb910>
Callables
If a callable is provided (e.g. a function or a brainpy.initialize.Initializer instance), it will be called with the desired shape to generate suitable initial parameter values. The variable is then initialized with those values. For example:
nn.Dense(num_hidden=50, num_input=10, w=bp.initialize.Normal(0.01))
<brainpy.simulation.layers.dense.Dense at 0x23cff9bf2b0>
Or, using a custom initialization function:
def init_w(shape):
    return bm.random.normal(0, 0.01, shape)

nn.Dense(num_hidden=50, num_input=10, w=init_w)
<brainpy.simulation.layers.dense.Dense at 0x23cff9ac670>
Some types of parameter variables can also be set to None at initialization (e.g. biases). In that case, the parameter variable will be omitted. For example, creating a dense layer without biases is done as follows:
nn.Dense(num_hidden=50, num_input=10, b=None)
<brainpy.simulation.layers.dense.Dense at 0x23cff99fa30>
Setting up training
Here we show an example of training the MLP defined above to classify MNIST images.
import numpy as np
import tensorflow as tf

# Data: load MNIST and flatten each 28x28 image into a 784-dimensional vector.
(X_train, Y_train), (X_test, Y_test) = tf.keras.datasets.mnist.load_data()
num_train, num_test = X_train.shape[0], X_test.shape[0]
num_dim = bp.tools.size2num(X_train.shape[1:])
X_train = np.asarray(X_train.reshape((num_train, num_dim)) / 255.0, dtype=bm.float_)
X_test = np.asarray(X_test.reshape((num_test, num_dim)) / 255.0, dtype=bm.float_)
Y_train = np.asarray(Y_train.flatten(), dtype=bm.float_)
Y_test = np.asarray(Y_test.flatten(), dtype=bm.float_)

# Model, optimizer, and gradient function.
model = MLP(n_in=num_dim, n_l1=256, n_l2=128, n_out=10)
opt = bm.optimizers.Momentum(lr=1e-3, train_vars=model.train_vars())
gv = bm.grad(lambda X, Y: bm.losses.cross_entropy_loss(model(X), Y),
             dyn_vars=model.vars(),
             grad_vars=model.train_vars(),
             return_value=True)  # return the loss along with the gradients

# One jit-compiled training step: compute gradients, then update the weights.
@bm.jit
@bm.function(nodes=(model, opt))
def train(x, y):
    grads, loss = gv(x, y)
    opt.update(grads=grads)
    return loss

predict = bm.jit(lambda X: bm.softmax(model(X)), dyn_vars=model.vars())
# Training
num_batch = 128
for epoch in range(30):
    # Train over shuffled mini-batches
    loss = []
    sel = np.arange(len(X_train))
    np.random.shuffle(sel)
    for it in range(0, X_train.shape[0], num_batch):
        l = train(X_train[sel[it:it + num_batch]], Y_train[sel[it:it + num_batch]])
        loss.append(l)

    # Eval on the test set
    test_predictions = predict(X_test).argmax(1)
    accuracy = np.array(test_predictions).flatten() == Y_test
    print(f'Epoch {epoch + 1:4d} Train Loss {np.mean(loss):.3f} Test Accuracy {100 * np.mean(accuracy):.3f}')
Epoch 1 Train Loss 1.212 Test Accuracy 86.410
Epoch 2 Train Loss 0.467 Test Accuracy 89.810
Epoch 3 Train Loss 0.367 Test Accuracy 90.670
Epoch 4 Train Loss 0.325 Test Accuracy 91.470
Epoch 5 Train Loss 0.298 Test Accuracy 92.220
Epoch 6 Train Loss 0.278 Test Accuracy 92.890
Epoch 7 Train Loss 0.261 Test Accuracy 93.140
Epoch 8 Train Loss 0.248 Test Accuracy 93.530
Epoch 9 Train Loss 0.235 Test Accuracy 93.810
Epoch 10 Train Loss 0.224 Test Accuracy 94.010
Epoch 11 Train Loss 0.214 Test Accuracy 94.100
Epoch 12 Train Loss 0.205 Test Accuracy 94.350
Epoch 13 Train Loss 0.196 Test Accuracy 94.540
Epoch 14 Train Loss 0.189 Test Accuracy 94.680
Epoch 15 Train Loss 0.182 Test Accuracy 94.910
Epoch 16 Train Loss 0.175 Test Accuracy 95.070
Epoch 17 Train Loss 0.169 Test Accuracy 95.190
Epoch 18 Train Loss 0.163 Test Accuracy 95.280
Epoch 19 Train Loss 0.157 Test Accuracy 95.410
Epoch 20 Train Loss 0.153 Test Accuracy 95.570
Epoch 21 Train Loss 0.148 Test Accuracy 95.760
Epoch 22 Train Loss 0.143 Test Accuracy 95.760
Epoch 23 Train Loss 0.139 Test Accuracy 95.930
Epoch 24 Train Loss 0.135 Test Accuracy 95.910
Epoch 25 Train Loss 0.131 Test Accuracy 96.150
Epoch 26 Train Loss 0.128 Test Accuracy 96.110
Epoch 27 Train Loss 0.124 Test Accuracy 96.310
Epoch 28 Train Loss 0.121 Test Accuracy 96.330
Epoch 29 Train Loss 0.118 Test Accuracy 96.410
Epoch 30 Train Loss 0.115 Test Accuracy 96.410