brainpy.math.autograd.grad(func, grad_vars=None, dyn_vars=None, argnums=None, holomorphic=False, allow_int=False, reduce_axes=(), has_aux=None, return_value=False)

Automatic gradient computation for functions or class objects.

This gradient function only supports functions that return a scalar. It creates a function that evaluates the gradient of func.

Note that the return value depends on the argument settings (here arg_grads refers to the gradients of “argnums”, and var_grads refers to the gradients of “grad_vars”).

  1. When “grad_vars” is None

  • “has_aux=False” + “return_value=False” => arg_grads.

  • “has_aux=True” + “return_value=False” => (arg_grads, aux_data).

  • “has_aux=False” + “return_value=True” => (arg_grads, loss_value).

  • “has_aux=True” + “return_value=True” => (arg_grads, loss_value, aux_data).

  2. When “grad_vars” is not None and “argnums” is None

  • “has_aux=False” + “return_value=False” => var_grads.

  • “has_aux=True” + “return_value=False” => (var_grads, aux_data).

  • “has_aux=False” + “return_value=True” => (var_grads, loss_value).

  • “has_aux=True” + “return_value=True” => (var_grads, loss_value, aux_data).

  3. When “grad_vars” is not None and “argnums” is not None

  • “has_aux=False” + “return_value=False” => (var_grads, arg_grads).

  • “has_aux=True” + “return_value=False” => ((var_grads, arg_grads), aux_data).

  • “has_aux=False” + “return_value=True” => ((var_grads, arg_grads), loss_value).

  • “has_aux=True” + “return_value=True” => ((var_grads, arg_grads), loss_value, aux_data).
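
The cases above follow one pattern: the gradients come first, followed by loss_value (if return_value is True) and then aux_data (if has_aux is True). A minimal plain-Python sketch of that pattern is given below. Note that expected_return_layout is a hypothetical helper written for illustration, not part of brainpy, and it only returns string labels describing the layout.

```python
def expected_return_layout(has_grad_vars, has_argnums, has_aux, return_value):
    """Return labels describing the documented output structure.

    Hypothetical helper for illustration only; it is not part of brainpy.
    """
    if not has_grad_vars:
        grads = "arg_grads"                  # case 1: only argument gradients
    elif not has_argnums:
        grads = "var_grads"                  # case 2: only variable gradients
    else:
        grads = ("var_grads", "arg_grads")   # case 3: both, variables first

    extras = []
    if return_value:
        extras.append("loss_value")          # the loss comes right after the gradients
    if has_aux:
        extras.append("aux_data")            # auxiliary data always comes last
    if not extras:
        return grads
    return (grads, *extras)


# Case 3 with both flags set.
print(expected_return_layout(True, True, has_aux=True, return_value=True))
```

Unpacking the result of the gradient function in the same order as these labels avoids mixing up loss_value and aux_data.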

Let’s see some examples below.

Before starting, let’s work out what should be provided as grad_vars and what should be listed in argnums. Take the following code as an example:

>>> import brainpy as bp
>>> import brainpy.math as bm
>>> class Example(bp.Base):
...   def __init__(self):
...     super(Example, self).__init__()
...     self.x = bm.TrainVar(bm.zeros(1))
...     self.y = bm.random.rand(10)
...   def __call__(self, z, v):
...     t1 = self.x * self.y.sum()
...     t2 = bm.tanh(z * v + t1)
...     return t2.mean()
...
>>> # This code is equivalent to the following function:
>>> x = bm.TrainVar(bm.zeros(1))
>>> y = bm.random.rand(10)
>>> def f(z, v):
...   t1 = x * y.sum()
...   t2 = bm.tanh(z * v + t1)
...   return t2.mean()

Generally speaking, every variable to differentiate that is not passed as a function argument should be listed in grad_vars, while every variable to differentiate that is passed as a function argument should be declared in argnums. In the code above, to take gradients with respect to self.x and the arguments z and v, we call brainpy.math.grad as:

>>> f = Example()
>>> f_grad = bm.grad(f, grad_vars=f.x, argnums=(0, 1))
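
To make the split between grad_vars and argnums concrete, the following plain-Python sketch mirrors the example with scalars instead of arrays and derives the three gradients by hand. It does not call brainpy; the names manual_grads and x_value, and the fixed values in y, are assumptions made for illustration.

```python
import math

# Closed-over state, standing in for self.x (trainable) and self.y (fixed).
x = 0.0
y = [0.1, 0.2, 0.3, 0.4]


def f(z, v, x_value=None):
    """Scalar version of Example.__call__; x_value overrides the closed-over x."""
    x_used = x if x_value is None else x_value
    return math.tanh(z * v + x_used * sum(y))


def manual_grads(z, v):
    """Return (var_grads, arg_grads), the layout documented for case 3."""
    common = 1.0 - f(z, v) ** 2            # derivative of tanh at the pre-activation
    var_grads = common * sum(y)            # df/dx: x is not an argument, so it plays the grad_vars role
    arg_grads = (common * v, common * z)   # (df/dz, df/dv): arguments, so they play the argnums=(0, 1) role
    return var_grads, arg_grads


var_grads, arg_grads = manual_grads(0.5, 2.0)
print(var_grads, arg_grads)
```

The gradient with respect to x is reachable only because x is named explicitly; z and v are reachable through their argument positions.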


Gradient of a pure function:

>>> import brainpy as bp
>>> grad_tanh = bp.math.grad(bp.math.tanh)
>>> print(grad_tanh(0.2))
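
For reference, the derivative of tanh is 1 - tanh(x)**2, so the gradient at 0.2 should be close to 0.961. The sketch below checks this with the standard library only; tanh_grad and central_difference are hypothetical helpers, not brainpy functions.

```python
import math


def tanh_grad(x):
    # Analytic derivative: d/dx tanh(x) = 1 - tanh(x)**2
    return 1.0 - math.tanh(x) ** 2


def central_difference(func, x, eps=1e-6):
    # Symmetric finite-difference approximation of the derivative of func at x.
    return (func(x + eps) - func(x - eps)) / (2.0 * eps)


analytic = tanh_grad(0.2)
numeric = central_difference(math.tanh, 0.2)
print(analytic, numeric)  # both approximately 0.961043
```
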

Parameters

  • func (function, Base) – Function to be differentiated. Its arguments at positions specified by argnums should be arrays, scalars, or standard Python containers. Argument arrays in the positions specified by argnums must be of inexact (i.e., floating-point or complex) type. It should return a scalar (which includes arrays with shape () but not arrays with shape (1,) etc.)

  • dyn_vars (optional, JaxArray, sequence of JaxArray, dict) – The dynamically changed variables used in func.

  • grad_vars (optional, JaxArray, sequence of JaxArray, dict) – The variables in func whose gradients are taken.

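  • argnums (optional, integer or sequence of integers) – Specifies which positional argument(s) to differentiate with respect to. Defaults to 0 when grad_vars is None; when grad_vars is given and argnums is None, only the gradients of grad_vars are returned.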

  • has_aux (optional, bool) – Indicates whether func returns a pair where the first element is considered the output of the mathematical function to be differentiated and the second element is auxiliary data. Default False.

  • return_value (bool) – Whether to return the loss value. Default False.

  • holomorphic (optional, bool) – Indicates whether func is promised to be holomorphic. If True, inputs and outputs must be complex. Default False.

  • allow_int (optional, bool) – Whether to allow differentiating with respect to integer valued inputs. The gradient of an integer input will have a trivial vector-space dtype (float0). Default False.

  • reduce_axes (optional, tuple of int) – Tuple of axis names. If an axis is listed here, and func implicitly broadcasts a value over that axis, the backward pass will perform a psum of the corresponding gradient. Otherwise, the gradient will be per-example over named axes. For example, if 'batch' is a named batch axis, grad(f, reduce_axes=('batch',)) will create a function that computes the total gradient while grad(f) will create one that computes the per-example gradient.


Returns

func – A function with the same arguments as func that evaluates the gradient of func. If argnums is an integer, the gradient has the same shape and type as the positional argument indicated by that integer. If argnums is a tuple of integers, the gradient is a tuple of values with the same shapes and types as the corresponding arguments. If has_aux is True, a pair of (gradient, auxiliary_data) is returned.

Return type