grad

- class brainpy.math.grad(func=None, grad_vars=None, argnums=None, holomorphic=False, allow_int=False, reduce_axes=(), has_aux=None, return_value=False, dyn_vars=None, child_objs=None)
Automatic gradient computation for functions or class objects.

This gradient function only supports scalar returns. It creates a function which evaluates the gradient of func. Note that the returns differ for different argument settings (where arg_grads refers to the gradients of "argnums", and var_grads refers to the gradients of "grad_vars").

When "grad_vars" is None:

- "has_aux=False" + "return_value=False" => arg_grads.
- "has_aux=True" + "return_value=False" => (arg_grads, aux_data).
- "has_aux=False" + "return_value=True" => (arg_grads, loss_value).
- "has_aux=True" + "return_value=True" => (arg_grads, loss_value, aux_data).

When "grad_vars" is not None and "argnums" is None:

- "has_aux=False" + "return_value=False" => var_grads.
- "has_aux=True" + "return_value=False" => (var_grads, aux_data).
- "has_aux=False" + "return_value=True" => (var_grads, loss_value).
- "has_aux=True" + "return_value=True" => (var_grads, loss_value, aux_data).

When "grad_vars" is not None and "argnums" is not None:

- "has_aux=False" + "return_value=False" => (var_grads, arg_grads).
- "has_aux=True" + "return_value=False" => ((var_grads, arg_grads), aux_data).
- "has_aux=False" + "return_value=True" => ((var_grads, arg_grads), loss_value).
- "has_aux=True" + "return_value=True" => ((var_grads, arg_grads), loss_value, aux_data).
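The branching above can be sketched as a small helper. This is only an illustration of the documented return structures, not BrainPy's actual implementation; here `grads` stands for arg_grads, var_grads, or the (var_grads, arg_grads) pair, depending on the settings:

```python
def assemble_output(grads, loss_value=None, aux_data=None,
                    *, has_aux=False, return_value=False):
    """Mimic the documented return structure of brainpy.math.grad.

    ``grads`` stands for arg_grads, var_grads, or the
    (var_grads, arg_grads) pair, depending on the settings above.
    """
    if has_aux and return_value:
        return grads, loss_value, aux_data
    if has_aux:
        return grads, aux_data
    if return_value:
        return grads, loss_value
    return grads
```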
Let’s see some examples below.

Before we start, let’s figure out what should be provided as grad_vars, and what should be labeled in argnums. Take the following code as an example:

>>> import brainpy as bp
>>> import brainpy.math as bm
>>>
>>> class Example(bp.BrainPyObject):
>>>   def __init__(self):
>>>     super(Example, self).__init__()
>>>     self.x = bm.TrainVar(bm.zeros(1))
>>>     self.y = bm.random.rand(10)
>>>   def __call__(self, z, v):
>>>     t1 = self.x * self.y.sum()
>>>     t2 = bm.tanh(z * v + t1)
>>>     return t2.mean()
>>>
>>> # This class is equivalent to the following function:
>>>
>>> x = bm.TrainVar(bm.zeros(1))
>>> y = bm.random.rand(10)
>>> def f(z, v):
>>>   t1 = x * y.sum()
>>>   t2 = bm.tanh(z * v + t1)
>>>   return t2.mean()
Generally speaking, all gradient variables that are not provided in the function arguments should be labeled as grad_vars, while all gradient variables provided in the function arguments should be declared in argnums. In the code above, to take gradients of self.x and of the arguments z and v, we should call brainpy.math.grad as:

>>> f = Example()
>>> f_grad = bm.grad(f, grad_vars=f.x, argnums=(0, 1))
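Since brainpy.math.grad builds on jax.grad, the same computation can be sketched in plain JAX by passing the trainable variable x as an explicit first argument. This is a hedged analogue, not BrainPy's mechanism, and the fixed y below replaces the random one for reproducibility:

```python
import jax
import jax.numpy as jnp

# Fixed stand-in for the random y above, so the example is reproducible.
y = jnp.linspace(0.0, 1.0, 10)

def f(x, z, v):
    t1 = x * y.sum()          # x plays the role of the TrainVar self.x
    t2 = jnp.tanh(z * v + t1)
    return t2.mean()          # scalar output, as grad requires

# grad_vars=f.x, argnums=(0, 1) roughly corresponds to differentiating
# with respect to x, z, and v here:
f_grad = jax.grad(f, argnums=(0, 1, 2))
x_grad, z_grad, v_grad = f_grad(jnp.zeros(1), 0.5, 0.5)
```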
Examples

Grad for a pure function:

>>> import brainpy as bp
>>> grad_tanh = bp.math.grad(bp.math.tanh)
>>> print(grad_tanh(0.2))
0.961043
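To illustrate argnums with more than one argument, here is a minimal sketch using jax.grad, which brainpy.math.grad wraps; the function loss is a made-up toy example:

```python
import jax

def loss(w, b):
    # A toy quadratic loss: (2w + b)^2
    return (w * 2.0 + b) ** 2

# With a tuple of argnums, the returned gradient is a tuple with one
# entry per differentiated argument.
dw, db = jax.grad(loss, argnums=(0, 1))(1.0, 0.5)
# dw = 2 * (2w + b) * 2 = 10.0 and db = 2 * (2w + b) = 5.0 at (1.0, 0.5)
```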
- Parameters:
  - func (callable, function, BrainPyObject) – Function to be differentiated. Its arguments at positions specified by argnums should be arrays, scalars, or standard Python containers. Argument arrays in the positions specified by argnums must be of inexact (i.e., floating-point or complex) type. It should return a scalar (which includes arrays with shape () but not arrays with shape (1,), etc.).
  - grad_vars (optional, ArrayType, sequence of ArrayType, dict) – The variables in func whose gradients are taken.
  - argnums (optional, integer or sequence of integers) – Specifies which positional argument(s) to differentiate with respect to (default 0).
  - has_aux (optional, bool) – Indicates whether func returns a pair where the first element is considered the output of the mathematical function to be differentiated and the second element is auxiliary data. Default False.
  - return_value (bool) – Whether to return the loss value.
  - holomorphic (optional, bool) – Indicates whether func is promised to be holomorphic. If True, inputs and outputs must be complex. Default False.
  - allow_int (optional, bool) – Whether to allow differentiating with respect to integer-valued inputs. The gradient of an integer input will have a trivial vector-space dtype (float0). Default False.
  - reduce_axes (optional, tuple of int) – Tuple of axis names. If an axis is listed here, and func implicitly broadcasts a value over that axis, the backward pass will perform a psum of the corresponding gradient. Otherwise, the gradient will be per-example over named axes. For example, if 'batch' is a named batch axis, grad(f, reduce_axes=('batch',)) will create a function that computes the total gradient while grad(f) will create one that computes the per-example gradient.
  - dyn_vars (optional, ArrayType, sequence of ArrayType, dict) – The dynamically changed variables used in func. Deprecated since version 2.4.0: No longer need to provide dyn_vars. This function is capable of automatically collecting the dynamical variables used in the target func.
  - child_objs (optional, BrainPyObject, sequence, dict) – New in version 2.3.1. Deprecated since version 2.4.0: No longer need to provide child_objs. This function is capable of automatically collecting the children objects used in the target func.
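The has_aux behavior described above can be sketched with jax.grad, which brainpy.math.grad wraps; the function and its auxiliary dict here are hypothetical:

```python
import jax

def loss_with_aux(w):
    pred = w * 3.0
    # Return (scalar_loss, auxiliary_data); only the first element
    # is differentiated, the second is passed through untouched.
    return (pred - 1.0) ** 2, {"pred": pred}

g, aux = jax.grad(loss_with_aux, has_aux=True)(1.0)
# g = 2 * (3w - 1) * 3 = 12.0 at w = 1.0; aux carries pred unchanged.
```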
- Returns:
  func – A function with the same arguments as func, that evaluates the gradient of func. If argnums is an integer then the gradient has the same shape and type as the positional argument indicated by that integer. If argnums is a tuple of integers, the gradient is a tuple of values with the same shapes and types as the corresponding arguments. If has_aux is True then a pair of (gradient, auxiliary_data) is returned.
- Return type:
  GradientTransform