brainpy.layers.GroupNorm

class brainpy.layers.GroupNorm(num_groups, num_channels, epsilon=1e-05, affine=True, bias_initializer=ZeroInit, scale_initializer=OneInit(value=1.0), mode=None, name=None)

Group normalization layer.

\[y = \frac{x - \mathrm{E}[x]}{ \sqrt{\mathrm{Var}[x] + \epsilon}} * \gamma + \beta\]

This layer divides the channels into groups and normalizes the features within each group. Its computation is independent of the batch size. The number of channels must be divisible by the number of groups.

The input data should have shape (b, d1, d2, …, c), where b denotes the batch size, d1, d2, … denote the spatial dimensions, and c denotes the feature (channel) size.
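
As an illustration of the formula above, the mean and variance are computed per sample over each group of channels. This is a minimal sketch of that computation, assuming the channel axis is split into contiguous groups; the layer's internal implementation may differ in detail:

>>> import brainpy.math as bm
>>> x = bm.random.randn(20, 10, 10, 6)
>>> groups = 3
>>> # split the channel axis into (groups, channels_per_group)
>>> xg = x.reshape(20, 10, 10, groups, 6 // groups)
>>> # statistics over all non-batch, non-group axes
>>> mean = xg.mean(axis=(1, 2, 4), keepdims=True)
>>> var = xg.var(axis=(1, 2, 4), keepdims=True)
>>> y = ((xg - mean) / bm.sqrt(var + 1e-5)).reshape(x.shape)
>>> # when affine=True, the learnable per-channel gamma and beta are then applied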

Parameters:
  • num_groups (int) – The number of groups. It should be a factor of the number of channels.

  • num_channels (int) – The number of channels expected in input.

  • epsilon (float) – A value added to the denominator for numerical stability. Default: 1e-5.

  • affine (bool) – If True, this module has learnable per-channel affine parameters, initialized to ones (for scales) and zeros (for biases). Default: True.

  • bias_initializer (Initializer, ArrayType, Callable) – An initializer generating the initial translation (bias) parameters; see the examples below.

  • scale_initializer (Initializer, ArrayType, Callable) – An initializer generating the initial scaling (weight) parameters.

Examples

>>> import brainpy as bp
>>> import brainpy.math as bm
>>> input = bm.random.randn(20, 10, 10, 6)
>>> # Separate 6 channels into 3 groups
>>> m = bp.layers.GroupNorm(3, 6)
>>> # Separate 6 channels into 6 groups (equivalent to InstanceNorm)
>>> m = bp.layers.GroupNorm(6, 6)
>>> # Put all 6 channels into a single group (equivalent to LayerNorm)
>>> m = bp.layers.GroupNorm(1, 6)
>>> # Activating the module
>>> output = m(input)
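
The affine parameters can also be configured explicitly through the initializer arguments. This is a minimal sketch; bp.init.ZeroInit and bp.init.OneInit are assumed here to be the initializer classes referenced by the default values:

>>> m = bp.layers.GroupNorm(3, 6,
...                         affine=True,
...                         bias_initializer=bp.init.ZeroInit(),
...                         scale_initializer=bp.init.OneInit(value=1.))
>>> output = m(bm.random.randn(20, 10, 10, 6))
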
__init__(num_groups, num_channels, epsilon=1e-05, affine=True, bias_initializer=ZeroInit, scale_initializer=OneInit(value=1.0), mode=None, name=None)

Methods

__init__(num_groups, num_channels[, ...])

clear_input()

cpu()

Move all variables into the CPU device.

cuda()

Move all variables into the GPU device.

get_delay_data(identifier, delay_step, *indices)

Get delay data according to the provided delay steps.

load_state_dict(state_dict[, warn, compatible])

Copy parameters and buffers from state_dict into this module and its descendants.

load_states(filename[, verbose])

Load the model states.

nodes([method, level, include_self])

Collect all children nodes.

register_delay(identifier, delay_step, ...)

Register delay variable.

register_implicit_nodes(*nodes[, node_cls])

register_implicit_vars(*variables[, var_cls])

reset(*args, **kwargs)

Reset function, which resets all variables in the model.

reset_local_delays([nodes])

Reset local delay variables.

reset_state(*args, **kwargs)

Reset function, which resets the states in the model.

save_states(filename[, variables])

Save the model states.

state_dict()

Returns a dictionary containing the whole state of the module (see the example after this list).

to(device)

Move all variables into the given device.

tpu()

Move all variables into the TPU device.

train_vars([method, level, include_self])

The shortcut for retrieving all trainable variables.

tree_flatten()

Flattens the object as a PyTree.

tree_unflatten(aux, dynamic_values)

Unflatten the data to construct an object of this class.

unique_name([name, type_])

Get the unique name for this object.

update(x)

The function to specify the updating rule.

update_local_delays([nodes])

Update local delay variables.

vars([method, level, include_self, ...])

Collect all variables in this node and the children nodes.
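
For instance, the state-dictionary methods listed above can be used to copy the layer's variables from one instance to another. This is a minimal sketch based on the signatures shown in this table:

>>> import brainpy as bp
>>> m1 = bp.layers.GroupNorm(3, 6)
>>> m2 = bp.layers.GroupNorm(3, 6)
>>> sd = m1.state_dict()        # collect the variables of m1
>>> m2.load_state_dict(sd)      # copy them into m2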

Attributes

global_delay_data

Global delay data, which stores the delay variables and corresponding delay targets.

mode

Mode of the model, which is used to control its behaviors.

name

Name of the model.