CosineAnnealingWarmRestarts

class brainpy.optim.CosineAnnealingWarmRestarts(lr, num_call_per_epoch, T_0, T_mult=1, eta_min=0.0, last_epoch=-1, last_call=-1)
Set the learning rate of each parameter group using a cosine annealing schedule, where \(\eta_{max}\) is set to the initial lr, \(T_{cur}\) is the number of epochs since the last restart, and \(T_{i}\) is the number of epochs between two warm restarts in SGDR:

\[\eta_t = \eta_{min} + \frac{1}{2}(\eta_{max} - \eta_{min})\left(1 + \cos\left(\frac{T_{cur}}{T_{i}}\pi\right)\right)\]

When \(T_{cur}=T_{i}\), set \(\eta_t = \eta_{min}\). When \(T_{cur}=0\) after restart, set \(\eta_t=\eta_{max}\).

It has been proposed in SGDR: Stochastic Gradient Descent with Warm Restarts.
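To make the restart behavior concrete, the sketch below reimplements the formula above in plain Python. It is an illustration of the schedule, not the class internals; the function name ``sgdr_lr`` and its default arguments are made up for this example. It maps a global step index to a learning rate by first locating the current cycle and then applying the cosine curve within it.

```python
import math

def sgdr_lr(step, lr=0.1, T_0=10, T_mult=2, eta_min=0.0):
    """Illustrative reimplementation of the SGDR schedule (not the
    library internals): map a global step index to a learning rate."""
    eta_max = lr
    T_i, T_cur = T_0, step
    # Walk through completed cycles to locate the current one; each
    # restart multiplies the cycle length T_i by T_mult.
    while T_cur >= T_i:
        T_cur -= T_i
        T_i *= T_mult
    # Cosine annealing within the current cycle.
    return eta_min + 0.5 * (eta_max - eta_min) * (1 + math.cos(math.pi * T_cur / T_i))

print(sgdr_lr(0))   # 0.1 -- T_cur = 0, start of the first cycle (eta_max)
print(sgdr_lr(10))  # 0.1 -- restart: a second cycle of length 20 begins
print(sgdr_lr(29))  # ~0.0006 -- near eta_min at the end of the second cycle
```

Note how the rate jumps back to \(\eta_{max}\) at step 10: the first cycle of length ``T_0 = 10`` has finished, and the second cycle (length ``T_0 * T_mult = 20``) starts from the top of the cosine curve.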

lr: float

Initial learning rate.

num_call_per_epoch: int

The number of times the scheduler is called in each epoch. This is usually the number of batches per training epoch.

T_0: int

Number of iterations for the first restart.

T_mult: int

A factor by which \(T_{i}\) increases after a restart. Default: 1.

eta_min: float

Minimum learning rate. Default: 0.

last_epoch: int

The index of the last epoch. Default: -1.

last_call: int

The index of the last call. Default: -1.
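A minimal usage sketch, assuming BrainPy's usual pattern of passing a scheduler instance as the ``lr`` argument of an optimizer; the toy ``TrainVar`` and the ``Adam`` call are illustrative assumptions, not part of this class's documentation:

```python
import brainpy as bp
import brainpy.math as bm

w = bm.TrainVar(bm.zeros(3))  # a toy trainable variable (assumption)

# Restart every 10 epochs at first, doubling the cycle length after
# each restart, with 100 scheduler calls (batches) per epoch.
scheduler = bp.optim.CosineAnnealingWarmRestarts(
    lr=0.1,                  # initial learning rate, i.e. eta_max
    num_call_per_epoch=100,  # number of batches per training epoch
    T_0=10,                  # first restart after 10 epochs
    T_mult=2,                # each cycle is twice as long as the previous
    eta_min=1e-5,            # floor of the cosine curve
)

optimizer = bp.optim.Adam(lr=scheduler, train_vars={'w': w})
```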