CosineAnnealingWarmRestarts

class brainpy.optim.CosineAnnealingWarmRestarts(lr, num_call_per_epoch, T_0, T_mult=1, eta_min=0.0, last_epoch=-1, last_call=-1)
Set the learning rate of each parameter group using a cosine annealing schedule, where \(\eta_{max}\) is set to the initial lr, \(T_{cur}\) is the number of epochs since the last restart, and \(T_{i}\) is the number of epochs between two warm restarts in SGDR:

\[\eta_t = \eta_{min} + \frac{1}{2}(\eta_{max} - \eta_{min})\left(1 + \cos\left(\frac{T_{cur}}{T_{i}}\pi\right)\right)\]

When \(T_{cur}=T_{i}\), set \(\eta_t = \eta_{min}\). When \(T_{cur}=0\) after restart, set \(\eta_t=\eta_{max}\).

It has been proposed in SGDR: Stochastic Gradient Descent with Warm Restarts.
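To make the restart behavior concrete, the sketch below reimplements the formula above in plain Python. It is an illustration of the schedule, not the class internals; the function name ``sgdr_lr`` and its default arguments are made up for this example. It maps a global step index to a learning rate by first locating the current cycle and then applying the cosine curve within it.

```python
import math

def sgdr_lr(step, lr=0.1, T_0=10, T_mult=2, eta_min=0.0):
    """Illustrative reimplementation of the SGDR schedule (not the
    library internals): map a global step index to a learning rate."""
    eta_max = lr
    T_i, T_cur = T_0, step
    # Walk through completed cycles to locate the current one; each
    # restart multiplies the cycle length T_i by T_mult.
    while T_cur >= T_i:
        T_cur -= T_i
        T_i *= T_mult
    # Cosine annealing within the current cycle.
    return eta_min + 0.5 * (eta_max - eta_min) * (1 + math.cos(math.pi * T_cur / T_i))

print(sgdr_lr(0))   # 0.1 -- T_cur = 0, start of the first cycle (eta_max)
print(sgdr_lr(10))  # 0.1 -- restart: a second cycle of length 20 begins
print(sgdr_lr(29))  # ~0.0006 -- near eta_min at the end of the second cycle
```

Note how the rate jumps back to \(\eta_{max}\) at step 10: the first cycle of length ``T_0 = 10`` has finished, and the second cycle (length ``T_0 * T_mult = 20``) starts from the top of the cosine curve.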

lr: float

Initial learning rate.

num_call_per_epoch: int

The number of times the scheduler is called in each epoch. This is usually the number of batches per training epoch.

T_0: int

Number of iterations for the first restart.

T_mult: int

A factor by which \(T_{i}\) increases after a restart. Default: 1.

eta_min: float

Minimum learning rate. Default: 0.

last_epoch: int

The index of the last epoch. Default: -1.

last_call: int

The index of the last call. Default: -1.
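A minimal usage sketch, assuming BrainPy's usual pattern of passing a scheduler instance as the ``lr`` argument of an optimizer; the toy ``TrainVar`` and the ``Adam`` call are illustrative assumptions, not part of this class's documentation:

```python
import brainpy as bp
import brainpy.math as bm

w = bm.TrainVar(bm.zeros(3))  # a toy trainable variable (assumption)

# Restart every 10 epochs at first, doubling the cycle length after
# each restart, with 100 scheduler calls (batches) per epoch.
scheduler = bp.optim.CosineAnnealingWarmRestarts(
    lr=0.1,                  # initial learning rate, i.e. eta_max
    num_call_per_epoch=100,  # number of batches per training epoch
    T_0=10,                  # first restart after 10 epochs
    T_mult=2,                # each cycle is twice as long as the previous
    eta_min=1e-5,            # floor of the cosine curve
)

optimizer = bp.optim.Adam(lr=scheduler, train_vars={'w': w})
```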