# CosineAnnealingWarmRestarts

class brainpy.optim.CosineAnnealingWarmRestarts(lr, num_call_per_epoch, T_0, T_mult=1, eta_min=0.0, last_epoch=-1, last_call=-1)
Set the learning rate of each parameter group using a cosine annealing schedule, where $$\eta_{max}$$ is set to the initial lr, $$T_{cur}$$ is the number of epochs since the last restart, and $$T_{i}$$ is the number of epochs between two warm restarts in SGDR:

$$\eta_t = \eta_{min} + \frac{1}{2}(\eta_{max} - \eta_{min})\left(1 + \cos\left(\frac{T_{cur}}{T_{i}}\pi\right)\right)$$

When $$T_{cur}=T_{i}$$, set $$\eta_t = \eta_{min}$$. When $$T_{cur}=0$$ after restart, set $$\eta_t=\eta_{max}$$.
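The annealing formula above can be sketched as a plain function (an illustrative helper, not part of the brainpy API):

```python
import math

def cosine_annealing_lr(eta_min, eta_max, t_cur, t_i):
    """Cosine annealing within one cycle:
    eta_t = eta_min + 1/2 * (eta_max - eta_min) * (1 + cos(pi * t_cur / t_i))
    """
    return eta_min + 0.5 * (eta_max - eta_min) * (1.0 + math.cos(math.pi * t_cur / t_i))
```

At `t_cur = 0` this returns `eta_max`, and at `t_cur = t_i` it returns `eta_min`, matching the boundary behavior stated above.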

It has been proposed in SGDR: Stochastic Gradient Descent with Warm Restarts.

lr: float

Initial learning rate.

num_call_per_epoch: int

The number of times the scheduler is called per epoch. This is usually the number of batches in each training epoch.

T_0: int

Number of iterations for the first restart.

T_mult: int

A factor by which $$T_{i}$$ increases after a restart. Default: 1.

eta_min: float

Minimum learning rate. Default: 0.

last_call: int

The index of last call. Default: -1.
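The restart bookkeeping implied by `T_0` and `T_mult` can be sketched as follows (an illustrative helper, not the brainpy implementation): each cycle is `T_mult` times longer than the previous one, and $$T_{cur}$$ is the progress within the current cycle.

```python
def restart_period(step, T_0, T_mult=1):
    """Given a global step count, return (t_cur, t_i): the progress within
    the current cosine cycle and that cycle's length. The first cycle lasts
    T_0 steps; each subsequent cycle is T_mult times longer."""
    t_i = T_0
    while step >= t_i:
        step -= t_i      # consume the completed cycle
        t_i *= T_mult    # next cycle is T_mult times longer
    return step, t_i
```

For example, with `T_0=2` and `T_mult=2` the cycle lengths are 2, 4, 8, ..., so a restart ($$T_{cur}=0$$) occurs at steps 0, 2, 6, 14, and so on.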