What is: Cosine Annealing?
Source | SGDR: Stochastic Gradient Descent with Warm Restarts |
Year | 2016 |
Data Source | CC BY-SA - https://paperswithcode.com |
Cosine Annealing is a type of learning rate schedule that starts with a large learning rate, decreases it relatively rapidly to a minimum value, and then increases it rapidly again. The resetting of the learning rate acts like a simulated restart of the learning process, and the re-use of good weights as the starting point of the restart is referred to as a "warm restart", in contrast to a "cold restart", where a new set of small random numbers may be used as a starting point.
$$\eta_t = \eta^i_{\min} + \frac{1}{2}\left(\eta^i_{\max} - \eta^i_{\min}\right)\left(1 + \cos\left(\frac{T_{cur}}{T_i}\pi\right)\right)$$

where $\eta^i_{\min}$ and $\eta^i_{\max}$ are ranges for the learning rate, and $T_{cur}$ accounts for how many epochs have been performed since the last restart.
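As a minimal sketch of the formula above (function and parameter names are illustrative, not taken from the paper), the learning rate within a single restart cycle could be computed like this, where t_cur counts epochs since the last restart and t_i is the length of the current cycle:

```python
import math

def cosine_annealing_lr(t_cur, t_i, eta_min=0.0, eta_max=0.1):
    """Learning rate at epoch t_cur of a cycle of length t_i epochs.

    Follows eta_min + 0.5 * (eta_max - eta_min) * (1 + cos(pi * t_cur / t_i)),
    i.e. eta_max at the start of the cycle, eta_min at the end.
    """
    return eta_min + 0.5 * (eta_max - eta_min) * (1 + math.cos(math.pi * t_cur / t_i))

# Example: one cycle of 10 epochs, annealing from 0.1 down toward 0.001.
# After epoch 9 a warm restart would reset t_cur to 0, jumping back to eta_max.
for epoch in range(10):
    print(epoch, round(cosine_annealing_lr(epoch, t_i=10, eta_min=0.001, eta_max=0.1), 5))
```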
Text Source: Jason Brownlee
Image Source: Gao Huang