What is: Demon CM?

Source: Demon: Improved Neural Network Training with Momentum Decay
Year: 2000
Data Source: CC BY-SA - https://paperswithcode.com

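Demon CM, or SGD with Momentum and Demon, is the Demon (decaying momentum) rule applied to SGD with momentum. The momentum coefficient $\beta_t$ starts at $\beta_{init}$ and decays to zero over training, where $t$ is the current step and $T$ is the total number of steps. In the update, $\theta_t$ denotes the parameters, $\eta$ the learning rate, $g_t$ the gradient, and $v_t$ the velocity.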

$$\beta_{t} = \beta_{init}\cdot\frac{\left(1-\frac{t}{T}\right)}{\left(1-\beta_{init}\right) + \beta_{init}\left(1-\frac{t}{T}\right)}$$
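
As a quick sanity check on this schedule (a worked rearrangement, not a formula quoted from the source), taking $T$ as the total number of training steps and assuming $0 \le \beta_{init} < 1$, the endpoints and an equivalent form follow directly from the definition:

$$\beta_{0} = \beta_{init}, \qquad \beta_{T} = 0, \qquad \frac{\beta_{t}}{1-\beta_{t}} = \left(1-\frac{t}{T}\right)\frac{\beta_{init}}{1-\beta_{init}}$$

The last identity holds because $1-\beta_{t} = \frac{1-\beta_{init}}{\left(1-\beta_{init}\right) + \beta_{init}\left(1-\frac{t}{T}\right)}$. In other words, the ratio $\frac{\beta_{t}}{1-\beta_{t}}$, which equals the geometric sum $\beta_{t} + \beta_{t}^{2} + \dots$, shrinks linearly in $t$.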

$$\theta_{t+1} = \theta_{t} - \eta g_{t} + \beta_{t}v_{t}$$

$$v_{t+1} = \beta_{t}v_{t} - \eta g_{t}$$
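
The three equations can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the paper's reference implementation: the names `demon_beta`, `demon_cm_minimize`, and `grad_fn` are hypothetical, the parameter is a single scalar, the velocity is assumed to start at zero, and the toy objective $(\theta - 3)^2$ with learning rate 0.1 is chosen only for demonstration.

```python
def demon_beta(t, total_steps, beta_init=0.9):
    """Demon schedule: equals beta_init at t = 0 and 0 at t = total_steps.

    Assumes 0 <= beta_init < 1 so the denominator never vanishes.
    """
    remaining = 1.0 - t / total_steps  # the factor (1 - t/T)
    return beta_init * remaining / ((1.0 - beta_init) + beta_init * remaining)


def demon_cm_minimize(grad_fn, theta, total_steps, lr=0.1, beta_init=0.9):
    """Run total_steps Demon CM updates on a scalar parameter theta."""
    v = 0.0  # assumption: velocity initialised to zero
    for t in range(total_steps):
        beta_t = demon_beta(t, total_steps, beta_init)
        g = grad_fn(theta)
        # Both updates use the old velocity v_t, so theta is updated first.
        theta = theta - lr * g + beta_t * v  # theta_{t+1}
        v = beta_t * v - lr * g              # v_{t+1}
    return theta


if __name__ == "__main__":
    # Toy objective f(theta) = (theta - 3)^2, whose gradient is 2 * (theta - 3).
    theta_final = demon_cm_minimize(lambda th: 2.0 * (th - 3.0),
                                    theta=0.0, total_steps=100)
    print(theta_final)
```

On this toy quadratic the returned value should land close to the minimizer at 3. Note that the order of the two assignments matters: computing `v` first and then reusing it in the `theta` line would apply the momentum term twice relative to the equations above.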