
What is: NADAM?

Year: 2015
Data Source: CC BY-SA - https://paperswithcode.com

NADAM, or Nesterov-accelerated Adaptive Moment Estimation, combines Adam and Nesterov Momentum. The update rule is of the form:

$$\theta_{t+1} = \theta_{t} - \frac{\eta}{\sqrt{\hat{v}_{t}}+\epsilon}\left(\beta_{1}\hat{m}_{t} + \frac{(1-\beta_{1})g_{t}}{1-\beta^{t}_{1}}\right)$$
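The update rule above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation. It assumes the usual Adam definitions, which the text here does not spell out: m_t and v_t are exponential moving averages of the gradient and the squared gradient, and m̂_t and v̂_t are their bias-corrected versions. It also reads the numerator coefficient as 1 - β_1, matching Adam's first-moment update. The function name `nadam_step`, the list-based parameter layout, and the default hyperparameter values are all illustrative choices.

```python
import math


def nadam_step(theta, grad, m, v, t, lr=0.002, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one NADAM update to a list of scalar parameters.

    theta, grad, m, v are lists of equal length; t is the 1-based step count.
    Returns the updated (theta, m, v) as new lists.
    """
    new_theta, new_m, new_v = [], [], []
    for p, g, m_i, v_i in zip(theta, grad, m, v):
        # Assumed Adam-style moment estimates (not stated in the text above).
        m_i = beta1 * m_i + (1.0 - beta1) * g
        v_i = beta2 * v_i + (1.0 - beta2) * g * g

        # Bias correction for the zero-initialized moments.
        m_hat = m_i / (1.0 - beta1 ** t)
        v_hat = v_i / (1.0 - beta2 ** t)

        # Nesterov-style look-ahead: blend the corrected momentum with
        # the current gradient, as in the bracketed term of the update rule.
        lookahead = beta1 * m_hat + (1.0 - beta1) * g / (1.0 - beta1 ** t)

        new_theta.append(p - lr / (math.sqrt(v_hat) + eps) * lookahead)
        new_m.append(m_i)
        new_v.append(v_i)
    return new_theta, new_m, new_v


def minimize_quadratic(start=5.0, steps=2000, lr=0.01):
    """Toy usage: minimize f(x) = x^2, whose gradient is 2x."""
    theta, m, v = [start], [0.0], [0.0]
    for t in range(1, steps + 1):
        grad = [2.0 * theta[0]]
        theta, m, v = nadam_step(theta, grad, m, v, t, lr=lr)
    return theta[0]
```

Under these assumptions, the very first step with zero-initialized moments moves each parameter by roughly (1 + β_1)η in the direction opposite the gradient sign, because m̂_1 = g_1 and v̂_1 = g_1². That is the look-ahead term adding β_1 m̂_t on top of what a plain Adam step would apply.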

Image Source: Incorporating Nesterov Momentum into Adam