
What is: Kaiming Initialization?

Source: Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification
Year: 2015
Data Source: CC BY-SA - https://paperswithcode.com

Kaiming Initialization, or He Initialization, is an initialization method for neural networks that takes into account the non-linearity of activation functions such as ReLU.

A proper initialization method should avoid exponentially reducing or magnifying the magnitudes of input signals. For a layer $l$ with $n_l$ incoming connections (the fan-in) and weights $w_l$, the authors derive the condition that prevents this:

\frac{1}{2} n_{l} \text{Var}\left[w_{l}\right] = 1
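As a quick empirical check, the minimal NumPy sketch below (layer width, depth, and batch size are illustrative choices, not from the paper) propagates unit-variance inputs through a deep stack of ReLU layers. With $\text{Var}[w_l] = 2/n_l$ the pre-activation variance stays stable across depth, whereas $\text{Var}[w_l] = 1/n_l$ makes it decay exponentially:

```python
import numpy as np

rng = np.random.default_rng(0)
n, depth = 512, 50                      # fan-in and network depth (illustrative)
x0 = rng.standard_normal((n, 1000))     # batch of unit-variance inputs

for gain in (1.0, 2.0):                 # Var[w_l] = gain / n_l; gain=2 is the Kaiming condition
    h = x0
    for _ in range(depth):
        W = rng.standard_normal((n, n)) * np.sqrt(gain / n)
        y = W @ h                       # pre-activation of this layer
        h = np.maximum(0.0, y)          # ReLU
    print(f"gain={gain}: pre-activation variance after {depth} layers = {y.var():.3e}")
# gain=1.0 roughly halves the variance at every layer (ReLU zeroes half the signal),
# so it collapses toward zero; gain=2.0 keeps the variance O(1) at any depth.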

This implies an initialization scheme of:

w_{l} \sim \mathcal{N}\left(0, 2/n_{l}\right)

That is, a zero-centered Gaussian with standard deviation \sqrt{2/n_{l}} (whose square is the variance shown in the equation above). Biases are initialized to 0.
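A minimal sketch of the scheme itself, assuming a plain fully-connected layer; the helper name `kaiming_normal` and its parameters are illustrative, not from the paper:

```python
import numpy as np

def kaiming_normal(fan_in: int, fan_out: int, rng=None):
    """Sample a (fan_out, fan_in) weight matrix from N(0, 2/fan_in),
    i.e. std = sqrt(2/fan_in), and return zero-initialized biases."""
    rng = np.random.default_rng() if rng is None else rng
    W = rng.standard_normal((fan_out, fan_in)) * np.sqrt(2.0 / fan_in)
    b = np.zeros(fan_out)
    return W, b

# Example: initialize a 784 -> 256 layer
W, b = kaiming_normal(fan_in=784, fan_out=256)
```

Most frameworks ship this initializer; PyTorch, for example, exposes it as `torch.nn.init.kaiming_normal_`, whose `mode` argument selects fan-in scaling (the variant above, which preserves forward signal magnitudes) or fan-out scaling (which preserves gradient magnitudes in the backward pass).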