What is: Stochastic Depth?

Stochastic Depth aims to shrink the depth of a network during training, while keeping it unchanged during testing. This is achieved by randomly dropping entire ResBlocks during training and bypassing their transformations through skip connections.

Let $b\_{l} \in \lbrace 0, 1 \rbrace$ denote a Bernoulli random variable that indicates whether the $l$-th ResBlock is active ($b\_{l} = 1$) or inactive ($b\_{l} = 0$). Further, let us denote the “survival” probability of ResBlock $l$ as $p\_{l} = \text{Pr}\left(b\_{l} = 1\right)$. With this definition we can bypass the $l$-th ResBlock by multiplying its function $f\_{l}$ by $b\_{l}$, which extends the update rule to:

$H\_{l} = \text{ReLU}\left(b\_{l}f\_{l}\left(H\_{l-1}\right) + \text{id}\left(H\_{l-1}\right)\right)$

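If $b\_{l} = 1$, this reduces to the original ResNet update and the ResBlock remains unchanged. If $b\_{l} = 0$, the ResBlock reduces to the identity function, $H\_{l} = \text{id}\left(H\_{l-1}\right)$. The outer ReLU has no effect in this case because $H\_{l-1}$ is itself the output of a ReLU and is therefore non-negative.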
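The update rule and its two cases can be sketched in plain Python. This is a minimal illustration rather than a reference implementation: activations are plain lists of floats, `toy_f` is a hypothetical stand-in for a real ResBlock transformation $f\_{l}$, and the survival probability of 0.8 is an arbitrary example value.

```python
import random


def relu(values):
    """Element-wise ReLU on a list of floats."""
    return [max(0.0, v) for v in values]


def sample_gate(p_survive, rng):
    """Draw b_l ~ Bernoulli(p_l): 1 keeps the block, 0 drops it."""
    return 1 if rng.random() < p_survive else 0


def res_block_update(h_prev, f, b):
    """Compute H_l = ReLU(b_l * f_l(H_{l-1}) + id(H_{l-1}))."""
    # When the block is dropped (b == 0), f is not evaluated at all.
    residual = f(h_prev) if b == 1 else [0.0] * len(h_prev)
    return relu([b * r + h for r, h in zip(residual, h_prev)])


def toy_f(h):
    """Hypothetical stand-in for a ResBlock's transformation f_l."""
    return [2.0 * x - 1.0 for x in h]


# Non-negative input, as if produced by a previous ReLU.
h_prev = [0.5, 1.0, 2.0]

active = res_block_update(h_prev, toy_f, b=1)   # ordinary ResNet update
dropped = res_block_update(h_prev, toy_f, b=0)  # identity: equals h_prev

# During training, a fresh gate would be sampled for each block.
rng = random.Random(0)
gates = [sample_gate(0.8, rng) for _ in range(10)]
```

With `b=1` the block computes the usual residual update, and with `b=0` it returns `h_prev` unchanged, matching the two cases described above. Skipping the call to `f` for a dropped block is what makes training cheaper, since the block's forward computation is not performed.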

Source	Deep Networks with Stochastic Depth
Year	2016
Data Source	CC BY-SA - https://paperswithcode.com

Viet-Anh on Software
