What is: Adaptive Dropout?
Source | Adaptive dropout for training deep neural networks |
Year | 2013 |
Data Source | CC BY-SA - https://paperswithcode.com |
Adaptive Dropout is a regularization technique that extends dropout by allowing the dropout probability to differ across units. The intuition is that some hidden units can individually make confident predictions about the presence or absence of an important feature or combination of features. Standard dropout ignores this confidence and drops the unit out 50% of the time.
Denote the activity of unit $j$ in a deep neural network by $a_j$ and assume that its inputs are $\{a_i : i < j\}$. In dropout, $a_j$ is randomly set to zero with probability 0.5. Let $m_j$ be a binary variable that is used to mask the activity $a_j$, so that its value is:

$$a_j = m_j \, g\left(\sum_{i: i < j} w_{j,i} a_i\right)$$

where $w_{j,i}$ is the weight from unit $i$ to unit $j$, $g(\cdot)$ is the activation function, and $a_0 = 1$ accounts for biases. Whereas in standard dropout $m_j$ is Bernoulli with probability $0.5$, adaptive dropout uses adaptive dropout probabilities that depend on input activities:

$$P(m_j = 1 \mid \{a_i : i < j\}) = f\left(\sum_{i: i < j} \pi_{j,i} a_i\right)$$

where $\pi_{j,i}$ is the weight from unit $i$ to unit $j$ in the standout network, or adaptive dropout network, and $f(\cdot)$ is a sigmoidal function. Here 'standout' refers to a binary belief network that is overlaid on a neural network as part of the overall regularization technique.
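The two equations above can be sketched as a single layer's forward pass. This is a minimal NumPy illustration, not the paper's full method: the activation $g$ is assumed to be ReLU, the standout weights `Pi` are kept as a separate parameter matrix (the paper also explores tying them to the network weights), and the test-time scaling by the keep probability is the usual dropout-style expectation approximation.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def adaptive_dropout_layer(a_in, W, b, Pi, c, train=True):
    """One hidden layer with adaptive (standout) dropout.

    a_in  : input activities {a_i}
    W, b  : weights w_{j,i} and bias of the main network
    Pi, c : weights pi_{j,i} and bias of the standout network
            (hypothetical separate parameters for this sketch)
    """
    a = np.maximum(W @ a_in + b, 0.0)   # a_j = g(sum_i w_{j,i} a_i), g = ReLU here
    p_keep = sigmoid(Pi @ a_in + c)     # P(m_j = 1 | inputs) = f(sum_i pi_{j,i} a_i)
    if train:
        # sample the binary mask m_j ~ Bernoulli(p_keep_j) and apply it
        m = (rng.random(p_keep.shape) < p_keep).astype(a.dtype)
        return m * a
    # at test time, replace the stochastic mask by its expectation
    return p_keep * a

# usage: a layer mapping 4 inputs to 3 units
a_in = rng.random(4)
W = rng.standard_normal((3, 4)); b = np.zeros(3)
Pi = rng.standard_normal((3, 4)); c = np.zeros(3)
out_train = adaptive_dropout_layer(a_in, W, b, Pi, c, train=True)
out_test = adaptive_dropout_layer(a_in, W, b, Pi, c, train=False)
```

Note how units whose standout pre-activation is large get a keep probability near 1, while units the standout network is unsure about are dropped more often, in contrast to the fixed 50% rate of standard dropout.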