What is: Attention Dropout?
| Year | 2018 |
| Data Source | CC BY-SA - https://paperswithcode.com |
Attention Dropout is a type of dropout used in attention-based architectures, where elements of the attention weights produced by the softmax are randomly dropped. For example, for scaled dot-product attention, we would drop elements from the first term, the matrix of attention weights:

$$\text{Attention}(Q, K, V) = \text{softmax}\!\left(\frac{QK^{T}}{\sqrt{d_k}}\right)V$$
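As a concrete illustration, here is a minimal PyTorch sketch (not from the original source) of scaled dot-product attention with dropout applied to the softmax-normalized attention weights; the tensor shapes and the `dropout_p` value are illustrative assumptions:

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v, dropout_p=0.1, training=True):
    """Scaled dot-product attention with attention dropout.

    q, k, v: tensors of shape (batch, heads, seq_len, head_dim).
    """
    d_k = q.size(-1)
    # Raw attention scores: QK^T / sqrt(d_k)
    scores = q @ k.transpose(-2, -1) / d_k ** 0.5
    # Normalize scores into attention weights with softmax
    attn = F.softmax(scores, dim=-1)
    # Attention dropout: randomly zero attention weights during training
    attn = F.dropout(attn, p=dropout_p, training=training)
    # Weighted sum of the values
    return attn @ v

# Example usage with assumed shapes: batch=2, heads=8, seq_len=16, head_dim=64
q = k = v = torch.randn(2, 8, 16, 64)
out = scaled_dot_product_attention(q, k, v)
```

Note that dropout is applied after the softmax, so the dropped weights no longer sum to one; as with standard dropout, the surviving weights are rescaled by `1 / (1 - dropout_p)` during training to compensate.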