What is: Dot-Product Attention?
Source | Effective Approaches to Attention-based Neural Machine Translation |
Year | 2015 |
Data Source | CC BY-SA - https://paperswithcode.com |
Dot-Product Attention is an attention mechanism where the alignment score function is calculated as:

$$f_{att}\left(\mathbf{h}_{i}, \mathbf{s}_{j}\right) = \mathbf{h}_{i}^{T}\mathbf{s}_{j}$$

It is equivalent to multiplicative attention without a trainable weight matrix (equivalently, with the weight matrix fixed to the identity). Here $\mathbf{h}_{i}$ refers to the hidden states of the encoder, and $\mathbf{s}_{j}$ to the hidden states of the decoder. The function above is thus a type of alignment score function.
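As a concrete illustration, here is a minimal NumPy sketch of the score function above; the function name and the shapes (a matrix of encoder hidden states and a single decoder hidden state) are illustrative assumptions, not taken from the source:

```python
import numpy as np

# Minimal sketch: score(h_i, s_j) = h_i^T s_j for every encoder position i.
# Assumed shapes: h is (n_src, d) encoder hidden states, s_j is (d,) one decoder state.
def dot_product_scores(h, s_j):
    return h @ s_j  # one alignment score per encoder position, shape (n_src,)
```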
Within a neural network, once we have the alignment scores, we calculate the final attention weights by applying a softmax function to these alignment scores, ensuring they sum to 1.
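A short sketch of this normalization step, continuing the assumed names above. The final weighted sum into a context vector is not described in the text above but is the standard follow-on step in this style of attention, included here for completeness:

```python
import numpy as np

def softmax(x):
    x = x - np.max(x)      # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum()

# Illustrative example: 5 encoder hidden states of size 8, one decoder state.
h = np.random.randn(5, 8)
s_j = np.random.randn(8)

scores = h @ s_j           # dot-product alignment scores, shape (5,)
weights = softmax(scores)  # attention weights, sum to 1
context = weights @ h      # weighted sum of encoder states (standard follow-on step)
```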