What is: Attention-augmented Convolution?
Source | Attention Augmented Convolutional Networks |
Year | 2019 |
Data Source | CC BY-SA - https://paperswithcode.com |
Attention-augmented Convolution is a type of convolution with a two-dimensional relative self-attention mechanism that can replace convolutions as a stand-alone computational primitive for image classification. It employs scaled dot-product attention and multi-head attention, as in Transformers.
It works by concatenating convolutional and attentional feature maps. To see this, consider an original convolution operator with kernel size $k$, $F_{in}$ input filters and $F_{out}$ output filters. The corresponding attention-augmented convolution can be written as:

$$\text{AAConv}(X) = \text{Concat}\left[\text{Conv}(X), \text{MHA}(X)\right]$$

$X$ originates from an input tensor of shape $(H, W, F_{in})$. This is flattened to become $X \in \mathbb{R}^{HW \times F_{in}}$, which is passed into a multi-head attention (MHA) module, as well as a convolution (see above).
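The concatenation described above can be illustrated with a minimal NumPy sketch. It is not the reference implementation: it omits the two-dimensional relative position embeddings entirely, uses a plain stride-1 zero-padded convolution, and assumes the convolution branch produces `f_out - d_v` channels while the attention branch produces `d_v` channels, so the concatenated output has `f_out` channels. All function names, parameter names, and sizes (`aa_conv`, `w_q`, `d_k`, `d_v`, and so on) are hypothetical choices made for this example.

```python
import numpy as np

def conv2d_same(x, w):
    """Stride-1, zero-padded convolution. x: (H, W, F_in); w: (k, k, F_in, C), k odd."""
    k = w.shape[0]
    p = k // 2
    H, W, _ = x.shape
    xp = np.pad(x, ((p, p), (p, p), (0, 0)))
    out = np.zeros((H, W, w.shape[-1]))
    for i in range(k):
        for j in range(k):
            # Each kernel tap adds a shifted view of the input times an (F_in, C) matrix.
            out += xp[i:i + H, j:j + W, :] @ w[i, j]
    return out

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # subtract the row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def multi_head_attention(x, w_q, w_k, w_v, w_o, num_heads):
    """Self-attention over all H*W positions (no position embeddings in this sketch).

    x: (H, W, F_in); w_q, w_k: (F_in, d_k); w_v: (F_in, d_v); w_o: (d_v, d_v).
    """
    H, W, f_in = x.shape
    flat = x.reshape(H * W, f_in)  # flatten the spatial grid to (HW, F_in)

    def split(t):
        # (HW, d) -> (heads, HW, d // heads)
        return t.reshape(H * W, num_heads, -1).transpose(1, 0, 2)

    q, k, v = split(flat @ w_q), split(flat @ w_k), split(flat @ w_v)
    logits = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])  # scaled dot product
    heads = softmax(logits) @ v                               # (heads, HW, d_v // heads)
    merged = heads.transpose(1, 0, 2).reshape(H * W, -1)      # concatenate the heads
    return (merged @ w_o).reshape(H, W, -1)                   # back to (H, W, d_v)

def aa_conv(x, w_conv, w_q, w_k, w_v, w_o, num_heads):
    """Concatenate convolutional and attentional feature maps along the channel axis."""
    conv_out = conv2d_same(x, w_conv)
    attn_out = multi_head_attention(x, w_q, w_k, w_v, w_o, num_heads)
    return np.concatenate([conv_out, attn_out], axis=-1)

# Illustrative sizes only (assumptions for this example).
rng = np.random.default_rng(0)
f_in, f_out, d_k, d_v, num_heads = 8, 16, 8, 4, 2
w_conv = rng.normal(size=(3, 3, f_in, f_out - d_v))
w_q = rng.normal(size=(f_in, d_k))
w_k = rng.normal(size=(f_in, d_k))
w_v = rng.normal(size=(f_in, d_v))
w_o = rng.normal(size=(d_v, d_v))

x = rng.normal(size=(6, 5, f_in))
y = aa_conv(x, w_conv, w_q, w_k, w_v, w_o, num_heads)
print(y.shape)  # (6, 5, 16)
```

Because no weight in this sketch depends on the height or width of the input, the same parameters can be applied to an input of a different spatial size, for example one of shape `(9, 7, f_in)`.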
Similarly to the convolution, the attention augmented convolution 1) is equivariant to translation and 2) can readily operate on inputs of different spatial dimensions.