What is: Multi-Head Linear Attention?
Source | Linformer: Self-Attention with Linear Complexity |
Year | 2020 |
Data Source | CC BY-SA - https://paperswithcode.com |
Multi-Head Linear Attention is a type of linear multi-head self-attention module, proposed with the Linformer architecture. The main idea is to add two linear projection matrices $E_i, F_i \in \mathbb{R}^{n \times k}$ when computing key and value. We first project the original $(n \times d)$-dimensional key and value layers $KW_i^K$ and $VW_i^V$ into $(k \times d)$-dimensional projected key and value layers. We then compute an $(n \times k)$-dimensional context mapping matrix $\bar{P}$ using scaled dot-product attention:

$$
\overline{\mathrm{head}_i} = \mathrm{Attention}\left(QW_i^Q,\; E_i K W_i^K,\; F_i V W_i^V\right)
= \underbrace{\mathrm{softmax}\!\left(\frac{QW_i^Q \left(E_i K W_i^K\right)^{\top}}{\sqrt{d_k}}\right)}_{\bar{P}:\, n \times k} \cdot \underbrace{F_i V W_i^V}_{k \times d}
$$

Finally, we compute context embeddings for each head using $\bar{P} \cdot \left(F_i V W_i^V\right)$.
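As a concrete illustration, below is a minimal sketch of a single Linformer-style linear attention head in PyTorch. The module name `LinearAttentionHead` and its parameters (`d_model`, `d_k`, `seq_len`, `k`) are illustrative assumptions, not the reference implementation; the projections `E` and `F` play the role of $E_i$ and $F_i$ above.

```python
# Minimal sketch of one Linformer-style linear attention head (illustrative names).
import math
import torch
import torch.nn as nn


class LinearAttentionHead(nn.Module):
    def __init__(self, d_model: int, d_k: int, seq_len: int, k: int):
        super().__init__()
        # Standard per-head query/key/value projections W_i^Q, W_i^K, W_i^V
        self.w_q = nn.Linear(d_model, d_k, bias=False)
        self.w_k = nn.Linear(d_model, d_k, bias=False)
        self.w_v = nn.Linear(d_model, d_k, bias=False)
        # Linformer's extra projections E_i, F_i (shape n x k), applied along
        # the sequence axis to compress the key and value layers to length k.
        self.E = nn.Parameter(torch.randn(seq_len, k) / math.sqrt(k))
        self.F = nn.Parameter(torch.randn(seq_len, k) / math.sqrt(k))
        self.d_k = d_k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, n, d_model)
        q = self.w_q(x)  # (batch, n, d_k)
        k = self.w_k(x)  # (batch, n, d_k)
        v = self.w_v(x)  # (batch, n, d_k)
        # Project key/value layers along the sequence dimension: (batch, k, d_k)
        k_proj = torch.einsum("bnd,nk->bkd", k, self.E)
        v_proj = torch.einsum("bnd,nk->bkd", v, self.F)
        # Context mapping P_bar via scaled dot-product attention: (batch, n, k)
        scores = q @ k_proj.transpose(-2, -1) / math.sqrt(self.d_k)
        p_bar = torch.softmax(scores, dim=-1)
        # Context embeddings for this head: P_bar . (F_i V W_i^V) -> (batch, n, d_k)
        return p_bar @ v_proj


# Example usage (shapes only): a 256-token sequence with model dim 512.
head = LinearAttentionHead(d_model=512, d_k=64, seq_len=256, k=64)
out = head(torch.randn(2, 256, 512))  # -> (2, 256, 64)
```

Because the softmax is taken over a fixed projected length $k$ rather than the full sequence length $n$, the attention cost scales linearly in $n$ (roughly $O(nk)$) instead of quadratically ($O(n^2)$).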