What is: Extended Transformer Construction?
Source | ETC: Encoding Long and Structured Inputs in Transformers |
Year | 2020 |
Data Source | CC BY-SA - https://paperswithcode.com |
Extended Transformer Construction, or ETC, is an extension of the Transformer architecture with a new attention mechanism that extends the original in two main ways: (1) it allows scaling up the input length from 512 to several thousand tokens; and (2) it can ingest structured inputs instead of just linear sequences. The key ideas that enable ETC to achieve this are a new global-local attention mechanism, coupled with relative position encodings. ETC also allows lifting weights from existing BERT models, saving computational resources during training.
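The global-local pattern can be illustrated with a toy attention mask. This is a minimal sketch, not the paper's exact formulation: the function name, the token layout (global tokens first, then long-input tokens), and the parameter names are assumptions made for the example. It captures the core idea that global tokens attend everywhere (and are attended to by everything), while long-input tokens otherwise attend only within a sliding window.

```python
def etc_attention_mask(num_global, num_long, radius):
    """Boolean attention mask for ETC-style global-local attention (sketch).

    Token order is assumed to be [global tokens..., long tokens...].
    - Any pair involving a global token may attend (g2g, g2l, l2g).
    - Long-to-long attention is restricted to a local window of the
      given radius, which keeps cost linear in the long-input length.
    """
    n = num_global + num_long
    mask = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i < num_global or j < num_global:
                mask[i][j] = True                    # global token involved
            else:
                mask[i][j] = abs(i - j) <= radius    # local sliding window
    return mask
```

With a small window radius, each long-input row attends to only a fixed number of positions plus the global tokens, which is what lets the input scale to thousands of tokens.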