What is: Gated Linear Unit?
Source | Language Modeling with Gated Convolutional Networks |
Year | 2000 |
Data Source | CC BY-SA - https://paperswithcode.com |
A Gated Linear Unit, or GLU computes:
It is used in natural language processing architectures, for example the Gated CNN, because here is the gate that control what information from is passed up to the following layer. Intuitively, for a language modeling task, the gating mechanism allows selection of words or features that are important for predicting the next word. The GLU also has non-linear capabilities, but has a linear path for the gradient so diminishes the vanishing gradient problem.