What is: LayerDrop?
Source | Reducing Transformer Depth on Demand with Structured Dropout |
Year | 2019 |
Data Source | CC BY-SA - https://paperswithcode.com |
LayerDrop is a form of structured dropout for Transformer models that has a regularization effect during training and allows for efficient pruning at inference time. During training, it randomly drops entire layers of the Transformer. At inference, layers can be pruned according to an "every other" strategy: pruning with rate $p$ means dropping the layers at depths $d$ such that $d \equiv 0 \left(\text{mod}\left\lfloor \frac{1}{p} \right\rfloor\right)$.
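The two ingredients above can be sketched in plain Python: stochastic layer skipping during training, and the "every other" selection rule for pruning. This is an illustrative sketch, not the authors' implementation; the function names and the list-of-callables layer representation are assumptions.

```python
import math
import random


def pruned_layer_indices(num_layers: int, p: float) -> list:
    """Return the layer depths dropped by the 'every other' strategy:
    drop depth d when d = 0 (mod floor(1/p))."""
    gap = int(math.floor(1.0 / p))
    return [d for d in range(num_layers) if d % gap == 0]


def forward_with_layerdrop(x, layers, p: float, training: bool = True):
    """Apply a stack of layers, randomly skipping each with probability p
    during training (illustrative; `layers` is any sequence of callables)."""
    for layer in layers:
        if training and random.random() < p:
            continue  # LayerDrop: skip this layer entirely
        x = layer(x)
    return x


# Example: with p = 0.5 on a 12-layer model, every other layer is pruned.
print(pruned_layer_indices(12, 0.5))  # → [0, 2, 4, 6, 8, 10]
```

With `p = 0.25` the rule instead drops every fourth layer (depths 0, 4, 8, ...), so the pruning rate directly controls the inference-time depth without retraining.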