What is: ASGD Weight-Dropped LSTM?
Source | Regularizing and Optimizing LSTM Language Models |
Year | 2017 |
Data Source | CC BY-SA - https://paperswithcode.com |
ASGD Weight-Dropped LSTM, or AWD-LSTM, is a recurrent neural network language model that regularizes an LSTM by applying DropConnect to its hidden-to-hidden weight matrices and optimizes it with NT-ASGD (non-monotonically triggered averaged SGD). NT-ASGD runs plain SGD until the validation metric has failed to improve for a set number of checks, then returns the average of the weights from the iterations after that trigger point. Additional regularization techniques include variable-length backpropagation sequences, variational dropout, embedding dropout, weight tying, independent embedding and hidden sizes, activation regularization (AR), and temporal activation regularization (TAR).
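The two core ideas, DropConnect on a weight matrix and the non-monotonic trigger for averaging, can be sketched in plain Python. This is a minimal illustration on a toy scalar problem, not the authors' implementation: the function names (`drop_connect`, `should_trigger`, `nt_asgd`), the `1/(1-p)` rescaling of surviving weights, and the quadratic toy objective are all assumptions made for the example.

```python
import random


def drop_connect(weights, p, rng):
    """Zero each weight independently with probability p.

    Assumption: survivors are rescaled by 1/(1-p), as in inverted dropout.
    In AWD-LSTM the mask would be applied to the hidden-to-hidden matrices.
    """
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    scale = 1.0 / (1.0 - p)
    return [[0.0 if rng.random() < p else w * scale for w in row]
            for row in weights]


def should_trigger(val_logs, current_val, n):
    """Non-monotonic trigger: fire when the current validation value is
    worse than the best value logged more than n checks ago."""
    t = len(val_logs)
    return t > n and current_val > min(val_logs[: t - n])


def nt_asgd(grad_fn, val_fn, w0, lr, steps, log_every, n):
    """Run SGD on a scalar weight; once triggered, average the iterates.

    Returns (weight, trigger_step). If the trigger never fires, the last
    SGD iterate is returned and trigger_step is None.
    """
    w = w0
    val_logs = []
    trigger_step = None      # step at which averaging started
    avg, count = 0.0, 0      # running mean of iterates since the trigger
    for k in range(steps):
        w = w - lr * grad_fn(w)                  # plain SGD step
        if trigger_step is None and k % log_every == 0:
            v = val_fn(w)
            if should_trigger(val_logs, v, n):
                trigger_step = k
            val_logs.append(v)
        if trigger_step is not None:
            count += 1
            avg += (w - avg) / count             # incremental mean
    return (avg if count else w), trigger_step


# Toy usage: noisy gradients of (w - 3)^2, validation loss without noise.
rng = random.Random(0)
w_final, started_at = nt_asgd(
    grad_fn=lambda w: 2.0 * (w - 3.0) + rng.gauss(0.0, 1.0),
    val_fn=lambda w: (w - 3.0) ** 2,
    w0=0.0, lr=0.05, steps=2000, log_every=10, n=5,
)
print("averaged weight:", w_final, "averaging started at step:", started_at)

masked = drop_connect([[1.0, 2.0], [3.0, 4.0]], p=0.5, rng=rng)
print("masked weights:", masked)
```

Averaging only after the trigger means the early, far-from-optimum iterates do not pull the returned weights away from the region where SGD has settled, and comparing against the best value from more than `n` checks ago keeps a single noisy validation reading from starting the averaging too early.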