What is: Mixture of Softmaxes?
Source | Breaking the Softmax Bottleneck: A High-Rank RNN Language Model |
Year | 2017 |
Data Source | CC BY-SA - https://paperswithcode.com |
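Mixture of Softmaxes (MoS) computes several softmax distributions over the same output vocabulary and combines them with a weighted average. The motivation is that the traditional softmax suffers from a softmax bottleneck: when the logits are a dot product between a context vector and the output embeddings, the matrix of logits has rank at most the embedding dimension, which constrains the conditional probabilities the model can express. Because a mixture of softmaxes averages in probability space rather than logit space, it is not bound by this rank limit and can model the conditional probability more expressively.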
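The mixing step can be illustrated with a minimal pure-Python sketch. It assumes each component applies a tanh projection to the context vector before the dot product with shared output embeddings, and that the mixture weights come from a softmax over a linear map of the context. The names `mixture_of_softmaxes`, `prior_weights`, `projections`, and `embeddings` are hypothetical, the weights are random, and the code is meant only to show the shape of the computation, not a reference implementation.

```python
import math
import random


def softmax(scores):
    # Subtract the max for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vec):
    # Plain matrix-vector product on nested lists.
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]


def mixture_of_softmaxes(context, prior_weights, projections, embeddings):
    # Mixture weights pi_k: a softmax over the K components (assumed form).
    pi = softmax(matvec(prior_weights, context))
    probs = [0.0] * len(embeddings)
    for pi_k, proj in zip(pi, projections):
        # Component-specific context vector h_k (assumed tanh projection).
        h_k = [math.tanh(z) for z in matvec(proj, context)]
        # Each component is an ordinary softmax over dot-product logits.
        component = softmax(matvec(embeddings, h_k))
        # Average in probability space, not logit space.
        for i, p in enumerate(component):
            probs[i] += pi_k * p
    return probs


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
d, K, V = 4, 3, 6  # context size, number of softmaxes, vocabulary size
context = [rng.uniform(-1, 1) for _ in range(d)]
prior_weights = random_matrix(K, d, rng)
projections = [random_matrix(d, d, rng) for _ in range(K)]
embeddings = random_matrix(V, d, rng)

probs = mixture_of_softmaxes(context, prior_weights, projections, embeddings)
```

Since every component is a valid distribution and the mixture weights sum to one, `probs` is itself a valid distribution over the `V` outputs. With `K = 1` the sketch reduces to a single softmax over the projected context.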