What is: lda2vec?
Source | Mixing Dirichlet Topic Models and Word Embeddings to Make lda2vec |
Year | 2016 |
Data Source | CC BY-SA - https://paperswithcode.com |
lda2vec builds representations over both words and documents by mixing word2vec’s skipgram architecture with Dirichlet-optimized sparse topic mixtures.
The Skipgram Negative-Sampling (SGNS) objective of word2vec is modified to utilize document-wide feature vectors while simultaneously learning continuous document weights loading onto topic vectors. The total loss is the sum of the SGNS loss and a Dirichlet-likelihood term over document weights, $L^{d}$. The SGNS loss is computed using a context vector $\vec{c_{j}}$, pivot word vector $\vec{w_{j}}$, target word vector $\vec{w_{i}}$, and negatively-sampled word vector $\vec{w_{l}}$:

$$L = L^{d} + \sum_{ij} L_{ij}^{neg}$$

$$L_{ij}^{neg} = \log \sigma\left(\vec{c_{j}} \cdot \vec{w_{i}}\right) + \sum_{l=0}^{n} \log \sigma\left(-\vec{c_{j}} \cdot \vec{w_{l}}\right)$$

where the context vector is the sum of the pivot word vector and a document vector that is itself a mixture of topic vectors: $\vec{c_{j}} = \vec{w_{j}} + \vec{d_{j}}$.
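The construction above can be sketched in a few lines of NumPy. This is a minimal illustration, not the reference implementation: the dimensions, the random toy vectors, and the hyperparameter names `lam` and `alpha` are assumptions chosen for demonstration. It shows how a document vector is formed as a softmax-weighted mixture of topic vectors, added to the pivot word vector to form the context vector, and scored against a target word and negative samples.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sgns_neg_loss(context, target, negatives):
    """Negative SGNS term: the context vector should score high against
    the true target word and low against the sampled noise words."""
    pos = np.log(sigmoid(context @ target))
    neg = np.sum(np.log(sigmoid(-negatives @ context)))
    return -(pos + neg)  # minimized during training

rng = np.random.default_rng(0)
dim, n_topics, n_neg = 8, 3, 5  # toy sizes (assumption)

word_vec = rng.normal(size=dim)                 # pivot word vector w_j
topic_vecs = rng.normal(size=(n_topics, dim))   # topic vectors t_k

# Document weights: a softmax over unconstrained parameters gives the
# topic mixture p_jk that the Dirichlet term regularizes toward sparsity.
doc_weights = np.array([2.0, -1.0, 0.5])
p = np.exp(doc_weights) / np.exp(doc_weights).sum()
doc_vec = p @ topic_vecs                        # d_j = sum_k p_jk * t_k

context = word_vec + doc_vec                    # c_j = w_j + d_j

target = rng.normal(size=dim)                   # target word vector w_i
negatives = rng.normal(size=(n_neg, dim))       # sampled word vectors w_l

loss = sgns_neg_loss(context, target, negatives)

# Dirichlet-likelihood penalty over the document weights; lam and alpha
# are hypothetical hyperparameter values (alpha < 1 encourages sparsity).
lam, alpha = 200.0, 0.7
dirichlet_loss = -lam * (alpha - 1.0) * np.log(p).sum()
total_loss = loss + dirichlet_loss
```

In a full model the word vectors, topic vectors, and document weights would all be trainable parameters updated by gradient descent on `total_loss`; the sketch only evaluates the objective once on random values.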