What is Charformer?
Source | Charformer: Fast Character Transformers via Gradient-based Subword Tokenization |
Year | 2021 |
Data Source | CC BY-SA - https://paperswithcode.com |
Charformer is a Transformer model that learns a subword tokenization end-to-end as part of the model, rather than relying on a fixed tokenizer applied beforehand. Specifically, it uses a gradient-based subword tokenization (GBST) module that automatically learns latent subword representations from characters in a data-driven fashion. The soft subword sequence produced by GBST is then passed through Transformer layers.
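
To make the idea of a "soft" subword sequence concrete, here is a minimal sketch of a GBST-style step in plain Python. It is an illustration under stated assumptions, not the paper's implementation: the function name `gbst`, the helper `mean_vec`, the single scoring vector `score_w`, and the defaults `max_block=4` and `downsample=2` are all hypothetical choices made for this example. For each character position, the sketch mean-pools the enclosing block at several block sizes, scores each candidate block with a dot product, mixes the candidates with softmax weights, and finally shortens the sequence by mean pooling. Anything beyond that core (learned embeddings, training, and any refinements the paper applies) is left out.

```python
import math

def mean_vec(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def gbst(char_emb, score_w, max_block=4, downsample=2):
    """Sketch of a GBST-style soft subword step (all names are illustrative).

    char_emb:   list of L character embeddings, each a list of d floats
    score_w:    list of d floats, an assumed linear block-scoring vector
    max_block:  largest candidate block size (assumed value)
    downsample: stride of the final mean pooling (assumed value)
    Returns (pooled, weights): the shortened sequence and the per-position
    softmax weights over block sizes.
    """
    length = len(char_emb)
    dim = len(score_w)
    soft, all_weights = [], []
    for i in range(length):
        # Candidate representation for each block size b: the mean of the
        # non-overlapping block of size b that contains position i.
        cands = []
        for b in range(1, max_block + 1):
            start = (i // b) * b
            cands.append(mean_vec(char_emb[start:start + b]))
        # Score every candidate block, then softmax over block sizes.
        scores = [sum(w * x for w, x in zip(score_w, c)) for c in cands]
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        all_weights.append(weights)
        # Soft subword vector: weighted sum of the candidate blocks.
        soft.append([sum(w * c[k] for w, c in zip(weights, cands))
                     for k in range(dim)])
    # Shorten the sequence by mean pooling with the given stride.
    pooled = [mean_vec(soft[j:j + downsample])
              for j in range(0, length, downsample)]
    return pooled, all_weights

# Toy usage: 10 "characters" with 3-dimensional embeddings.
emb = [[float(i + k) for k in range(3)] for i in range(10)]
pooled, weights = gbst(emb, [0.1, -0.2, 0.3])
print(len(pooled), len(pooled[0]))
```

Because the block choice is a softmax mixture rather than a hard segmentation, every operation is differentiable, which is what allows the tokenization to be learned jointly with the Transformer layers that consume `pooled`.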