What is: Pythia?
Source | Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling |
Year | 2023 |
Data Source | CC BY-SA - https://paperswithcode.com |
Pythia is a suite of decoder-only autoregressive language models, all trained on the same public data seen in exactly the same order, ranging in size from 70M to 12B parameters. The model architecture and hyperparameters largely follow GPT-3, with a few notable deviations based on recent advances in best practices for large-scale language modeling.
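
To make the "suite" idea concrete, here is a minimal sketch that enumerates the model sizes and builds identifiers for them. The size labels, the `EleutherAI/pythia-<size>` repository pattern, and the `step<N>` checkpoint naming are assumptions based on the public EleutherAI release, not details stated in the entry above. The helper names `pythia_repo_id` and `checkpoint_revision` are hypothetical.

```python
# Sketch only: size labels and naming patterns below are assumptions drawn
# from the public EleutherAI release, not from the entry text.
PYTHIA_SIZES = ["70m", "160m", "410m", "1b", "1.4b", "2.8b", "6.9b", "12b"]


def pythia_repo_id(size: str) -> str:
    """Return the assumed Hugging Face repository ID for a size label."""
    if size not in PYTHIA_SIZES:
        raise ValueError(f"unknown Pythia size: {size!r}")
    return f"EleutherAI/pythia-{size}"


def checkpoint_revision(step: int) -> str:
    """Return the assumed branch name for an intermediate training checkpoint."""
    if step < 0:
        raise ValueError("step must be non-negative")
    return f"step{step}"


if __name__ == "__main__":
    # Print one repository ID per model in the suite, smallest to largest.
    for size in PYTHIA_SIZES:
        print(pythia_repo_id(size))
```

If the `transformers` library is installed, these strings could be passed to `AutoModelForCausalLM.from_pretrained(pythia_repo_id("70m"), revision=checkpoint_revision(3000))` to load one model at one training step. Because every model saw the data in the same order, comparing the same step across sizes is meaningful.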