What is: PanGu-α?
Source | PanGu-α: Large-scale Autoregressive Pretrained Chinese Language Models with Auto-parallel Computation |
Year | 2021 |
Data Source | CC BY-SA - https://paperswithcode.com |
PanGu-α is an autoregressive language model (ALM) with up to 200 billion parameters, pretrained on a large corpus of text, mostly in Chinese. Its architecture is based on the Transformer, which has been widely used as the backbone of pretrained language models such as BERT and GPT. Unlike them, PanGu-α adds a query layer on top of the Transformer layers, designed to explicitly induce the expected output.
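To make the query-layer idea concrete, below is a minimal PyTorch sketch. It assumes, per the paper's description, that the query in this top layer comes from a learned positional embedding (standing in for the position of the token to be predicted) rather than from the previous layer's hidden states, which serve only as keys and values. All class and variable names here are illustrative, not the authors' code.

```python
import torch
import torch.nn as nn

class QueryLayer(nn.Module):
    """Sketch of a PanGu-α-style query layer: attention whose query is a
    learned embedding of the output position, attending over the hidden
    states produced by the top Transformer layer."""

    def __init__(self, d_model: int, n_heads: int, max_len: int):
        super().__init__()
        # One learned query vector per position (assumption: position-indexed).
        self.query_emb = nn.Embedding(max_len, d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        # hidden: (batch, seq_len, d_model) output of the last Transformer layer.
        batch, seq_len, _ = hidden.shape
        positions = torch.arange(seq_len, device=hidden.device)
        # Positional query embeddings, broadcast across the batch.
        q = self.query_emb(positions).expand(batch, -1, -1)
        # Causal mask: the query at position n may only attend to states <= n.
        causal = torch.triu(
            torch.ones(seq_len, seq_len, dtype=torch.bool, device=hidden.device),
            diagonal=1,
        )
        out, _ = self.attn(q, hidden, hidden, attn_mask=causal)
        return self.norm(out + q)
```

The key design point is that the query is decoupled from the token representations: the layer asks "what should appear at this position?" directly, instead of deriving the question from the running hidden state as an ordinary self-attention layer would.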