What is: RoBERTa?
Source | RoBERTa: A Robustly Optimized BERT Pretraining Approach |
Year | 2019 |
Data Source | CC BY-SA - https://paperswithcode.com |
RoBERTa is an extension of BERT with changes to the pretraining procedure. The modifications include:
- training the model longer, with bigger batches, over more data
- removing the next sentence prediction objective
- training on longer sequences
- dynamically changing the masking pattern applied to the training data (see the sketch after this list)

The authors also collect a large new dataset (CC-News) of comparable size to other privately used datasets, to better control for training set size effects.
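To make the dynamic-masking change concrete, here is a minimal Python sketch of the idea: instead of fixing the masked positions once during preprocessing (as in the original BERT setup), the mask is re-sampled every time a sequence is served to the model. The token IDs, mask probability, and function names below are illustrative assumptions, not the authors' implementation.

```python
import random

MASK_ID = 103      # placeholder id for the <mask> token (assumed)
MASK_PROB = 0.15   # fraction of tokens selected for masking, as in BERT

def dynamically_mask(token_ids, mask_prob=MASK_PROB, mask_id=MASK_ID):
    """Return a freshly masked copy of the sequence plus prediction targets."""
    masked = list(token_ids)
    labels = [-100] * len(token_ids)   # -100 marks positions ignored by the loss
    for i, tok in enumerate(token_ids):
        if random.random() < mask_prob:
            labels[i] = tok            # the model must predict the original token
            masked[i] = mask_id        # replace it with the mask token
    return masked, labels

# Because masking happens at batch-construction time, each epoch sees a
# different masking pattern for the same underlying sequence.
sequence = [5, 42, 17, 88, 23, 7, 61, 34]
for epoch in range(3):
    print(epoch, dynamically_mask(sequence)[0])
```

In practice this means the model is exposed to many different masked views of each training sequence over the course of long pretraining runs, rather than a single fixed view repeated every epoch.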