Building on recent successes in distributed training of RL agents, R2D2 (Recurrent Replay Distributed DQN) is an RL approach that trains RNN-based agents from distributed prioritized experience replay. 
Using a single network architecture and a fixed set of hyperparameters, R2D2 quadrupled the previous state of the art on Atari-57 and matched the state of the art on DMLab-30. 
It was the first agent to exceed human-level performance in 52 of the 57 Atari games.
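
To make "prioritized experience replay" for a recurrent agent more concrete, the following is a minimal single-process sketch, not the authors' implementation. It assumes that replay stores fixed-length sequences together with the recurrent state at their first step, and that a sequence's priority blends the maximum and the mean absolute TD error with a weight `eta`. The names `sequence_priority`, `SequenceItem`, and `PrioritizedSequenceReplay` are hypothetical, as are the default values `eta=0.9` and `alpha=0.9`; the distributed actors and the learner are left out entirely.

```python
import random
from dataclasses import dataclass
from typing import List, Sequence, Tuple


def sequence_priority(td_errors: Sequence[float], eta: float = 0.9) -> float:
    """Blend the max and the mean absolute TD error of one sequence."""
    abs_errors = [abs(d) for d in td_errors]
    return eta * max(abs_errors) + (1.0 - eta) * sum(abs_errors) / len(abs_errors)


@dataclass
class SequenceItem:
    transitions: list      # fixed-length list of (obs, action, reward) steps
    initial_state: tuple   # recurrent state recorded at the first step
    priority: float


class PrioritizedSequenceReplay:
    """Hypothetical single-process stand-in for the shared replay buffer."""

    def __init__(self, capacity: int, alpha: float = 0.9, seed: int = 0):
        self.capacity = capacity
        self.alpha = alpha  # priority exponent (assumed value)
        self.items: List[SequenceItem] = []
        self.rng = random.Random(seed)

    def __len__(self) -> int:
        return len(self.items)

    def add(self, transitions, initial_state, td_errors) -> None:
        if len(self.items) >= self.capacity:
            self.items.pop(0)  # drop the oldest sequence
        self.items.append(
            SequenceItem(list(transitions), tuple(initial_state),
                         sequence_priority(td_errors))
        )

    def sample(self, batch_size: int) -> Tuple[List[int], List[SequenceItem]]:
        # Sampling probability is proportional to priority ** alpha;
        # the small constant keeps zero-error sequences sampleable.
        weights = [(it.priority + 1e-6) ** self.alpha for it in self.items]
        idx = self.rng.choices(range(len(self.items)), weights=weights, k=batch_size)
        return idx, [self.items[i] for i in idx]

    def update_priorities(self, indices, td_errors_per_sequence) -> None:
        # Called after a learner step with fresh TD errors for each sampled sequence.
        for i, errs in zip(indices, td_errors_per_sequence):
            self.items[i].priority = sequence_priority(errs)
```

In this sketch a learner would call `sample`, unroll its recurrent network from each item's `initial_state`, and then pass the new TD errors to `update_priorities`, so sequences with large errors are revisited more often.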

R2D2

Recurrent Experience Replay in Distributed Reinforcement Learning

Source	Recurrent Experience Replay in Distributed Reinforcement Learning
Year	2019
Data Source	CC BY-SA - https://paperswithcode.com

What is: Recurrent Replay Distributed DQN?

Viet-Anh on Software