What is: Bayesian Reward Extrapolation?
Source | Safe Imitation Learning via Fast Bayesian Reward Inference from Preferences |
Year | 2020 |
Data Source | CC BY-SA - https://paperswithcode.com |
Bayesian Reward Extrapolation (Bayesian REX) is a Bayesian reward learning algorithm that scales to high-dimensional imitation learning problems. It first pre-trains a low-dimensional feature encoding via self-supervised tasks, then uses preferences over demonstrations to perform fast Bayesian inference over reward functions.
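The preference-based inference step can be illustrated with a minimal sketch. The description above does not specify the likelihood or the sampler, so the code below makes several assumptions: the reward is linear in the pre-trained features, preferences follow a Bradley-Terry (softmax) model, reward weights are constrained to unit norm with a uniform prior, and sampling uses plain Metropolis-Hastings. The function names (`traj_return`, `log_likelihood`, `sample_posterior`), the parameters `beta` and `step`, and the toy feature vectors are all hypothetical and chosen for illustration only.

```python
import math
import random


def traj_return(w, phi):
    """Linear return: dot product of reward weights and trajectory features."""
    return sum(wi * fi for wi, fi in zip(w, phi))


def log_likelihood(w, features, prefs, beta=1.0):
    """Bradley-Terry log-likelihood of (worse, better) index pairs (assumed model)."""
    total = 0.0
    for worse, better in prefs:
        a = beta * traj_return(w, features[worse])
        b = beta * traj_return(w, features[better])
        m = max(a, b)  # subtract the max for a numerically stable log-sum-exp
        total += b - (m + math.log(math.exp(a - m) + math.exp(b - m)))
    return total


def normalize(w):
    """Project weights onto the unit sphere (assumed constraint on w)."""
    norm = math.sqrt(sum(x * x for x in w))
    return [x / norm for x in w] if norm > 0 else list(w)


def sample_posterior(features, prefs, n_samples=2000, step=0.3, beta=1.0, seed=0):
    """Metropolis-Hastings over unit-norm reward weights with a uniform prior."""
    rng = random.Random(seed)
    dim = len(features[0])
    w = normalize([rng.gauss(0.0, 1.0) for _ in range(dim)])
    ll = log_likelihood(w, features, prefs, beta)
    samples = []
    for _ in range(n_samples):
        proposal = normalize([x + rng.gauss(0.0, step) for x in w])
        ll_new = log_likelihood(proposal, features, prefs, beta)
        # Accept with probability min(1, likelihood ratio); the proposal is symmetric.
        if rng.random() < math.exp(min(0.0, ll_new - ll)):
            w, ll = proposal, ll_new
        samples.append(w)
    return samples


# Toy data (hypothetical): 4 demonstrations, 2 encoded features each.
features = [[0.0, 3.0], [1.0, 2.0], [2.0, 1.0], [3.0, 0.0]]
prefs = [(0, 1), (1, 2), (2, 3), (0, 3)]  # (worse, better) pairs

samples = sample_posterior(features, prefs)
mean_w = [sum(s[k] for s in samples) / len(samples) for k in range(2)]
# Fraction of posterior samples under which demo 3 outscores demo 0.
p_better = sum(
    traj_return(s, features[3]) > traj_return(s, features[0]) for s in samples
) / len(samples)
```

Because each demonstration is reduced to a short feature vector ahead of time, evaluating the likelihood for a proposed weight vector only requires a few dot products, which is what keeps this kind of sampling cheap. The resulting set of samples, rather than a single point estimate, can then be used to quantify uncertainty about how demonstrations or policies compare.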