Viet-Anh on Software Logo

What is: Robust Predictable Control?

SourceRobust Predictable Control
Year2000
Data SourceCC BY-SA - https://paperswithcode.com

Robust Predictable Control, or RPC, is an RL algorithm for learning policies that uses only a few bits of information. RPC brings together ideas from information bottlenecks, model-based RL, and bits-back coding. The main idea of RPC is that if the agent can accurately predict the future, then the agent will not need to observe as many bits from future observations. Precisely, the agent will learn a latent dynamics model that predicts the next representation using the current representation and action. In addition to predicting the future, the agent can also decrease the number of bits by changing its behavior. States where the dynamics are hard to predict will require more bits, so the agent will prefer visiting states where its learned model can accurately predict the next state.