What is: N-step Returns?
Year | 2000 |
Data Source | CC BY-SA - https://paperswithcode.com |
-step Returns are used for value function estimation in reinforcement learning. Specifically, for steps we can write the complete return as:
We can then write an -step backup, in the style of TD learning, as:
Multi-step returns often lead to faster learning with suitably tuned .
Image Credit: Sutton and Barto, Reinforcement Learning