What is: REINFORCE?
Year | 1999 |
Data Source | CC BY-SA - https://paperswithcode.com |
REINFORCE is a Monte Carlo policy gradient algorithm in reinforcement learning. The agent samples a full episode using its current policy and uses the resulting returns to update the policy parameter θ. Because a complete trajectory is needed before the returns can be computed, the update is made once per episode; and because the samples come from the current policy, REINFORCE is an on-policy algorithm.
Image Credit: Tingwu Wang
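The update described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not a reference implementation: it assumes a tabular softmax policy (θ stored as one list of action preferences per state), it leaves out the extra γ^t factor that some textbook formulations include, and every name in it (`softmax`, `discounted_returns`, `reinforce_update`, `run_episode`, and the one-state toy task) is hypothetical and chosen for this example.

```python
import math
import random


def softmax(prefs):
    """Turn action preferences into probabilities (numerically stable)."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def discounted_returns(rewards, gamma):
    """Compute G_t = r_t + gamma * r_{t+1} + ... for every step t."""
    returns, g = [], 0.0
    for r in reversed(rewards):
        g = r + gamma * g
        returns.append(g)
    returns.reverse()
    return returns


def reinforce_update(theta, episode, alpha=0.1, gamma=0.99):
    """Apply one REINFORCE update to theta from a completed episode.

    theta:   dict mapping state -> list of action preferences (assumed layout).
    episode: list of (state, action, reward) tuples sampled with theta.
    """
    returns = discounted_returns([r for _, _, r in episode], gamma)
    # Accumulate the gradient using the policy that generated the episode.
    delta = {s: [0.0] * len(p) for s, p in theta.items()}
    for (state, action, _), g in zip(episode, returns):
        probs = softmax(theta[state])
        for a in range(len(probs)):
            # d/d theta[s][a] of log pi(action | s) for a softmax policy
            grad_log = (1.0 if a == action else 0.0) - probs[a]
            delta[state][a] += g * grad_log
    # Gradient ascent step on the expected return.
    for s in theta:
        for a in range(len(theta[s])):
            theta[s][a] += alpha * delta[s][a]
    return theta


def run_episode(theta, rng, steps=3):
    """Hypothetical toy task: a single state 0; action 1 pays 1, action 0 pays 0."""
    episode = []
    for _ in range(steps):
        probs = softmax(theta[0])
        action = 0 if rng.random() < probs[0] else 1
        episode.append((0, action, float(action)))
    return episode


if __name__ == "__main__":
    rng = random.Random(0)
    theta = {0: [0.0, 0.0]}
    for _ in range(500):
        # On-policy: each update uses an episode drawn from the current theta.
        reinforce_update(theta, run_episode(theta, rng))
    print(softmax(theta[0]))  # action probabilities after training
```

Each update uses only an episode sampled from the current θ and waits until the episode has finished, which is what makes the method both on-policy and Monte Carlo. In practice a baseline is often subtracted from the return to reduce variance; that is omitted here to keep the sketch short.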