What is: Sarsa?
Year | 1994 |
Data Source | CC BY-SA - https://paperswithcode.com |
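Sarsa is an on-policy TD control algorithm:

$$Q(S_t, A_t) \leftarrow Q(S_t, A_t) + \alpha \left[ R_{t+1} + \gamma Q(S_{t+1}, A_{t+1}) - Q(S_t, A_t) \right]$$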
This update is done after every transition from a nonterminal state $S_t$. If $S_{t+1}$ is terminal, then $Q(S_{t+1}, A_{t+1})$ is defined as zero.
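A minimal sketch of a single Sarsa update on a tabular action-value function may help make the terminal-state convention concrete. The function name `sarsa_update`, the dictionary-based table, the state and action labels, and the values of the step size `alpha` and discount `gamma` are all assumptions chosen for illustration, not part of the source.

```python
from collections import defaultdict

def sarsa_update(Q, s, a, r, s_next, a_next, terminal, alpha=0.5, gamma=1.0):
    """Apply one Sarsa update to the table Q in place and return the TD error."""
    # The value of the next state-action pair is taken as zero
    # when the next state is terminal.
    next_value = 0.0 if terminal else Q[(s_next, a_next)]
    td_error = r + gamma * next_value - Q[(s, a)]
    Q[(s, a)] += alpha * td_error
    return td_error

# Hypothetical table: unseen (state, action) pairs default to 0.0.
Q = defaultdict(float)
Q[("s1", "right")] = 2.0

# Nonterminal transition: bootstrap from Q[("s1", "right")].
sarsa_update(Q, "s0", "right", 1.0, "s1", "right",
             terminal=False, alpha=0.5, gamma=0.9)

# Terminal transition: the next action is irrelevant, so None is passed.
sarsa_update(Q, "s1", "right", 5.0, "goal", None,
             terminal=True, alpha=1.0, gamma=0.9)
```

In the second call the target reduces to the reward alone, because the bootstrap term is dropped for the terminal state.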
To design an on-policy control algorithm using Sarsa, we estimate $q_\pi$ for a behaviour policy $\pi$ and, at the same time, change $\pi$ towards greediness with respect to $q_\pi$.
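The control idea can be sketched as a loop in which the action values are learned for an epsilon-greedy policy while that same policy keeps following the current estimates. Everything specific here is an assumption for illustration: the five-state corridor environment, the `step` and `epsilon_greedy` helpers, the reward of -1 per step, and the hyperparameter values.

```python
import random
from collections import defaultdict

N_STATES = 5          # hypothetical corridor: states 0..4, state 4 is terminal
ACTIONS = (-1, +1)    # move left or move right

def step(state, action):
    """Toy environment: reward -1 per step, episode ends at the last state."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    return next_state, -1.0, next_state == N_STATES - 1

def epsilon_greedy(Q, state, epsilon, rng):
    """Greedy with respect to Q, except for a random action with prob. epsilon."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    best = max(Q[(state, a)] for a in ACTIONS)
    return rng.choice([a for a in ACTIONS if Q[(state, a)] == best])

def sarsa(episodes=500, alpha=0.1, gamma=1.0, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    Q = defaultdict(float)
    for _ in range(episodes):
        state = 0
        action = epsilon_greedy(Q, state, epsilon, rng)
        done = False
        while not done:
            next_state, reward, done = step(state, action)
            # On-policy: the next action is drawn from the policy being improved.
            next_action = epsilon_greedy(Q, next_state, epsilon, rng)
            # The bootstrap term is zero when the next state is terminal.
            bootstrap = 0.0 if done else gamma * Q[(next_state, next_action)]
            Q[(state, action)] += alpha * (reward + bootstrap - Q[(state, action)])
            state, action = next_state, next_action
    return Q

Q = sarsa()
# Greedy action per nonterminal state under the learned estimates.
greedy_policy = {s: max(ACTIONS, key=lambda a: Q[(s, a)])
                 for s in range(N_STATES - 1)}
```

Because the policy is derived from `Q` at every step, improving the estimates and shifting the policy towards greediness happen together. On this toy task one would expect the greedy policy to favour moving right, though the exact values depend on the seed and hyperparameters.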
Source: Sutton and Barto, Reinforcement Learning, 2nd Edition