Viet-Anh on Software Logo

What is: Deep Deterministic Policy Gradient?

SourceContinuous control with deep reinforcement learning
Year2000
Data SourceCC BY-SA - https://paperswithcode.com

DDPG, or Deep Deterministic Policy Gradient, is an actor-critic, model-free algorithm based on the deterministic policy gradient that can operate over continuous action spaces. It combines the actor-critic approach with insights from DQNs: in particular, the insights that 1) the network is trained off-policy with samples from a replay buffer to minimize correlations between samples, and 2) the network is trained with a target Q network to give consistent targets during temporal difference backups. DDPG makes use of the same ideas along with batch normalization.