What is: Clipped Double Q-learning?
Source | Addressing Function Approximation Error in Actor-Critic Methods |
Year | 2018 |
Data Source | CC BY-SA - https://paperswithcode.com |
Clipped Double Q-learning is a variant of Double Q-learning that upper-bounds the less biased Q estimate $Q_{\theta_2}$ by the biased estimate $Q_{\theta_1}$. This is equivalent to taking the minimum of the two estimates, resulting in the following target update:

$$y_1 = r + \gamma \min_{i=1,2} Q_{\theta'_i}\left(s', \pi_{\phi_1}(s')\right)$$
The motivation for this extension is that vanilla double Q-learning is sometimes ineffective if the target and current networks are too similar, e.g. with a slow-changing policy in an actor-critic framework.
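As a rough illustration, the sketch below computes this target in PyTorch. The `actor`, `critic1_target`, and `critic2_target` modules, the `done` mask, and the discount `gamma` are placeholder assumptions for whatever networks and hyperparameters the agent uses, not names taken from the paper.

```python
import torch

def clipped_double_q_target(reward, next_state, done,
                            actor, critic1_target, critic2_target,
                            gamma=0.99):
    """Sketch of the clipped target y = r + gamma * min_i Q_{theta'_i}(s', pi(s'))."""
    with torch.no_grad():
        next_action = actor(next_state)                # pi_{phi_1}(s')
        q1 = critic1_target(next_state, next_action)   # Q_{theta'_1}(s', a')
        q2 = critic2_target(next_state, next_action)   # Q_{theta'_2}(s', a')
        # Upper-bound the less biased estimate by the biased one,
        # i.e. take the element-wise minimum of the two target critics.
        min_q = torch.min(q1, q2)
        # Standard TD target; (1 - done) zeroes the bootstrap at terminal states.
        target = reward + gamma * (1.0 - done) * min_q
    return target
```

Both critics are then regressed toward this single shared target, which is what discourages the overestimation that a single critic (or two independently targeted critics) would accumulate.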