**Entropy Regularization** is a type of regularization used in [reinforcement learning](https://paperswithcode.com/methods/area/reinforcement-learning). In on-policy policy-gradient methods like [A3C](https://paperswithcode.com/method/a3c), the actor and critic mutually reinforce each other's estimates. This can drive $\pi\left(a\mid{s}\right)$ to become highly peaked on a few actions or action sequences, since it is easier for the actor and critic to overoptimise to a small portion of the environment. To reduce this problem, entropy regularization adds an entropy term to the loss (with a sign such that minimising the loss increases the policy's entropy) to promote action diversity:

$$H\left(\pi\left(\cdot\mid{s}\right)\right) = -\sum_{a}\pi\left(a\mid{s}\right)\log\left(\pi\left(a\mid{s}\right)\right) $$
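
As a rough illustration of how the entropy term can enter a policy-gradient loss, here is a minimal NumPy sketch. The function names, the `beta` coefficient and its default value, and the use of precomputed advantages are assumptions made for this example, not details taken from the text above.

```python
import numpy as np

def softmax(logits):
    """Convert action logits of shape (batch, n_actions) to probabilities."""
    shifted = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def policy_entropy(probs, eps=1e-12):
    """Entropy of each row of action probabilities: -sum_a p(a) log p(a)."""
    return -(probs * np.log(probs + eps)).sum(axis=-1)

def regularized_policy_loss(logits, actions, advantages, beta=0.01):
    """Policy-gradient loss minus beta times the mean policy entropy.

    beta is a hypothetical weighting coefficient; subtracting the entropy
    means that minimising the loss pushes the policy towards higher entropy.
    """
    probs = softmax(logits)
    log_probs = np.log(probs + 1e-12)
    chosen_log_probs = log_probs[np.arange(len(actions)), actions]
    pg_loss = -(chosen_log_probs * advantages).mean()
    entropy = policy_entropy(probs).mean()
    return pg_loss - beta * entropy, entropy

# A uniform policy has maximal entropy; a peaked one has entropy near zero.
uniform_entropy = policy_entropy(softmax(np.zeros((1, 4))))[0]
peaked_entropy = policy_entropy(softmax(np.array([[10.0, 0.0, 0.0, 0.0]])))[0]
```

In this sketch a peaked policy receives a smaller entropy bonus than a uniform one, so the regularized loss penalises collapsing onto a few actions.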

**NeuralRecon** is a framework for real-time 3D scene reconstruction from a monocular video. Unlike previous methods that estimate single-view depth maps separately on each key-frame and fuse them later, NeuralRecon uses a neural network to directly reconstruct local surfaces, represented as sparse TSDF volumes, for each video fragment in sequence. A learning-based TSDF fusion module based on gated recurrent units guides the network to fuse features from previous fragments. This design allows the network to capture the local smoothness prior and the global shape prior of 3D surfaces.

NeuralRecon: Real-Time Coherent 3D Reconstruction from Monocular Video

**Vokenization** is an approach for extrapolating multimodal alignments to language-only data by contextually mapping language tokens to their related images ("vokens") by retrieval. Instead of directly supervising the language model with visually grounded language datasets (e.g., MS COCO), these relatively small datasets are used to train the vokenization processor (i.e., the vokenizer). Vokens are then generated for large language corpora (e.g., English Wikipedia), and the visually-supervised language model takes its input supervision from these large datasets, thus bridging the gap between different data sources.
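
The retrieval step described above can be sketched roughly as follows. This is only an illustration: it assumes contextual token embeddings and image embeddings already live in a shared space, and it scores them with cosine similarity. The function name `vokenize` and both of those choices are assumptions for this example, not details from the description.

```python
import numpy as np

def vokenize(token_embeddings, image_embeddings):
    """Return, for each token, the index of its best-matching image (its voken).

    token_embeddings: array of shape (n_tokens, dim), hypothetical contextual embeddings
    image_embeddings: array of shape (n_images, dim), hypothetical image embeddings
    """
    # Normalise rows so the dot product is a cosine similarity (an assumed scoring choice).
    tokens = token_embeddings / np.linalg.norm(token_embeddings, axis=1, keepdims=True)
    images = image_embeddings / np.linalg.norm(image_embeddings, axis=1, keepdims=True)
    scores = tokens @ images.T       # shape (n_tokens, n_images)
    return scores.argmax(axis=1)     # retrieve the highest-scoring image per token

# Toy example with 3 tokens and 2 images in a 2-dimensional space.
toy_tokens = np.array([[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]])
toy_images = np.array([[0.0, 2.0], [3.0, 0.0]])
toy_vokens = vokenize(toy_tokens, toy_images)
```

The resulting voken indices could then serve as per-token targets when supervising a language model on a text-only corpus.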

Source	Asynchronous Methods for Deep Reinforcement Learning
Year	2000
Data Source	CC BY-SA - https://paperswithcode.com

Viet-Anh on Software

What is: Entropy Regularization?
