**Monte-Carlo Tree Search** is a planning algorithm that accumulates value estimates obtained from Monte Carlo simulations in order to successively direct simulations towards more highly-rewarded trajectories. We execute MCTS after encountering each new state to select an agent's action for that state: it is executed again to select the action for the next state. Each execution is an iterative process that simulates many trajectories starting from the current state to the terminal state. The core idea is to successively focus multiple simulations starting at the current state by extending the initial portions of trajectories that have received high evaluations from earlier simulations.

Source: Sutton and Barto, Reinforcement Learning (2nd Edition)

Image Credit: [Chaslot et al](https://www.aaai.org/Papers/AIIDE/2008/AIIDE08-036.pdf)

A **Bidirectional GRU**, or **BiGRU**, is a sequence processing model that consists of two [GRUs](https://paperswithcode.com/method/gru). one taking the input in a forward direction, and the other in a backwards direction. It is a bidirectional recurrent neural network with only the input and forget gates.

Image Source: *Rana R (2016). Gated Recurrent Unit (GRU) for Emotion Classification from Noisy Speech.*

BiGRU

Monte-Carlo Tree Search

**SNet** is a convolutional neural network architecture and object detection backbone used for the [ThunderNet](https://paperswithcode.com/method/thundernet) two-stage object detector. SNet uses ShuffleNetV2 basic blocks but replaces all 3×3 depthwise convolutions with 5×5 depthwise convolutions.

Year	2006
Data Source	CC BY-SA - https://paperswithcode.com

Viet-Anh on Software

What is: Monte-Carlo Tree Search?

Viet-Anh on Software