What is: Semantic Reasoning Network?

Semantic Reasoning Network (SRN) is an end-to-end trainable framework for scene text recognition that consists of four parts: a backbone network, a parallel visual attention module (PVAM), a global semantic reasoning module (GSRM), and a visual-semantic fusion decoder (VSFD). Given an input image, the backbone network first extracts 2D features $V$. The PVAM then generates $N$ aligned 1D features $G$, where each feature corresponds to one character in the text and captures the aligned visual information. These $N$ features $G$ are fed into the GSRM to capture the semantic information $S$. Finally, the VSFD fuses the aligned visual features $G$ and the semantic information $S$ to predict $N$ characters. Text strings shorter than $N$ are padded with 'EOS' tokens.
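The data flow described above can be traced at the level of tensor shapes. The following is a minimal NumPy sketch, not the authors' implementation: every function name, the random weights, the patch-averaging "backbone", the use of scaled dot-product attention in the PVAM and GSRM stand-ins, and the sigmoid-gated fusion in the VSFD stand-in are assumptions chosen only to make the shapes of $V$, $G$, $S$ and the $N$ predictions concrete.

```python
import numpy as np

rng = np.random.default_rng(0)  # random weights stand in for trained parameters


def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def backbone(image, C):
    """Hypothetical stand-in for a CNN: 4x4 patch averaging + linear projection."""
    H, W, ch = image.shape
    h, w = H // 4, W // 4
    patches = image[: h * 4, : w * 4].reshape(h, 4, w, 4, ch).mean(axis=(1, 3))
    proj = rng.standard_normal((ch, C))
    return patches.reshape(h * w, ch) @ proj  # V: (h*w, C), flattened 2D features


def pvam(V, N):
    """One attention map per character slot, computed for all N slots at once."""
    C = V.shape[1]
    slot_queries = rng.standard_normal((N, C))  # assumed: one query per position
    attn = softmax(slot_queries @ V.T / np.sqrt(C), axis=-1)  # (N, h*w)
    return attn @ V  # G: (N, C), aligned visual features


def gsrm(G):
    """Every position looks at every other position to build global context."""
    C = G.shape[1]
    attn = softmax(G @ G.T / np.sqrt(C), axis=-1)  # (N, N)
    return attn @ G  # S: (N, C), semantic features


def vsfd(G, S, num_classes):
    """Assumed gated fusion of visual and semantic features, then classification."""
    C = G.shape[1]
    W_z = rng.standard_normal((2 * C, C)) * 0.1
    z = 1.0 / (1.0 + np.exp(-(np.concatenate([G, S], axis=1) @ W_z)))
    fused = z * G + (1.0 - z) * S
    W_out = rng.standard_normal((C, num_classes)) * 0.1
    return softmax(fused @ W_out, axis=-1)  # (N, num_classes)


def recognize(image, N, C, num_classes):
    V = backbone(image, C)
    G = pvam(V, N)
    S = gsrm(G)
    probs = vsfd(G, S, num_classes)
    return V, G, S, probs


def pad_with_eos(label_ids, N, eos_id):
    """Pad a target label sequence shorter than N with the EOS id."""
    if len(label_ids) > N:
        raise ValueError("label is longer than the maximum length N")
    return list(label_ids) + [eos_id] * (N - len(label_ids))


image = rng.random((32, 128, 3))  # hypothetical 32x128 RGB text crop
V, G, S, probs = recognize(image, N=25, C=16, num_classes=38)
print(V.shape, G.shape, S.shape, probs.shape)
print(pad_with_eos([3, 1, 4], 6, eos_id=0))  # [3, 1, 4, 0, 0, 0]
```

The point of the sketch is that the PVAM produces all $N$ character-aligned features in one step rather than one character at a time, so the GSRM and VSFD can operate on the whole sequence in parallel. The sizes used here ($N = 25$, 16 channels, 38 classes) are illustrative values, not settings taken from the source.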

Source	Towards Accurate Scene Text Recognition with Semantic Reasoning Networks
Year	2020
Data Source	CC BY-SA - https://paperswithcode.com

Viet-Anh on Software
