What is: Semantic Reasoning Network?
Source | Towards Accurate Scene Text Recognition with Semantic Reasoning Networks |
Year | 2020 |
Data Source | CC BY-SA - https://paperswithcode.com |
Semantic Reasoning Network (SRN) is an end-to-end trainable framework for scene text recognition that consists of four parts: a backbone network, a parallel visual attention module (PVAM), a global semantic reasoning module (GSRM), and a visual-semantic fusion decoder (VSFD). Given an input image, the backbone network first extracts 2D features. The PVAM then generates aligned 1-D features, where each feature corresponds to a character in the text and captures the aligned visual information. These 1-D features are fed into the GSRM to capture the semantic information. Finally, the VSFD fuses the aligned visual features and the semantic information to predict the characters. Text strings shorter than the fixed number of output characters are padded with 'EOS' tokens.
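The data flow described above can be traced at the level of array shapes with a small NumPy sketch. This is an illustration under stated assumptions, not the authors' implementation: all sizes, the random weights, the character set, and the function names (`backbone`, `pvam`, `gsrm`, `vsfd`, `pad_label`, `decode`) are hypothetical. The backbone is replaced by random features, a single self-attention layer stands in for the reasoning inside the GSRM, and a sigmoid-gated sum is assumed for the fusion step, since the description does not specify these details.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
H, W, D = 8, 32, 16          # feature-map height, width, channel depth
N = 6                        # fixed number of output character slots
CHARSET = list("abcdefghijklmnopqrstuvwxyz") + ["EOS"]
C = len(CHARSET)
EOS = C - 1                  # index of the padding / end-of-sequence token


def init(shape):
    """Small random weights; a trained model would learn these."""
    return rng.standard_normal(shape) * 0.1


def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def backbone(image):
    """Stand-in for a CNN backbone: 2D features flattened to (H*W, D)."""
    return rng.standard_normal((H * W, D))


def pvam(feats_2d, order_emb, W_o, W_f, w_e):
    """Compute one attention map per character slot, all slots in parallel."""
    # hidden[t, k] = tanh(W_o * order_emb[t] + W_f * feats_2d[k])
    hidden = np.tanh((order_emb @ W_o)[:, None, :] + (feats_2d @ W_f)[None, :, :])
    attn = softmax(hidden @ w_e, axis=1)      # (N, H*W), each row sums to 1
    return attn @ feats_2d                    # (N, D) aligned 1-D features


def gsrm(aligned, W_cls, char_emb, W_q, W_k, W_val):
    """Coarse per-slot guess, then one self-attention layer (assumed stand-in)."""
    coarse = softmax(aligned @ W_cls).argmax(axis=1)   # (N,) character ids
    emb = char_emb[coarse]                             # (N, D)
    attn = softmax((emb @ W_q) @ (emb @ W_k).T / np.sqrt(D))  # (N, N)
    return attn @ (emb @ W_val)                        # (N, D) semantic features


def vsfd(aligned, semantic, W_z, W_out):
    """Gated fusion (assumed) of both feature sets, then per-slot classification."""
    both = np.concatenate([aligned, semantic], axis=1)  # (N, 2D)
    gate = 1.0 / (1.0 + np.exp(-(both @ W_z)))          # (N, D), values in (0, 1)
    fused = gate * aligned + (1.0 - gate) * semantic
    return softmax(fused @ W_out)                       # (N, C) probabilities


def pad_label(text):
    """Encode a target string into exactly N slots, padding with EOS."""
    ids = [CHARSET.index(ch) for ch in text[:N]]
    return ids + [EOS] * (N - len(ids))


def decode(ids):
    """Map slot ids back to a string, stopping at the first EOS."""
    chars = []
    for i in ids:
        if i == EOS:
            break
        chars.append(CHARSET[i])
    return "".join(chars)


feats_2d = backbone(image=None)
aligned = pvam(feats_2d, init((N, D)), init((D, D)), init((D, D)), init(D))
semantic = gsrm(aligned, init((D, C)), init((C, D)),
                init((D, D)), init((D, D)), init((D, D)))
probs = vsfd(aligned, semantic, init((2 * D, D)), init((D, C)))

print(feats_2d.shape, aligned.shape, semantic.shape, probs.shape)
print(pad_label("cat"), decode(pad_label("cat")))
```

Because every slot is computed at once, the output always has `N` rows regardless of the word length, which is why shorter targets need EOS padding during training and why decoding stops at the first EOS. With untrained random weights the predicted characters are meaningless; only the shapes and the padding behaviour are informative here.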