What is: Point-wise Spatial Attention?
Source | PSANet: Point-wise Spatial Attention Network for Scene Parsing |
Year | 2018 |
Data Source | CC BY-SA - https://paperswithcode.com |
Point-wise Spatial Attention (PSA) is a semantic segmentation module. Its goal is to capture contextual information, especially long-range context, by aggregating information across positions. The PSA module treats this aggregation as a kind of information flow: it adaptively learns a pixel-wise global attention map for each position from two perspectives, and uses these maps to aggregate contextual information over the entire feature map.
The PSA module takes a spatial feature map $\mathbf{X}$ as input. We denote the spatial size of $\mathbf{X}$ as $H \times W$. Through two branches, we generate pixel-wise global attention maps for each position in the feature map $\mathbf{X}$ using several convolutional layers.
We aggregate the input feature maps based on the attention maps to generate new feature representations that incorporate long-range contextual information, i.e., $\mathbf{Z}^c$ from the ‘collect’ branch and $\mathbf{Z}^d$ from the ‘distribute’ branch.
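The aggregation step can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the attention maps are taken as given dense `(N, N)` arrays (with `N = H * W`) rather than predicted by convolutional layers, and the indexing convention is assumed (row `i` of `attn_collect` is position `i`'s map over all positions, and row `j` of `attn_distribute` is how position `j` spreads its feature to all positions). The function name `psa_aggregate` and all shapes are hypothetical.

```python
import numpy as np


def psa_aggregate(x, attn_collect, attn_distribute):
    """Aggregate features with two point-wise attention maps (illustrative).

    x:               (C, H, W) input feature map
    attn_collect:    (N, N), entry [i, j] = weight position i gives to j
    attn_distribute: (N, N), entry [j, i] = weight position j sends to i
    Returns two (C, H, W) arrays: the collect and distribute outputs.
    """
    c, h, w = x.shape
    n = h * w
    x_flat = x.reshape(c, n).T  # (N, C): one feature vector per position

    # Collect: position i gathers features from every j using its own map.
    z_c = attn_collect @ x_flat  # (N, C)

    # Distribute: position i receives from every j, weighted by j's map.
    z_d = attn_distribute.T @ x_flat  # (N, C)

    def to_map(z):
        # Back from (N, C) to (C, H, W).
        return z.T.reshape(c, h, w)

    return to_map(z_c), to_map(z_d)


# Toy usage with random data (values are arbitrary).
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 3, 5))  # C=4, H=3, W=5
n = 3 * 5
a_c = rng.random((n, n))
a_d = rng.random((n, n))
z_c, z_d = psa_aggregate(x, a_c, a_d)
```

With identity attention matrices both outputs equal the input, which is a quick sanity check that the two branches differ only in which axis of the attention array is summed over.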
We concatenate the new representations $\mathbf{Z}^c$ and $\mathbf{Z}^d$ and apply a convolutional layer with batch normalization and activation layers for dimension reduction and feature fusion. Then we concatenate the resulting global contextual feature with the local representation feature $\mathbf{X}$. This is followed by one or several convolutional layers with batch normalization and activation layers to generate the final feature map for the subnetworks that follow.
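The fusion step can be sketched as a sequence of channel-wise concatenations and convolutions. This is a rough NumPy illustration under several assumptions that the text above does not specify: the convolutions are taken to be 1x1, ReLU stands in for the activation, batch normalization is omitted for brevity, and the weights are random. The names `conv1x1_relu` and `psa_fuse` and all channel counts are hypothetical.

```python
import numpy as np


def conv1x1_relu(feat, weight):
    """1x1 convolution followed by ReLU (batch norm omitted for brevity).

    feat:   (C_in, H, W)
    weight: (C_out, C_in)
    """
    out = np.einsum("oc,chw->ohw", weight, feat)
    return np.maximum(out, 0.0)


def psa_fuse(x, z_c, z_d, w_reduce, w_final):
    """Fuse the two branch outputs with the local input feature."""
    # Concatenate the collect and distribute outputs along channels.
    z = np.concatenate([z_c, z_d], axis=0)   # (2C, H, W)
    # Dimension reduction and fusion of the two branches.
    g = conv1x1_relu(z, w_reduce)            # (C_g, H, W)
    # Concatenate the global contextual feature with the local feature.
    fused = np.concatenate([x, g], axis=0)   # (C + C_g, H, W)
    # Final convolution producing the module output.
    return conv1x1_relu(fused, w_final)      # (C_out, H, W)


# Toy usage: stand-in branch outputs with the same shape as the input.
rng = np.random.default_rng(0)
c, h, w = 4, 3, 5
x = rng.standard_normal((c, h, w))
z_c = rng.standard_normal((c, h, w))
z_d = rng.standard_normal((c, h, w))
w_reduce = rng.standard_normal((4, 2 * c))  # 2C -> C_g = 4
w_final = rng.standard_normal((6, c + 4))   # C + C_g -> C_out = 6
out = psa_fuse(x, z_c, z_d, w_reduce, w_final)
```

In a real network the single final convolution here could be replaced by several layers, as the description allows; the sketch only shows how the channel dimensions line up at each concatenation.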