What is: Relation-aware Global Attention?
Source | Relation-Aware Global Attention for Person Re-identification |
Year | 2020 |
Data Source | CC BY-SA - https://paperswithcode.com |
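Relation-aware global attention (RGA) stresses the importance of the global structural information provided by pairwise relations, and uses it to produce attention maps.
RGA comes in two forms, spatial RGA (RGA-S) and channel RGA (RGA-C). RGA-S first reshapes the input feature map $X$ to $C \times (H \times W)$, so that each of the $H \times W$ spatial positions is a $C$-dimensional feature node. The pairwise relation matrix $R \in \mathbb{R}^{(H \times W) \times (H \times W)}$ is then computed using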
\begin{align}
Q &= \delta(W^QX)
\end{align}
\begin{align}
K &= \delta(W^KX)
\end{align}
\begin{align}
R &= Q^TK
\end{align}
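The three equations above can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation: it assumes $\delta$ is ReLU, treats $W^Q$ and $W^K$ as plain matrices (equivalent to 1x1 convolutions without normalization), and uses random weights. The function name `pairwise_relations` and all sizes are hypothetical.

```python
import numpy as np

def relu(z):
    # Assumed choice for the nonlinearity delta.
    return np.maximum(z, 0.0)

def pairwise_relations(X, W_q, W_k):
    """X: (C, N) with N = H*W positions; W_q, W_k: (C_emb, C).

    Returns R of shape (N, N), where R[i, j] is the relation
    from position i to position j.
    """
    Q = relu(W_q @ X)  # (C_emb, N)
    K = relu(W_k @ X)  # (C_emb, N)
    return Q.T @ K     # (N, N)

rng = np.random.default_rng(0)
C, H, W = 8, 4, 3                                # hypothetical sizes
feature_map = rng.standard_normal((C, H, W))
X = feature_map.reshape(C, H * W)                # reshape to C x (H*W)
W_q = rng.standard_normal((C // 2, C))           # random stand-in for W^Q
W_k = rng.standard_normal((C // 2, C))           # random stand-in for W^K
R = pairwise_relations(X, W_q, W_k)
```

Because $Q$ and $K$ use different embeddings, $R$ is generally not symmetric, so the relation from position $i$ to $j$ can differ from the relation from $j$ to $i$.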
The relation vector at position $i$ is defined by stacking its pairwise relations with all positions, in both directions:
\begin{align}
r_i = [R(i, :); R(:,i)]
\end{align}
and the spatial relation-aware feature can be written as
\begin{align}
Y_i = [g^c_\text{avg}(\delta(W^\varphi x_i)); \delta(W^\phi r_i)]
\end{align}
where $g^c_\text{avg}$ denotes global average pooling in the channel domain. Finally, the spatial attention score at position $i$ is given by
\begin{align}
a_i = \sigma(W_2\delta(W_1Y_i))
\end{align}
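Putting the steps together, the RGA-S attention scores can be sketched end to end in NumPy. This is an illustrative sketch under stated assumptions rather than the reference implementation: $\delta$ is taken to be ReLU and $\sigma$ the sigmoid, every $W$ is a plain matrix with small random values, and the function name, dictionary keys, and layer sizes are all hypothetical.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)          # assumed delta

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))    # assumed sigma

def rga_spatial_attention(X, p):
    """X: (C, N) feature nodes as columns; p: dict of weight matrices.

    Returns one attention score per spatial position, shape (N,).
    """
    Q = relu(p["W_q"] @ X)                          # (C1, N)
    K = relu(p["W_k"] @ X)                          # (C1, N)
    R = Q.T @ K                                     # (N, N)
    # Column i of r is r_i = [R(i, :); R(:, i)].
    r = np.concatenate([R.T, R], axis=0)            # (2N, N)
    # Embed x_i, then average over channels: one scalar per position.
    x_avg = relu(p["W_varphi"] @ X).mean(axis=0, keepdims=True)  # (1, N)
    r_emb = relu(p["W_phi"] @ r)                    # (C3, N)
    Y = np.concatenate([x_avg, r_emb], axis=0)      # (1 + C3, N)
    a = sigmoid(p["W_2"] @ relu(p["W_1"] @ Y))      # (1, N)
    return a.ravel()

rng = np.random.default_rng(0)
C, H, W = 8, 4, 3                                   # hypothetical sizes
N = H * W
X = rng.standard_normal((C, H, W)).reshape(C, N)
params = {                                          # random stand-in weights
    "W_q": 0.1 * rng.standard_normal((4, C)),
    "W_k": 0.1 * rng.standard_normal((4, C)),
    "W_varphi": 0.1 * rng.standard_normal((4, C)),
    "W_phi": 0.1 * rng.standard_normal((6, 2 * N)),
    "W_1": 0.1 * rng.standard_normal((3, 1 + 6)),
    "W_2": 0.1 * rng.standard_normal((1, 3)),
}
a = rga_spatial_attention(X, params)
attention_map = a.reshape(H, W)                     # one score per position
```

Note that each score depends on the relations of position $i$ with every other position, which is what makes the attention global rather than local.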
RGA-C has the same form as RGA-S, except that it treats the input feature map as a set of $C$ feature nodes, one per channel, each of which is an $(H \times W)$-dimensional feature.
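The change of viewpoint for RGA-C amounts to transposing the reshaped feature map, so that channels rather than positions become the feature nodes. The NumPy sketch below shows only the relation matrix and relation vectors for the channel case; it assumes ReLU for $\delta$, random stand-in weights, and a hypothetical embedding size `d`.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)  # assumed delta

rng = np.random.default_rng(0)
C, H, W = 8, 4, 3                                  # hypothetical sizes
feature_map = rng.standard_normal((C, H, W))

X_spatial = feature_map.reshape(C, H * W)  # columns: H*W positions, each C-dim
X_channel = X_spatial.T                    # columns: C channels, each (H*W)-dim

d = 6                                      # hypothetical embedding size
W_q = 0.1 * rng.standard_normal((d, H * W))
W_k = 0.1 * rng.standard_normal((d, H * W))

# Same formulas as RGA-S, applied to channel nodes.
R_channel = relu(W_q @ X_channel).T @ relu(W_k @ X_channel)     # (C, C)
# Column i is the relation vector of channel i: [R(i, :); R(:, i)].
r_channel = np.concatenate([R_channel.T, R_channel], axis=0)    # (2C, C)
```

The remaining steps (embedding, concatenation, and the two-layer scoring function) follow the same pattern as in the spatial case and yield one attention score per channel.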
RGA uses global relations to generate the attention score for each feature node, so it provides valuable structural information and enhances the representational power of the features. RGA-S and RGA-C are flexible enough to be inserted into any CNN; Zhang et al. propose using them jointly, in sequence, to better capture both spatial and cross-channel relationships.