Viet-Anh on Software Logo

What is: Global second-order pooling convolutional networks?

SourceGlobal Second-order Pooling Convolutional Networks
Year2000
Data SourceCC BY-SA - https://paperswithcode.com

A Gsop block has a squeeze module and an excitation module, and uses a second-order pooling to model high-order statistics while gathering global information. In the squeeze module, a GSoP block firstly reduces the number of channels from cc to cc' (c<cc' < c) using a 1×11 \times 1 convolution, then computes a c×cc' \times c' covariance matrix for the different channels to obtain their correlation. Next, row-wise normalization is performed on the covariance matrix. Each (i,j)(i, j) in the normalized covariance matrix explicitly relates channel ii to channel jj.

In the excitation module, a GSoP block performs row-wise convolution to maintain structural information and output a vector. Then a fully-connected layer and a sigmoid function are applied to get a cc-dimensional attention vector. Finally, it multiplies the input features by the attention vector, as in an SE block. A GSoP block can be formulated as: \begin{align} s = F_\text{gsop}(X, \theta) & = \sigma (W \text{RC}(\text{Cov}(\text{Conv}(X)))) \end{align} \begin{align} Y & = s X \end{align} Here, Conv()\text{Conv}(\cdot) reduces the number of channels, Cov()\text{Cov}(\cdot) computes the covariance matrix and RC()\text{RC}(\cdot) means row-wise convolution.