What is: Convolutional Block Attention Module?

Source: CBAM: Convolutional Block Attention Module
Year: 2018
Data Source: CC BY-SA - https://paperswithcode.com

Convolutional Block Attention Module (CBAM) is an attention module for convolutional neural networks. Given an intermediate feature map, the module sequentially infers attention maps along two separate dimensions, channel and spatial; the attention maps are then multiplied with the input feature map for adaptive feature refinement.

Given an intermediate feature map $\mathbf{F} \in \mathbb{R}^{C \times H \times W}$ as input, CBAM sequentially infers a 1D channel attention map $\mathbf{M}_{c} \in \mathbb{R}^{C \times 1 \times 1}$ and a 2D spatial attention map $\mathbf{M}_{s} \in \mathbb{R}^{1 \times H \times W}$. The overall attention process can be summarized as:

$$\mathbf{F}' = \mathbf{M}_{c}\left(\mathbf{F}\right) \otimes \mathbf{F}$$

$$\mathbf{F}'' = \mathbf{M}_{s}\left(\mathbf{F}'\right) \otimes \mathbf{F}'$$

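Here $\otimes$ denotes element-wise multiplication. During multiplication, the attention values are broadcast (copied) accordingly: channel attention values are broadcast along the spatial dimensions, and spatial attention values are broadcast along the channel dimension. $\mathbf{F}''$ is the final refined output.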
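The two-step refinement and its broadcasting can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not CBAM's actual sub-modules: the text above does not specify how the attention maps are computed, so `channel_attention` and `spatial_attention` are hypothetical stand-ins (a sigmoid over a simple mean) chosen only to produce maps of the right shapes. The function name `cbam_refine` is also made up for this example.

```python
import numpy as np


def sigmoid(x):
    """Squash values into (0, 1) so they act as attention weights."""
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(f):
    # Hypothetical stand-in for M_c: one weight per channel.
    # Input (C, H, W) -> output (C, 1, 1).
    return sigmoid(f.mean(axis=(1, 2), keepdims=True))


def spatial_attention(f):
    # Hypothetical stand-in for M_s: one weight per spatial location.
    # Input (C, H, W) -> output (1, H, W).
    return sigmoid(f.mean(axis=0, keepdims=True))


def cbam_refine(f):
    """Apply channel attention, then spatial attention, to f of shape (C, H, W)."""
    m_c = channel_attention(f)   # (C, 1, 1)
    f1 = m_c * f                 # F'  = M_c(F)  * F,  broadcast over H and W
    m_s = spatial_attention(f1)  # (1, H, W), computed from F', not from F
    f2 = m_s * f1                # F'' = M_s(F') * F', broadcast over C
    return f1, f2, m_c, m_s


rng = np.random.default_rng(0)
feature_map = rng.standard_normal((8, 4, 5))  # C=8, H=4, W=5
f1, f2, m_c, m_s = cbam_refine(feature_map)
print(m_c.shape, m_s.shape, f2.shape)  # (8, 1, 1) (1, 4, 5) (8, 4, 5)
```

The refined output has the same shape as the input, so a block like this could be placed after a convolutional block without changing the surrounding layer shapes. The order matters: the spatial map is inferred from the channel-refined features $\mathbf{F}'$, which is what makes the process sequential.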