What is: Context Enhancement Module?
Source | ThunderNet: Towards Real-time Generic Object Detection |
Year | 2019 |
Data Source | CC BY-SA - https://paperswithcode.com |
Context Enhancement Module (CEM) is a feature extraction module used in object detection (specifically, in ThunderNet) that aims to enlarge the receptive field. The key idea of CEM is to aggregate multi-scale local context information and global context information to generate more discriminative features. In CEM, the feature maps from three scales are merged: C4, C5 and C_glb. C_glb is the global context feature vector obtained by applying global average pooling to C5. A 1 × 1 convolution is then applied to each feature map to squeeze the number of channels to α × p × p = 245.
Afterwards, C5 is upsampled by 2× and C_glb is broadcast so that the spatial dimensions of the three feature maps are equal. Finally, the three generated feature maps are aggregated. By leveraging both local and global context, CEM effectively enlarges the receptive field and refines the representation ability of the thin feature map. Compared with prior FPN structures, CEM involves only two 1 × 1 convolutions and a fully connected (fc) layer; the channel squeeze on the pooled vector C_glb acts as the fc layer.
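The data flow described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the ThunderNet implementation: the input channel counts (120 and 512), the 20 × 20 and 10 × 10 spatial sizes, the nearest-neighbour upsampling, the element-wise sum used for aggregation, and the names `conv1x1` and `context_enhancement` are all hypothetical choices made for the example. Only the output width of 245 channels, the global average pooling on C5, the 2× upsampling and the broadcast of the global vector come from the description.

```python
import numpy as np


def conv1x1(x, weight, bias):
    """1x1 convolution on a (C_in, H, W) map, i.e. a per-pixel linear layer."""
    return np.einsum("oc,chw->ohw", weight, x) + bias[:, None, None]


def context_enhancement(c4, c5, params):
    """Merge C4, C5 and the pooled global vector into one feature map.

    c4: (C4_ch, H, W) feature map; c5: (C5_ch, H/2, W/2) feature map.
    params: dict of (weight, bias) pairs under the keys "c4", "c5", "glb".
    """
    # Global context vector: global average pooling over C5's spatial dims.
    c_glb = c5.mean(axis=(1, 2))

    # Squeeze every branch to the same number of channels.
    f4 = conv1x1(c4, *params["c4"])
    f5 = conv1x1(c5, *params["c5"])
    w_glb, b_glb = params["glb"]
    f_glb = w_glb @ c_glb + b_glb  # fc layer on the pooled vector

    # Upsample C5 by 2x (nearest neighbour is an assumption of this sketch).
    f5_up = f5.repeat(2, axis=1).repeat(2, axis=2)

    # Broadcast the global vector over H x W and aggregate by summation
    # (summation is also an assumption; the text only says "aggregated").
    return f4 + f5_up + f_glb[:, None, None]


# Hypothetical sizes for illustration only.
rng = np.random.default_rng(0)
c4_ch, c5_ch, out_ch = 120, 512, 245
c4 = rng.standard_normal((c4_ch, 20, 20))
c5 = rng.standard_normal((c5_ch, 10, 10))
params = {
    "c4": (rng.standard_normal((out_ch, c4_ch)), rng.standard_normal(out_ch)),
    "c5": (rng.standard_normal((out_ch, c5_ch)), rng.standard_normal(out_ch)),
    "glb": (rng.standard_normal((out_ch, c5_ch)), rng.standard_normal(out_ch)),
}
out = context_enhancement(c4, c5, params)
print(out.shape)
```

The result keeps the spatial size of C4 and has 245 channels. Each output pixel is the sum of its own C4 projection, the projection of the C5 pixel covering it, and the same global vector everywhere, which is how the module mixes local and global context at little cost.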