What is: CPC v2?

Source: Data-Efficient Image Recognition with Contrastive Predictive Coding
Year: 2019
Data Source: CC BY-SA - https://paperswithcode.com

Contrastive Predictive Coding v2 (CPC v2) is a self-supervised learning approach that improves on the original CPC in several ways:

  • Model capacity - The third residual stack of ResNet-101 (originally 23 blocks with 1024-dimensional feature maps and 256-dimensional bottleneck layers) is enlarged to 46 blocks with 4096-dimensional feature maps and 512-dimensional bottleneck layers. The authors call the resulting network ResNet-161.

  • Layer normalization - The authors find that training CPC with batch normalization harms downstream performance. They hypothesize that batch normalization lets large models find a trivial solution to CPC: it introduces a dependency between patches (through the batch statistics) that can be exploited to bypass the constraints on the receptive field. They therefore replace batch normalization with layer normalization.

  • Prediction lengths and directions - Each patch is predicted using context from several directions (above, below, left, and right), rather than only predicting patches spatially underneath the context.

  • Patch-based augmentation - Each patch is augmented with "color dropping", which randomly drops two of the three color channels, and with random horizontal flips.
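
The patch-based augmentation in the last item can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: "dropping" a channel is taken to mean setting it to zero, the flip probability is 0.5, and the function names (color_drop, random_hflip, augment_patches) are hypothetical rather than taken from the authors' code.

```python
import numpy as np

def color_drop(patch, rng):
    """Keep one randomly chosen channel of an (H, W, 3) patch.

    Assumption: the two dropped channels are set to zero.
    """
    keep = rng.integers(0, 3)           # index of the channel that survives
    out = np.zeros_like(patch)
    out[..., keep] = patch[..., keep]
    return out

def random_hflip(patch, rng, p=0.5):
    """Flip the patch left-to-right with probability p (assumed 0.5)."""
    if rng.random() < p:
        return patch[:, ::-1, :]
    return patch

def augment_patches(patches, seed=0):
    """Augment each patch of an (N, H, W, 3) array independently."""
    rng = np.random.default_rng(seed)
    return np.stack([color_drop(random_hflip(p, rng), rng) for p in patches])
```

Because the random draws are made per patch, neighbouring patches from the same image generally end up with different surviving channels and orientations.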

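The dependency between patches described in the layer normalization item can be illustrated with a toy NumPy comparison. This is a minimal sketch, not the authors' experiment: it treats each row as the feature vector of one patch, omits the learnable scale and shift, and uses made-up sizes (6 patches, 16 features).

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    # Normalize each feature with statistics computed across all patches.
    mean = x.mean(axis=0, keepdims=True)
    var = x.var(axis=0, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def layer_norm(x, eps=1e-5):
    # Normalize each patch with statistics computed over its own features only.
    mean = x.mean(axis=1, keepdims=True)
    var = x.var(axis=1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

rng = np.random.default_rng(0)
x = rng.normal(size=(6, 16))      # 6 patches, 16 features each (toy sizes)
x_changed = x.copy()
x_changed[5] += 10.0              # perturb only the last patch

# Does the normalized output for patch 0 change when patch 5 changes?
bn_leaks = not np.allclose(batch_norm(x)[0], batch_norm(x_changed)[0])
ln_leaks = not np.allclose(layer_norm(x)[0], layer_norm(x_changed)[0])
print(bn_leaks, ln_leaks)         # True False
```

With batch statistics, the output for one patch carries information about the others; with per-patch statistics it does not, which is consistent with the authors' hypothesis about why batch normalization allows a shortcut.
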
Consistent with prior results, this new architecture delivers better performance regardless of