What is: Content-Conditioned Style Encoder?
Source | COCO-FUNIT: Few-Shot Unsupervised Image Translation with a Content Conditioned Style Encoder |
Year | 2020 |
Data Source | CC BY-SA - https://paperswithcode.com |
The Content-Conditioned Style Encoder, or COCO, is a style encoder used for image-to-image translation in the COCO-FUNIT architecture. Unlike the style encoder in FUNIT, COCO takes both the content and the style image as input. This content-conditioning scheme creates a direct feedback path during learning that lets the content image influence how the style code is computed. It also reduces the direct influence of the style image on the extracted style code.
The bottom part of the figure details the architecture. First, the content image is fed into an encoder to compute a spatial feature map. This content feature map is then mean-pooled and mapped to a vector. Similarly, the style image is fed into an encoder to compute a spatial feature map. The style feature map is then mean-pooled and concatenated with an input-independent bias vector: the constant style bias (CSB). Note that while a regular bias in deep networks is added to the activations, the CSB is concatenated with them. The CSB provides a fixed input to the style encoder, which helps compute a style code that is less sensitive to variations in the style image.
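The pooling and CSB concatenation steps above can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the channel widths, the CSB length, and the random stand-in feature maps are all assumptions for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes; the real COCO-FUNIT channel widths differ.
C, H, W = 64, 16, 16   # feature-map channels and spatial size (assumed)
D = 64                 # CSB length (assumed)

# Spatial feature maps, standing in for the content/style encoder outputs.
content_feat = rng.standard_normal((C, H, W))
style_feat = rng.standard_normal((C, H, W))

# Mean-pool each feature map over its spatial dimensions.
content_vec = content_feat.mean(axis=(1, 2))   # shape (C,)
style_vec = style_feat.mean(axis=(1, 2))       # shape (C,)

# Constant style bias (CSB): a learned, input-independent vector that is
# concatenated with (not added to) the pooled style activations.
csb = rng.standard_normal(D)
style_with_csb = np.concatenate([style_vec, csb])  # shape (C + D,)
print(style_with_csb.shape)
```

Because the CSB is concatenated rather than added, it contributes a fixed component to the style encoder's input regardless of which style image is provided.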
The concatenation of the pooled style vector and the CSB is mapped to a vector via a fully connected layer. We then take the element-wise product of this vector and the content vector, which yields the final style code. The style code is then mapped to the AdaIN parameters used for generating the translation. Through this element-wise product, the resulting style code is heavily influenced by the content image: one way to look at this mechanism is that it produces a style code customized for the input content image.
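Continuing the sketch, the fully connected mapping, the element-wise product, and the mapping to AdaIN parameters might look like this. All sizes, weight initializations, and the single-layer AdaIN head are illustrative assumptions, not the paper's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(1)

C, D = 64, 64                                # pooled-feature / CSB sizes (assumed)
content_vec = rng.standard_normal(C)         # pooled content features (stand-in)
style_with_csb = rng.standard_normal(C + D)  # pooled style features ++ CSB (stand-in)

# Fully connected layer mapping the concatenated style vector back to
# length C so it can be multiplied with the content vector.
W_fc = rng.standard_normal((C, C + D)) * 0.01
b_fc = np.zeros(C)
style_vec = W_fc @ style_with_csb + b_fc

# Element-wise product: the content vector gates the style vector,
# yielding a style code customized for this content image.
style_code = content_vec * style_vec         # shape (C,)

# The style code is then mapped to AdaIN (scale, shift) parameters for the
# generator's normalization layers (one hypothetical layer shown).
n_adain = 32                                 # channels in one AdaIN layer (assumed)
W_adain = rng.standard_normal((2 * n_adain, C)) * 0.01
gamma, beta = np.split(W_adain @ style_code, 2)
print(gamma.shape, beta.shape)
```

The gating makes the dependence on the content image explicit: a zero entry in the content vector zeroes the corresponding entry of the style code, no matter what the style image contains.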
COCO is used as a drop-in replacement for the style encoder in FUNIT: the translation output is computed by feeding the content image, together with the style code that COCO extracts from the content-style image pair, to the FUNIT generator.
The style code extracted by COCO is more robust to variations in the style image. The layer sizes are chosen to keep the number of parameters in the model similar to that in FUNIT.
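The drop-in nature of the replacement can be shown with two toy encoders. Both function bodies here are stand-ins invented for illustration (simple mean-pooling and gating), not the actual FUNIT or COCO networks; the point is only the interface change.

```python
import numpy as np

rng = np.random.default_rng(2)

def funit_style_encoder(style_img):
    """Toy FUNIT-style encoder: the style code depends only on the style image."""
    return style_img.mean(axis=(1, 2))

def coco_style_encoder(content_img, style_img):
    """Toy COCO stand-in: a drop-in replacement whose style code depends on
    both the content and the style image (here via element-wise gating)."""
    return content_img.mean(axis=(1, 2)) * style_img.mean(axis=(1, 2))

content_img = rng.standard_normal((3, 8, 8))
style_img = rng.standard_normal((3, 8, 8))

# Same output shape, so COCO can replace FUNIT's style encoder unchanged;
# only the call site gains the extra content argument.
z_funit = funit_style_encoder(style_img)
z_coco = coco_style_encoder(content_img, style_img)
print(z_funit.shape, z_coco.shape)
```

Since the style-code shape is unchanged, the downstream AdaIN-parameter mapping and generator need no modification.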