CycleGAN, or Cycle-Consistent GAN, is a type of generative adversarial network for unpaired image-to-image translation. For two domains X and Y, CycleGAN learns a mapping G:X→Y and F:Y→X. The novelty lies in enforcing the intuition that these mappings should be inverses of each other and that both mappings should be bijections. This is achieved through a cycle consistency loss that encourages F(G(x))≈x and G(F(y))≈y. Combining this loss with the adversarial losses on X and Y yields the full objective for unpaired image-to-image translation.
For the mapping G:X→Y and its discriminator D_Y we have the objective:
L_GAN(G,D_Y,X,Y)=E_y∼p_data(y)[logD_Y(y)]+E_x∼p_data(x)[log(1−D_Y(G(x)))]
where G tries to generate images G(x) that look similar to images from domain Y, while D_Y tries to discriminate between translated samples G(x) and real samples y. An analogous adversarial loss is defined for the mapping F:Y→X and its discriminator D_X.
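As a rough illustration, this adversarial objective splits into a discriminator loss and a generator loss. The sketch below is a minimal PyTorch version using the binary cross-entropy form of the log terms above; the helper names are hypothetical, and D_Y is assumed to output raw logits (the original implementation actually swaps these log terms for a least-squares loss, as noted in the architecture list below).

```python
import torch

bce = torch.nn.BCEWithLogitsLoss()

def adversarial_loss_D(D_Y, real_y, fake_y):
    """Discriminator side of L_GAN(G, D_Y, X, Y): D_Y should score
    real samples y as 1 and translated samples G(x) as 0."""
    real_logits = D_Y(real_y)
    fake_logits = D_Y(fake_y.detach())  # detach so this loss does not update G
    return (bce(real_logits, torch.ones_like(real_logits))
            + bce(fake_logits, torch.zeros_like(fake_logits)))

def adversarial_loss_G(D_Y, fake_y):
    """Generator side: G tries to make D_Y score its translations G(x) as real."""
    fake_logits = D_Y(fake_y)
    return bce(fake_logits, torch.ones_like(fake_logits))
```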
The cycle consistency loss reduces the space of possible mapping functions by enforcing forward and backward cycle consistency:
L_cyc(G,F)=E_x∼p_data(x)[∣∣F(G(x))−x∣∣_1]+E_y∼p_data(y)[∣∣G(F(y))−y∣∣_1]
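In code, the cycle term is simply an L1 penalty on the two round-trip reconstructions. A minimal sketch (the mapping F is named `F_net` here to avoid clashing with common module aliases; both mappings are assumed to be callable PyTorch modules):

```python
import torch

def cycle_consistency_loss(G, F_net, real_x, real_y):
    """L_cyc(G, F): penalize the L1 distance between each image and its
    reconstruction after a round trip through both mappings."""
    rec_x = F_net(G(real_x))  # x -> G(x) -> F(G(x)) ≈ x
    rec_y = G(F_net(real_y))  # y -> F(y) -> G(F(y)) ≈ y
    return torch.mean(torch.abs(rec_x - real_x)) + torch.mean(torch.abs(rec_y - real_y))
```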
The full objective is:
L(G,F,D_X,D_Y)=L_GAN(G,D_Y,X,Y)+L_GAN(F,D_X,Y,X)+λL_cyc(G,F)
Where we aim to solve:
G\*,F\*=argmin_G,F max_D_X,D_Y L(G,F,D_X,D_Y)
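In practice this min-max problem is solved by alternating gradient steps on the generators and the discriminators. Below is a minimal sketch of one training step, reusing the hypothetical loss helpers from above; `opt_G` is assumed to optimize the parameters of G and F_net jointly, and `opt_D` those of both discriminators. The paper weights the cycle term with λ = 10.

```python
lambda_cyc = 10.0  # λ in the full objective

def training_step(G, F_net, D_X, D_Y, opt_G, opt_D, real_x, real_y):
    # Generator step: minimize the full objective with respect to G and F.
    fake_y, fake_x = G(real_x), F_net(real_y)
    loss_G = (adversarial_loss_G(D_Y, fake_y)
              + adversarial_loss_G(D_X, fake_x)
              + lambda_cyc * cycle_consistency_loss(G, F_net, real_x, real_y))
    opt_G.zero_grad()
    loss_G.backward()
    opt_G.step()

    # Discriminator step: maximize the adversarial terms with respect to
    # D_X and D_Y (i.e. minimize their real/fake classification losses).
    loss_D = (adversarial_loss_D(D_Y, real_y, fake_y)
              + adversarial_loss_D(D_X, real_x, fake_x))
    opt_D.zero_grad()
    loss_D.backward()
    opt_D.step()
    return loss_G.item(), loss_D.item()
```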
For the original architecture the authors use the following components (a rough code sketch follows the list):
- a generator with two stride-2 convolutions, several residual blocks, and two fractionally strided convolutions with stride ½
- instance normalization
- PatchGANs for the discriminator
- a least-squares loss for the GAN objectives.
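The sketch below is a rough PyTorch rendition of these components, not the authors' exact network (which, for example, uses reflection padding and 9 residual blocks for 256×256 inputs). It only shows how the stride-2 downsampling, residual blocks, stride-½ (transposed-convolution) upsampling, instance normalization, and a PatchGAN discriminator fit together, and how the least-squares loss replaces the log terms.

```python
import torch.nn as nn

class ResidualBlock(nn.Module):
    """3x3 conv -> InstanceNorm -> ReLU -> 3x3 conv -> InstanceNorm, plus a skip connection."""
    def __init__(self, ch):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.InstanceNorm2d(ch), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1), nn.InstanceNorm2d(ch))

    def forward(self, x):
        return x + self.block(x)

class Generator(nn.Module):
    """Downsample with two stride-2 convs, transform with residual blocks,
    upsample with two fractionally strided (transposed) convs."""
    def __init__(self, n_blocks=6):
        super().__init__()
        layers = [
            nn.Conv2d(3, 64, 7, padding=3), nn.InstanceNorm2d(64), nn.ReLU(inplace=True),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.InstanceNorm2d(128), nn.ReLU(inplace=True),
            nn.Conv2d(128, 256, 3, stride=2, padding=1), nn.InstanceNorm2d(256), nn.ReLU(inplace=True),
        ]
        layers += [ResidualBlock(256) for _ in range(n_blocks)]
        layers += [
            nn.ConvTranspose2d(256, 128, 3, stride=2, padding=1, output_padding=1),
            nn.InstanceNorm2d(128), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(128, 64, 3, stride=2, padding=1, output_padding=1),
            nn.InstanceNorm2d(64), nn.ReLU(inplace=True),
            nn.Conv2d(64, 3, 7, padding=3), nn.Tanh(),
        ]
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)

class PatchDiscriminator(nn.Module):
    """PatchGAN: outputs a grid of real/fake scores, one per overlapping image
    patch, instead of a single scalar for the whole image."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.InstanceNorm2d(128), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(128, 256, 4, stride=2, padding=1), nn.InstanceNorm2d(256), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(256, 512, 4, stride=1, padding=1), nn.InstanceNorm2d(512), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(512, 1, 4, stride=1, padding=1),
        )

    def forward(self, x):
        return self.net(x)

# With the least-squares GAN objective the log terms above become, e.g. for D_Y:
#   loss_D = ((D_Y(real_y) - 1) ** 2).mean() + (D_Y(fake_y.detach()) ** 2).mean()
#   loss_G = ((D_Y(G(real_x)) - 1) ** 2).mean()
```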