What is: Pix2Pix?

Source: Image-to-Image Translation with Conditional Adversarial Networks
Year: 2016
Data Source: CC BY-SA - https://paperswithcode.com

Pix2Pix is a conditional image-to-image translation architecture that uses a conditional GAN objective combined with a reconstruction loss. The conditional GAN objective for observed images $x$, output images $y$, and the random noise vector $z$ is:

$$ \mathcal{L}_{cGAN}\left(G, D\right) = \mathbb{E}_{x,y}\left[\log D\left(x, y\right)\right] + \mathbb{E}_{x,z}\left[\log\left(1 - D\left(x, G\left(x, z\right)\right)\right)\right] $$

We augment this with a reconstruction term:

$$ \mathcal{L}_{L1}\left(G\right) = \mathbb{E}_{x,y,z}\left[\left\|y - G\left(x, z\right)\right\|_{1}\right] $$

and we get the final objective as:

$$ G^{*} = \arg\min_{G}\max_{D}\mathcal{L}_{cGAN}\left(G, D\right) + \lambda\mathcal{L}_{L1}\left(G\right) $$

The architectures employed for the generator and discriminator closely follow [DCGAN](https://paperswithcode.com/method/dcgan), with a few modifications:

- Concatenated skip connections are used to "shuttle" low-level information between the input and output, similar to a [U-Net](https://paperswithcode.com/method/u-net).
- A [PatchGAN](https://paperswithcode.com/method/patchgan) discriminator is used, which only penalizes structure at the scale of patches.
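To make the objective concrete, here is a minimal sketch in plain Python that estimates the conditional GAN term, the L1 term, and their weighted sum from small batches of numbers. The function names (`cgan_loss`, `l1_loss`, `full_objective`), the `eps` stabilizer, and the default `lam=100.0` are illustrative assumptions rather than part of the source. The L1 term is averaged over pixels instead of summed, which only rescales $\lambda$. No networks are involved: the discriminator outputs are passed in directly as probabilities.

```python
import math


def cgan_loss(d_real, d_fake, eps=1e-12):
    """Estimate L_cGAN = E[log D(x, y)] + E[log(1 - D(x, G(x, z)))].

    d_real: discriminator probabilities D(x, y) on real pairs.
    d_fake: discriminator probabilities D(x, G(x, z)) on generated pairs.
    eps is an assumed stabilizer that keeps log() away from zero.
    """
    real_term = sum(math.log(p + eps) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p + eps) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def l1_loss(targets, outputs):
    """Mean absolute difference between targets y and outputs G(x, z).

    Each image is a flat list of pixel values. Averaging instead of
    summing is an assumption; it only changes the effective lambda.
    """
    total, count = 0.0, 0
    for y, g in zip(targets, outputs):
        for a, b in zip(y, g):
            total += abs(a - b)
            count += 1
    return total / count


def full_objective(d_real, d_fake, targets, outputs, lam=100.0):
    """L_cGAN(G, D) + lambda * L_L1(G).

    The generator tries to minimize this value while the discriminator
    tries to maximize it. lam=100.0 is an assumed default.
    """
    return cgan_loss(d_real, d_fake) + lam * l1_loss(targets, outputs)


if __name__ == "__main__":
    d_real = [0.9, 0.8]          # D is fairly sure the real pairs are real
    d_fake = [0.3, 0.2]          # D is fairly sure the generated pairs are fake
    targets = [[1.0, 2.0], [0.0, 1.0]]
    outputs = [[0.5, 2.5], [0.0, 0.0]]
    print("L_cGAN:", cgan_loss(d_real, d_fake))
    print("L_L1:  ", l1_loss(targets, outputs))
    print("total: ", full_objective(d_real, d_fake, targets, outputs))
```

In a real training loop the two players would be updated in alternation: the discriminator ascends on `cgan_loss`, and the generator descends on the fake term plus `lam * l1_loss`. The L1 term pulls generated images toward the ground truth, while the adversarial term rewards outputs that the discriminator cannot tell apart from real ones.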