Our main contribution is the Guiding Points Network, which integrates the information from all conditions to generate guiding points. By applying transformation matrices to the scene entities (human and objects) and weighting the results with attention, we can predict the region that the target object will span.
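To make the idea of attention-weighted transformations concrete, here is a minimal NumPy sketch. It is not the LSDM architecture: the function name `guiding_points`, the use of 4×4 homogeneous matrices, the assumption that every entity has the same number of points, and the scalar attention score per entity are all assumptions made for illustration. In the actual model these quantities would be predicted by a network from the conditions.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()


def guiding_points(entities, transforms, scores):
    """Combine transformed scene entities into one set of guiding points.

    entities:   (m, n, 3) array, m entities with n 3-D points each (assumed).
    transforms: (m, 4, 4) array, one homogeneous transform per entity (assumed).
    scores:     (m,) array of unnormalized attention scores (assumed).
    """
    m, n, _ = entities.shape
    # Append a 1 to every point so a 4x4 matrix can rotate and translate it.
    homogeneous = np.concatenate([entities, np.ones((m, n, 1))], axis=-1)
    # Apply each entity's own matrix to each of its points.
    moved = np.einsum("mij,mnj->mni", transforms, homogeneous)[..., :3]
    # Attention weights sum to 1 across entities.
    weights = softmax(scores)
    # Weighted sum over entities gives (n, 3) guiding points.
    return np.einsum("m,mni->ni", weights, moved)


# Two hypothetical entities with 5 points each.
rng = np.random.default_rng(0)
entities = rng.normal(size=(2, 5, 3))
transforms = np.stack([np.eye(4), np.eye(4)])
transforms[0, :3, 3] = [1.0, 0.0, 0.0]  # shift entity 0 along x
scores = np.array([50.0, 0.0])          # attention almost entirely on entity 0
points = guiding_points(entities, transforms, scores)
```

With the scores above, the result is essentially entity 0 shifted by one unit along x, which shows how the attention weights decide which entity dominates the predicted location.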

**Kernel Inducing Points**, or **KIP**, is a meta-learning algorithm for learning datasets that mitigate the challenges of naturally occurring datasets without a significant sacrifice in performance. KIP uses kernel ridge regression to learn $\epsilon$-approximate datasets. It can be regarded as an adaptation of the inducing point method for Gaussian processes to the case of kernel ridge regression.
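The sketch below illustrates the core idea under simplifying assumptions: a small support set is scored by how well kernel ridge regression fitted on it predicts the labels of a larger target set, and the support set is then adjusted to lower that loss. For brevity, only the support labels are optimized here, using the closed-form gradient of the squared loss; updating the support inputs as well would typically rely on automatic differentiation. The RBF kernel, the toy sine data, and all function names are assumptions chosen for the example.

```python
import numpy as np


def rbf_kernel(a, b, gamma=1.0):
    """RBF kernel matrix between the rows of a and the rows of b."""
    sq = (a**2).sum(1)[:, None] + (b**2).sum(1)[None, :] - 2.0 * a @ b.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


def krr_predict(x_support, y_support, x_query, reg=1e-2):
    """Kernel ridge regression fitted on the support set, evaluated on x_query."""
    k_ss = rbf_kernel(x_support, x_support)
    k_qs = rbf_kernel(x_query, x_support)
    alpha = np.linalg.solve(k_ss + reg * np.eye(len(x_support)), y_support)
    return k_qs @ alpha


def kip_loss(x_support, y_support, x_target, y_target, reg=1e-2):
    """Squared error of the support-set KRR predictor on the target set."""
    resid = y_target - krr_predict(x_support, y_support, x_target, reg)
    return 0.5 * np.sum(resid**2)


def learn_support_labels(x_support, y_support, x_target, y_target,
                         reg=1e-2, steps=200):
    """Gradient descent on the support labels only (inputs stay fixed)."""
    k_ss = rbf_kernel(x_support, x_support)
    k_ts = rbf_kernel(x_target, x_support)
    # Predictions are linear in the labels: y_hat = a_mat @ y_support.
    a_mat = np.linalg.solve(k_ss + reg * np.eye(len(x_support)), k_ts.T).T
    # Step size 1/L, where L is the largest eigenvalue of a_mat.T @ a_mat.
    lr = 1.0 / np.linalg.norm(a_mat, 2) ** 2
    y = y_support.copy()
    for _ in range(steps):
        grad = -a_mat.T @ (y_target - a_mat @ y)
        y -= lr * grad
    return y


# Toy target dataset: 200 noisy-free samples of a sine curve.
rng = np.random.default_rng(0)
x_target = rng.uniform(-3.0, 3.0, size=(200, 1))
y_target = np.sin(x_target)

# Learned dataset: 10 support points, labels initialized to zero.
x_support = np.linspace(-3.0, 3.0, 10)[:, None]
y_support = np.zeros((10, 1))

loss_before = kip_loss(x_support, y_support, x_target, y_target)
y_learned = learn_support_labels(x_support, y_support, x_target, y_target)
loss_after = kip_loss(x_support, y_learned, x_target, y_target)
print(loss_before, loss_after)
```

Because the loss is a convex quadratic in the support labels, the fixed step size `1/L` guarantees that each iteration lowers it. The ten learned points then act as a compressed stand-in for the 200 target points when used to fit the same regressor.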

Source: Dataset Meta-Learning from Kernel Ridge-Regression

LSDM

GANformer is a novel and efficient type of [transformer](https://paperswithcode.com/method/transformer) for visual generative modeling. The network employs a bipartite structure that enables long-range interactions across an image while keeping the computational cost linear, so that it can readily scale to high-resolution synthesis. It iteratively propagates information from a set of latent variables to the evolving visual features and vice versa, to support the refinement of each in light of the other and to encourage the emergence of compositional representations of objects and scenes.
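A rough NumPy sketch of one bipartite exchange is shown below. It illustrates why the cost is linear in the number of image positions: attention is computed only between `k` latents and `n` features, so each score matrix has `n * k` entries rather than `n * n`. This is a simplification, not the paper's layer: learned query, key, and value projections are omitted, and the plain residual update is an assumption made for brevity. The function names are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)


def attention_weights(queries, keys):
    """Scaled dot-product attention weights, shape (num_queries, num_keys)."""
    d = queries.shape[-1]
    return softmax(queries @ keys.T / np.sqrt(d), axis=-1)


def bipartite_step(latents, features):
    """One round of information exchange between latents and image features.

    latents:  (k, d) array, a small set of latent variables.
    features: (n, d) array, one vector per image position, with n >> k.
    """
    # Features -> latents: each latent gathers from all n positions.
    w_lat = attention_weights(latents, features)        # (k, n)
    new_latents = latents + w_lat @ features
    # Latents -> features: each position reads from the k latents.
    w_feat = attention_weights(features, new_latents)   # (n, k)
    new_features = features + w_feat @ new_latents
    return new_latents, new_features


rng = np.random.default_rng(0)
k, n, d = 4, 64, 8            # 4 latents, an 8x8 feature grid, 8 channels
latents = rng.normal(size=(k, d))
features = rng.normal(size=(n, d))
latents_out, features_out = bipartite_step(latents, features)
```

Doubling the image resolution along each side quadruples `n` and therefore quadruples the size of both score matrices, whereas full self-attention over the features would grow sixteenfold.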

Source: [Generative Adversarial Transformers](https://arxiv.org/pdf/2103.01209v2.pdf)

Image source: [Generative Adversarial Transformers](https://arxiv.org/pdf/2103.01209v2.pdf)

Year	2000
Data Source	CC BY-SA - https://paperswithcode.com

What is: Language-driven Scene Synthesis using Multi-conditional Diffusion Model?
