What is: Conditional Position Encoding Vision Transformer?
Source | Conditional Positional Encodings for Vision Transformers |
Year | 2021 |
Data Source | CC BY-SA - https://paperswithcode.com |
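CPVT, or Conditional Position Encoding Vision Transformer, is a vision transformer that uses conditional positional encodings: instead of reading from a fixed or learned table with one entry per position, the encodings are generated dynamically from the local neighborhood of each input token. Apart from these encodings, it follows the same architecture as ViT and DeiT.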
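
To make the idea concrete, here is a minimal sketch of how a conditional positional encoding can be computed. It assumes the encoding comes from a depthwise 3x3 convolution with zero padding over the tokens arranged back into their 2D grid, with the result added to the tokens. The function name `peg`, its parameters, and the toy weights are hypothetical and chosen for illustration. The class token and all training details are left out, so this is not the reference implementation.

```python
def peg(tokens, height, width, kernels, biases):
    """Illustrative conditional positional encoding step (hypothetical helper).

    tokens:  list of height*width token vectors in row-major order, each of length C
    kernels: C depthwise 3x3 kernels, indexed as kernels[c][row][col]
    biases:  C bias values, one per channel
    Returns a new token list: token + depthwise 3x3 convolution of its neighborhood.
    """
    assert len(tokens) == height * width
    channels = len(tokens[0])
    out = []
    for y in range(height):
        for x in range(width):
            vec = []
            for c in range(channels):
                acc = biases[c]
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        # Zero padding: neighbors outside the grid contribute nothing.
                        if 0 <= ny < height and 0 <= nx < width:
                            acc += kernels[c][dy + 1][dx + 1] * tokens[ny * width + nx][c]
                # Residual connection: the encoding is added to the original token.
                vec.append(tokens[y * width + x][c] + acc)
            out.append(vec)
    return out


# Toy example: one channel, an all-ones kernel, and identical tokens everywhere.
ones_kernel = [[[1.0] * 3 for _ in range(3)]]
grid = [[1.0] for _ in range(9)]
encoded = peg(grid, 3, 3, ones_kernel, [0.0])
print(encoded[0], encoded[4])  # corner vs. center token: [5.0] [10.0]

# The same weights apply unchanged to a different grid size (here 2x4).
encoded_wide = peg([[1.0] for _ in range(8)], 2, 4, ones_kernel, [0.0])
```

In this toy run every token has the same content, yet the corner and center tokens receive different values because the zero padding at the border changes what each neighborhood contains. Since the encoding depends only on local neighborhoods and not on a table with a fixed number of positions, the same weights can be applied to grids of other sizes.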