What is: RealFormer?

Source: RealFormer: Transformer Likes Residual Attention
Year: 2020
Data Source: CC BY-SA - https://paperswithcode.com

RealFormer is a Transformer variant built on the idea of residual attention. It adds skip edges to the backbone Transformer to create multiple direct paths, one for each type of attention module, and it introduces no new parameters or hyper-parameters. Specifically, RealFormer uses a Post-LN style Transformer as its backbone and adds skip edges that connect the Multi-Head Attention modules in adjacent layers, so that each layer's raw (pre-softmax) attention scores are added to the scores computed by the next layer.
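
To make the skip edge concrete, here is a minimal, dependency-free sketch of residual attention in plain Python. It is an illustration under stated assumptions, not the paper's implementation: it uses a single head, omits the learned Q/K/V projections and the feed-forward sublayer, and uses a layer norm without learned scale or bias. The names `residual_attention`, `encoder_stack`, and `prev_scores` are made up for this example. The only change from standard attention is that the previous layer's pre-softmax scores are added before the softmax, and the summed scores are passed on to the next layer.

```python
import math


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def transpose(a):
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    """Multiply an (n x k) matrix by a (k x m) matrix, both lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def layer_norm(row, eps=1e-5):
    """Layer norm without learned scale/bias (a simplification)."""
    mean = sum(row) / len(row)
    var = sum((v - mean) ** 2 for v in row) / len(row)
    return [(v - mean) / math.sqrt(var + eps) for v in row]


def residual_attention(q, k, v, prev_scores=None):
    """Single-head attention with a residual path on the raw scores.

    q, k, v: lists of rows, each of shape (seq_len, d_head).
    prev_scores: pre-softmax scores from the previous layer, or None
    for the first layer. Returns (output, scores), where `scores` is
    what the next layer should receive as `prev_scores`.
    """
    scale = 1.0 / math.sqrt(len(q[0]))
    scores = [[s * scale for s in row] for row in matmul(q, transpose(k))]
    if prev_scores is not None:
        # The skip edge: add the previous layer's scores. No new parameters.
        scores = [[s + p for s, p in zip(row, prev_row)]
                  for row, prev_row in zip(scores, prev_scores)]
    weights = [softmax(row) for row in scores]
    return matmul(weights, v), scores


def encoder_stack(x, num_layers):
    """Stack of Post-LN attention sublayers sharing a residual score path."""
    prev = None
    for _ in range(num_layers):
        # Assumption: x is used directly as Q, K and V (no projections).
        attn_out, prev = residual_attention(x, x, x, prev)
        # Post-LN: normalize after the usual residual connection.
        x = [layer_norm([a + b for a, b in zip(x_row, out_row)])
             for x_row, out_row in zip(x, attn_out)]
    return x, prev
```

Calling `encoder_stack(x, num_layers=3)` on a small list-of-rows input threads the score matrix through all three layers. In a multi-head setting the same addition would be applied per head, which is why the skip edge needs no extra weights.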