What is: Sparse Sinkhorn Attention?

Source: Sparse Sinkhorn Attention
Year: 2020
Data Source: CC BY-SA - https://paperswithcode.com

Sparse Sinkhorn Attention (SSA) is an attention mechanism that reduces the memory complexity of dot-product attention and can learn sparse attention outputs. It is based on differentiable sorting of internal representations within the self-attention module. SSA incorporates a meta sorting network that learns to rearrange and sort input sequences. Sinkhorn normalization iteratively normalizes the rows and columns of the sorting matrix so that it approximates a doubly stochastic matrix, which is a relaxed form of a permutation matrix. The attention mechanism itself then acts on the block-sorted sequences.
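
The row and column normalization described above can be illustrated with a minimal pure-Python sketch. It is not the paper's implementation: the `scores` matrix stands in for the output of the sorting network, the names `sinkhorn` and `sort_blocks` are hypothetical, and the normalization is done in the plain (not log) domain for readability. Each block is reduced to a small vector so the soft sorting step is easy to follow.

```python
import math


def sinkhorn(scores, n_iters=50):
    """Alternately normalize rows and columns of exp(scores).

    Returns a square matrix that is approximately doubly stochastic.
    """
    # Exponentiate so every entry is strictly positive.
    m = [[math.exp(s) for s in row] for row in scores]
    n = len(m)
    for _ in range(n_iters):
        # Row normalization: each row sums to 1.
        m = [[v / sum(row) for v in row] for row in m]
        # Column normalization: each column sums to 1.
        col_sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
        m = [[m[i][j] / col_sums[j] for j in range(n)] for i in range(n)]
    return m


def sort_blocks(perm, blocks):
    """Soft-sort blocks: output block i is a weighted mix of input blocks."""
    dim = len(blocks[0])
    return [
        [sum(perm[i][j] * blocks[j][d] for j in range(len(blocks)))
         for d in range(dim)]
        for i in range(len(perm))
    ]


# Assumed scores for 3 blocks (in SSA these would come from the sorting network).
scores = [[2.0, 0.1, 0.3],
          [0.2, 0.1, 3.0],
          [0.5, 2.5, 0.1]]
perm = sinkhorn(scores)

# Each block is summarized as a 2-d vector purely for illustration.
blocks = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
sorted_blocks = sort_blocks(perm, blocks)
```

Here `perm` acts as a relaxed permutation over blocks: each row puts most of its weight on one input block, so `sorted_blocks` is close to a reordering of `blocks`. In this sketch the attention step would then operate on the reordered blocks rather than on the full sequence.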