What is: Siamese Multi-depth Transformer-based Hierarchical Encoder?
Source | Beyond 512 Tokens: Siamese Multi-depth Transformer-based Hierarchical Encoder for Long-Form Document Matching |
Year | 2020 |
Data Source | CC BY-SA - https://paperswithcode.com |
SMITH, or Siamese Multi-depth Transformer-based Hierarchical Encoder, is a Transformer-based model for document representation learning and matching. It incorporates several design choices that adapt self-attention models to long text inputs. For pre-training, a masked sentence block language modeling task is used in addition to the masked word language modeling task from BERT, in order to capture relations between sentence blocks within a document. The input document is split into sentence blocks, each of which is first encoded by a sentence-level Transformer; given the resulting sequence of sentence block representations, document-level Transformers then learn a contextual representation for each sentence block and the final document representation.
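The two-level encoding can be illustrated with a minimal PyTorch sketch, not the authors' implementation: a document arrives pre-split into fixed-length sentence blocks, a sentence-level Transformer contextualizes tokens within each block, the pooled block vectors pass through a document-level Transformer, and a final pooling step yields the document embedding. All hyperparameters and the mean-pooling choices here are illustrative assumptions; SMITH's actual block construction and pooling differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HierarchicalEncoder(nn.Module):
    """Sketch of a two-level (sentence-block then document) Transformer encoder."""

    def __init__(self, vocab_size=30522, dim=256, heads=4,
                 sent_layers=2, doc_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        sent_layer = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        doc_layer = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        # Sentence-block-level Transformer: contextualizes tokens within a block.
        self.sent_encoder = nn.TransformerEncoder(sent_layer, sent_layers)
        # Document-level Transformer: contextualizes block representations.
        self.doc_encoder = nn.TransformerEncoder(doc_layer, doc_layers)

    def forward(self, token_ids):
        # token_ids: (batch, num_blocks, block_len), the document already
        # split into fixed-length sentence blocks (an assumption here).
        b, n, l = token_ids.shape
        tokens = self.embed(token_ids).view(b * n, l, -1)
        tokens = self.sent_encoder(tokens)
        # Pool tokens into one vector per sentence block (mean pooling is a
        # simplification of the paper's pooling scheme).
        blocks = tokens.mean(dim=1).view(b, n, -1)
        blocks = self.doc_encoder(blocks)
        # Pool block representations into the final document embedding.
        return blocks.mean(dim=1)

# Siamese matching: one shared encoder embeds both documents, and a
# similarity function (cosine here) scores the pair.
encoder = HierarchicalEncoder()
doc_a = torch.randint(0, 30522, (1, 8, 32))  # 8 blocks of 32 tokens each
doc_b = torch.randint(0, 30522, (1, 8, 32))
score = F.cosine_similarity(encoder(doc_a), encoder(doc_b))
print(score)
```

Because each document is encoded independently by the shared encoder, document embeddings can be precomputed and compared cheaply at matching time, which is the main appeal of the Siamese setup for long-form document pairs.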