- ELMo (Peters et al., 2018) — uses a bidirectional LSTM to generate context-dependent word representations.
- BERT (Devlin et al., 2019) — uses a Transformer encoder trained with masked language modelling.
- GPT series (Radford et al., 2018–) — uses a Transformer decoder trained autoregressively.
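The common idea in all three models can be illustrated with a toy sketch: unlike a static embedding table, a bidirectional recurrent pass gives the same word different vectors in different contexts. The snippet below is a minimal NumPy illustration with random weights, not an implementation of ELMo, BERT, or GPT; the vocabulary, dimensions, and the plain `tanh` recurrence are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy vocabulary with random static embeddings (8-dim, an arbitrary choice).
vocab = ["the", "bank", "river", "loan"]
emb = {w: rng.standard_normal(8) for w in vocab}

# Shared random weights for a minimal recurrent pass (stand-in for an LSTM).
W_in = rng.standard_normal((8, 8)) * 0.1
W_h = rng.standard_normal((8, 8)) * 0.1

def rnn(xs):
    """Run a simple tanh recurrence over a list of input vectors."""
    h = np.zeros(8)
    out = []
    for x in xs:
        h = np.tanh(W_in @ x + W_h @ h)
        out.append(h)
    return out

def contextual(sentence):
    """Return a context-dependent vector for each token position."""
    xs = [emb[w] for w in sentence]
    fwd = rnn(xs)              # left-to-right pass
    bwd = rnn(xs[::-1])[::-1]  # right-to-left pass
    # Concatenate both directions, as a bidirectional layer does.
    return [np.concatenate([f, b]) for f, b in zip(fwd, bwd)]

v_river = contextual(["the", "river", "bank"])[2]  # "bank" near "river"
v_loan = contextual(["the", "bank", "loan"])[1]    # "bank" near "loan"

# The static embedding of "bank" is fixed, but its contextual
# representations differ between the two sentences.
print(np.allclose(v_river, v_loan))  # False
```

The same token yields different vectors because each representation depends on the whole sentence, which is precisely what separates these models from static embeddings such as word2vec or GloVe.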