Translations:Attention Mechanisms/29/en

Learned positional embeddings and relative positional encodings (e.g., RoPE, ALiBi) are common alternatives that can generalise better to unseen sequence lengths.