Cross-attention is used when queries come from one sequence and keys/values come from another. In encoder-decoder Transformers, the decoder attends to encoder outputs via cross-attention, enabling the model to condition its generation on the full input context.
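
As a rough illustration (not part of the original page), the sketch below shows single-head cross-attention in NumPy: queries are projected from decoder states, while keys and values are projected from encoder outputs, so every decoder position can attend over the entire input. The function name `cross_attention`, the weight names `Wq`, `Wk`, `Wv`, and the toy shapes are all illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(decoder_states, encoder_outputs, Wq, Wk, Wv):
    """Single-head cross-attention sketch: queries from the decoder,
    keys/values from the encoder (hypothetical helper, for illustration)."""
    Q = decoder_states @ Wq       # (T_dec, d_k): one query per decoder position
    K = encoder_outputs @ Wk      # (T_enc, d_k)
    V = encoder_outputs @ Wv      # (T_enc, d_v)
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)     # (T_dec, T_enc): decoder-to-encoder affinities
    weights = softmax(scores, axis=-1)  # each decoder step attends over all encoder steps
    return weights @ V                  # (T_dec, d_v): context vectors for the decoder

# Toy example: 4 encoder positions, 2 decoder positions, model dim 8.
rng = np.random.default_rng(0)
d_model = 8
enc = rng.standard_normal((4, d_model))   # encoder outputs (full input context)
dec = rng.standard_normal((2, d_model))   # decoder hidden states
Wq, Wk, Wv = (rng.standard_normal((d_model, d_model)) for _ in range(3))
out = cross_attention(dec, enc, Wq, Wk, Wv)
print(out.shape)  # (2, 8): one context vector per decoder position
```

In practice this is done with multiple heads and learned projections; for example, PyTorch's `nn.MultiheadAttention` exposes the same split directly by taking separate `query` and `key`/`value` tensors in its forward call.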