Translations:Attention Mechanisms/31/en
Revision as of 21:57, 27 April 2026
Cross-attention is used when queries come from one sequence and keys/values come from another. In encoder-decoder Transformers, the decoder attends to encoder outputs via cross-attention, enabling the model to condition its generation on the full input context.
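The mechanism described above can be sketched as scaled dot-product attention where the query projection reads from the decoder states and the key/value projections read from the encoder outputs. This is a minimal NumPy sketch, not a production implementation: the projection matrices `Wq`, `Wk`, `Wv` and all dimensions are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(decoder_states, encoder_states, Wq, Wk, Wv):
    """Single-head cross-attention (illustrative sketch).

    Queries come from the decoder sequence; keys and values come
    from the encoder sequence, so each decoder position attends
    over every encoder position.
    """
    Q = decoder_states @ Wq          # (tgt_len, d_k)
    K = encoder_states @ Wk          # (src_len, d_k)
    V = encoder_states @ Wv          # (src_len, d_k)
    scores = Q @ K.T / np.sqrt(Q.shape[-1])   # (tgt_len, src_len)
    weights = softmax(scores, axis=-1)        # rows sum to 1
    return weights @ V, weights               # (tgt_len, d_k), (tgt_len, src_len)

# Toy shapes: 5 encoder positions, 3 decoder positions (assumed values).
rng = np.random.default_rng(0)
d_model, d_k = 8, 4
encoder_out = rng.normal(size=(5, d_model))
decoder_in = rng.normal(size=(3, d_model))
Wq, Wk, Wv = (rng.normal(size=(d_model, d_k)) for _ in range(3))
out, attn = cross_attention(decoder_in, encoder_out, Wq, Wk, Wv)
```

Note that the output has one row per decoder position while the attention weights span all encoder positions, which is exactly how the decoder conditions its generation on the full input context.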