All translations

Found 3 translations.

Name	Current message text
English (en)	The encoder consists of six identical layers, each containing a multi-head self-{{Term|attention}} sublayer followed by a position-wise feed-forward network, with residual connections and {{Term|layer normalization}} around each sublayer. The decoder adds a third sublayer that performs multi-head {{Term|attention}} over the encoder output, and masks future positions in the self-{{Term|attention}} to preserve the autoregressive property.
Spanish (es)	El codificador consta de seis capas idénticas, cada una con una subcapa de auto-{{Term|attention|atención}} multi-cabeza seguida de una red feed-forward por posición, con conexiones residuales y {{Term|layer normalization|normalización de capa}} alrededor de cada subcapa. El decodificador añade una tercera subcapa que realiza {{Term|attention|atención}} multi-cabeza sobre la salida del codificador, y enmascara las posiciones futuras en la auto-{{Term|attention|atención}} para preservar la propiedad autorregresiva.
Chinese (zh)	编码器由六个相同的层组成，每层包含一个多头自 {{Term|attention|注意力}} 子层，随后是一个逐位置前馈网络，每个子层周围都设有残差连接和 {{Term|layer normalization|层归一化}}。解码器额外增加了第三个子层，对编码器输出执行多头 {{Term|attention|注意力}}，并在自 {{Term|attention|注意力}} 中对未来位置进行掩码以保持自回归特性。
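
The message text above says the decoder masks future positions in self-attention to preserve the autoregressive property. A minimal sketch of that masking step follows, in plain Python with no external libraries. The function names `causal_mask` and `masked_softmax` and the example scores are illustrative assumptions, not part of the message or of any particular library; a real model would apply this to scaled query-key dot products, once per attention head.

```python
import math

def causal_mask(n):
    """Return an n x n matrix; entry [i][j] is True when position i may attend to j (j <= i)."""
    return [[j <= i for j in range(n)] for i in range(n)]

def masked_softmax(scores, mask):
    """Row-wise softmax over scores; disallowed entries receive zero weight."""
    weights = []
    for row, allowed in zip(scores, mask):
        # Replace disallowed scores with -inf so that exp() maps them to 0.
        masked = [s if ok else -math.inf for s, ok in zip(row, allowed)]
        peak = max(masked)  # subtract the row maximum for numerical stability
        exps = [math.exp(s - peak) for s in masked]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights

# Hypothetical raw attention scores for a 3-token sequence.
scores = [[0.0, 1.0, 2.0],
          [1.0, 0.0, 3.0],
          [2.0, 2.0, 2.0]]
weights = masked_softmax(scores, causal_mask(3))
```

With this mask, each row of `weights` still sums to 1, but position i places no weight on any position after i, so the prediction at a given position can depend only on earlier outputs.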