Translations:BERT Pre-training of Deep Bidirectional Transformers/14/en
The input representation is the sum of token embeddings, segment embeddings (indicating whether a token belongs to sentence A or sentence B), and positional embeddings. BERT uses WordPiece tokenization with a 30,000-token vocabulary.
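A minimal PyTorch sketch of this scheme (the class name `BertInputEmbeddings` and the example token ids are illustrative, not from the paper): each of the three embedding tables maps into the same hidden dimension, so the three lookups can simply be added element-wise. The sizes follow BERT-base (hidden size 768, maximum sequence length 512) and the 30,000-token WordPiece vocabulary; BERT also applies layer normalization and dropout to the summed embeddings, which the sketch includes.

```python
import torch
import torch.nn as nn

class BertInputEmbeddings(nn.Module):
    """Sum of token, segment (sentence A/B), and position embeddings."""

    def __init__(self, vocab_size=30000, hidden_size=768,
                 max_position=512, dropout=0.1):
        super().__init__()
        self.token_embeddings = nn.Embedding(vocab_size, hidden_size)     # WordPiece ids
        self.segment_embeddings = nn.Embedding(2, hidden_size)            # 0 = sentence A, 1 = sentence B
        self.position_embeddings = nn.Embedding(max_position, hidden_size)  # learned positions
        self.layer_norm = nn.LayerNorm(hidden_size)
        self.dropout = nn.Dropout(dropout)

    def forward(self, token_ids, segment_ids):
        # token_ids, segment_ids: (batch, seq_len)
        positions = torch.arange(token_ids.size(1), device=token_ids.device)
        embeddings = (
            self.token_embeddings(token_ids)
            + self.segment_embeddings(segment_ids)
            + self.position_embeddings(positions)  # broadcasts over the batch dim
        )
        return self.dropout(self.layer_norm(embeddings))

# Usage: a packed "[CLS] sentence A [SEP] sentence B [SEP]" pair,
# with segment ids marking which sentence each token belongs to.
emb = BertInputEmbeddings()
token_ids = torch.tensor([[101, 7592, 102, 2088, 102]])  # hypothetical WordPiece ids
segment_ids = torch.tensor([[0, 0, 0, 1, 1]])            # A, A, A, B, B
out = emb(token_ids, segment_ids)                        # shape: (1, 5, 768)
```

Because all three embeddings live in the same vector space, downstream Transformer layers see a single representation per position that already encodes identity, sentence membership, and order.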