All translations

Found 3 translations.

Name | Current message text
English (en) | {{Term|pre-training}} used the BooksCorpus (800M words) and English Wikipedia (2,500M words), running for 1M {{Term|training step|steps}} with a {{Term|batch size}} of 256 sequences. The total {{Term|pre-training}} compute was substantial for its time, requiring four days on 4 to 16 Cloud TPUs (for Base and Large respectively).
Spanish (es) | El {{Term|pre-training|preentrenamiento}} utilizó BooksCorpus (800M de palabras) y la Wikipedia en inglés (2500M de palabras), ejecutándose durante 1M de {{Term|training step|pasos}} con un {{Term|batch size|tamaño de lote}} de 256 secuencias. El cómputo total de {{Term|pre-training|preentrenamiento}} fue considerable para su época, requiriendo cuatro días en 4 a 16 Cloud TPUs (para Base y Large respectivamente).
Chinese (zh) | {{Term|pre-training|预训练}} 使用了 BooksCorpus(8 亿单词)和英文维基百科(25 亿单词),以 256 个序列的 {{Term|batch size|批量大小}} 运行了 100 万 {{Term|training step|训练步}}。在当时,{{Term|pre-training|预训练}} 的总计算量相当庞大,需要在 4 到 16 个 Cloud TPU 上运行四天(分别对应 Base 和 Large)。
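The numbers in the message above can be sanity-checked with a small sketch. This is a hypothetical summary, not code from any real training pipeline: the dictionary keys and helper names are illustrative assumptions; only the figures (corpus sizes, 1M steps, batch size 256, four days on 4 to 16 Cloud TPUs) come from the message text.

```python
# Hypothetical summary of the pre-training setup described in the message;
# structure and names are illustrative, only the numbers come from the text.
PRETRAINING = {
    "corpora_words": {
        "BooksCorpus": 800_000_000,
        "English Wikipedia": 2_500_000_000,
    },
    "training_steps": 1_000_000,
    "batch_size": 256,          # sequences per step
    "wall_clock_days": 4,
    "hardware": {"Base": "4 Cloud TPUs", "Large": "16 Cloud TPUs"},
}

def total_corpus_words(cfg):
    """Combined size of the pre-training corpora, in words."""
    return sum(cfg["corpora_words"].values())

def total_sequences_processed(cfg):
    """Sequences seen over the full run: steps times batch size."""
    return cfg["training_steps"] * cfg["batch_size"]

print(total_corpus_words(PRETRAINING))         # → 3300000000 (3.3B words)
print(total_sequences_processed(PRETRAINING))  # → 256000000 (256M sequences)
```

The two derived figures follow directly from the message: 800M + 2,500M words gives a 3.3B-word corpus, and 1M steps at 256 sequences per step means 256M sequences processed in total.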