All translations
| Language | Current message text |
|---|---|
| English (en) | {{Term|pre-training|Pre-training}} used the BooksCorpus (800M words) and English Wikipedia (2,500M words), running for 1M {{Term|training step|steps}} with a {{Term|batch size}} of 256 sequences. The total {{Term|pre-training}} compute was substantial for its time, requiring four days on 4 Cloud TPUs for Base and 16 for Large. |
| Spanish (es) | El {{Term|pre-training|preentrenamiento}} utilizó BooksCorpus (800M de palabras) y la Wikipedia en inglés (2500M de palabras), ejecutándose durante 1M de {{Term|training step|pasos}} con un {{Term|batch size|tamaño de lote}} de 256 secuencias. El cómputo total de {{Term|pre-training|preentrenamiento}} fue considerable para su época: requirió cuatro días en 4 Cloud TPUs para Base y 16 para Large. |
| Chinese (zh) | {{Term|pre-training|预训练}} 使用了 BooksCorpus(8 亿单词)和英文维基百科(25 亿单词),以 256 个序列的 {{Term|batch size|批量大小}} 运行了 100 万 {{Term|training step|训练步}}。在当时,{{Term|pre-training|预训练}} 的总计算量相当庞大,需要在 4 个(Base)到 16 个(Large)Cloud TPU 上运行四天。 |
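
The figures quoted in the message above lend themselves to a quick sanity check. Below is a minimal Python sketch, with illustrative names only (the dictionary layout is not taken from any BERT codebase or framework), that gathers the stated hyperparameters and derives the total number of sequences processed during pre-training.

```python
# Minimal sketch: the pre-training figures quoted above, gathered in one
# place. Keys and names are illustrative, not an actual framework config.
pretraining = {
    "corpora_words": {
        "BooksCorpus": 800_000_000,          # 800M words
        "English Wikipedia": 2_500_000_000,  # 2,500M words
    },
    "training_steps": 1_000_000,   # 1M steps
    "batch_size": 256,             # sequences per step
    "wall_clock_days": 4,          # on 4 (Base) to 16 (Large) Cloud TPUs
}

# Total sequences processed over the full run: 1M steps x 256 sequences/step.
total_sequences = pretraining["training_steps"] * pretraining["batch_size"]
total_corpus_words = sum(pretraining["corpora_words"].values())

print(f"Sequences seen during pre-training: {total_sequences:,}")    # 256,000,000
print(f"Total corpus size in words:         {total_corpus_words:,}") # 3,300,000,000
```

The arithmetic is straightforward: 1M steps at 256 sequences per step means roughly 256 million sequences are processed, over a combined corpus of about 3.3 billion words.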