Translations:Stochastic Gradient Descent/27/zh

    From Marovi AI
    Revision as of 03:38, 27 April 2026 by DeployBot (talk | contribs) (Batch translate Stochastic Gradient Descent unit 27 → zh)
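    • Data shuffling: reshuffle the dataset at every epoch so that updates do not follow a repeating, cyclic pattern.
    • Gradient clipping: cap the gradient norm to prevent exploding updates, especially in recurrent neural networks.
    • Batch normalization: normalizing the inputs to each layer reduces sensitivity to the learning rate.
    • Mixed-precision training: using half-precision floating point speeds up SGD on modern GPUs with almost no loss of accuracy.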
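
The first two items in the list can be illustrated with a minimal pure-Python sketch. It is not taken from the original article: the helper names `clip_by_norm` and `sgd_linear`, the toy dataset, and the hyperparameters (`lr`, `epochs`, `max_norm`) are all assumptions chosen for illustration. The sketch fits a one-dimensional linear model with per-sample SGD, reshuffles the visiting order at every epoch, and rescales any gradient whose L2 norm exceeds a threshold.

```python
import math
import random


def clip_by_norm(grad, max_norm):
    """Rescale grad so that its L2 norm does not exceed max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm > max_norm:
        scale = max_norm / norm
        return [g * scale for g in grad]
    return list(grad)


def sgd_linear(data, lr=0.05, epochs=200, max_norm=5.0, seed=0):
    """Fit y ~ w*x + b with per-sample SGD (hypothetical helper)."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    order = list(range(len(data)))
    for _ in range(epochs):
        rng.shuffle(order)  # data shuffling: new visiting order each epoch
        for i in order:
            x, y = data[i]
            err = (w * x + b) - y
            # Gradient of the squared error err**2 with respect to (w, b).
            grad = [2.0 * err * x, 2.0 * err]
            gw, gb = clip_by_norm(grad, max_norm)  # gradient clipping
            w -= lr * gw
            b -= lr * gb
    return w, b


# Toy, noise-free data generated from y = 3x + 1 (illustrative only).
data = [(x / 10.0, 3.0 * (x / 10.0) + 1.0) for x in range(-10, 11)]
w, b = sgd_linear(data)
print(w, b)
```

Clipping by norm rescales the whole gradient vector, so the update direction is preserved and only the step length is bounded. In this toy problem the clip is active only during the first few updates, when the error is still large.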