
Translations:Stochastic Gradient Descent/27/zh

From Marovi AI

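Data shuffling: reshuffle the dataset at every epoch so the sample order does not form a repeating, cyclic pattern.
Gradient clipping: cap the gradient norm to prevent exploding updates, especially in recurrent neural networks.
Batch normalization: normalizing the inputs to each layer reduces sensitivity to the learning rate.
Mixed-precision training: using half-precision floating point speeds up SGD on modern GPUs with almost no loss of accuracy.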
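
The first two techniques in the list above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the helper names `clip_by_norm` and `sgd_linear`, the toy data, and the hyperparameters (`lr`, `epochs`, `max_norm`, `seed`) are all assumptions chosen for the example. Batch normalization and mixed-precision training depend on a deep learning framework and are not shown.

```python
import math
import random


def clip_by_norm(grad, max_norm):
    """Rescale grad so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm > max_norm:
        scale = max_norm / norm
        return [g * scale for g in grad]
    return list(grad)


def sgd_linear(xs, ys, lr=0.05, epochs=200, max_norm=1.0, seed=0):
    """Fit y = w*x + b with per-sample SGD (hypothetical toy setup)."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    order = list(range(len(xs)))
    for _ in range(epochs):
        # Data shuffling: draw a fresh sample order every epoch.
        rng.shuffle(order)
        for i in order:
            err = (w * xs[i] + b) - ys[i]
            # Gradient of the squared error with respect to (w, b).
            grad = [2.0 * err * xs[i], 2.0 * err]
            # Gradient clipping: bound the size of each update.
            gw, gb = clip_by_norm(grad, max_norm)
            w -= lr * gw
            b -= lr * gb
    return w, b


# Noise-free toy data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = sgd_linear(xs, ys)
print(w, b)
```

On this noise-free data the fitted parameters should approach w = 2 and b = 1. Clipping rescales the gradient rather than truncating each component, so the update direction is preserved and only its length is bounded.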
