where <math>\alpha</math> is the step size (learning rate) and <math>\epsilon</math> is a small constant for numerical stability.
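
For illustration, here is a minimal NumPy sketch of the full Adam update in which this step size <math>\alpha</math> and constant <math>\epsilon</math> appear, using the default hyperparameters recommended in the paper (<math>\alpha = 0.001</math>, <math>\beta_1 = 0.9</math>, <math>\beta_2 = 0.999</math>, <math>\epsilon = 10^{-8}</math>); the function name and code organization are illustrative assumptions, not the paper's reference implementation:

<syntaxhighlight lang="python">
import numpy as np

# Illustrative sketch of one Adam step; not the paper's reference code.
def adam_step(theta, grad, m, v, t,
              alpha=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    m = beta1 * m + (1 - beta1) * grad        # biased first-moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2   # biased second-moment estimate
    m_hat = m / (1 - beta1 ** t)              # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)              # bias-corrected second moment
    # alpha scales the step; eps prevents division by zero when v_hat is tiny
    theta = theta - alpha * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Usage: take Adam steps on f(theta) = theta^2, whose gradient is 2*theta.
theta, m, v = np.array([5.0]), np.zeros(1), np.zeros(1)
for t in range(1, 1001):
    theta, m, v = adam_step(theta, 2 * theta, m, v, t)
</syntaxhighlight>

Note that the effective step taken is at most roughly <math>\alpha</math> per iteration, since the bias-corrected ratio <math>\hat{m}_t / \sqrt{\hat{v}_t}</math> is approximately bounded near 1 in magnitude.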