Adam: A Method for Stochastic Optimization


Adam: A Method for Stochastic Optimization is a 2015 paper by Diederik P. Kingma and Jimmy Ba that introduced the Adam optimizer, an algorithm for first-order gradient-based optimization of stochastic objective functions. Adam combines the advantages of two earlier methods, AdaGrad (which adapts learning rates per parameter) and RMSProp (which uses a running average of squared gradients), into a single algorithm that maintains bias-corrected estimates of the first and second moments of the gradient. Adam has since become one of the most widely used optimizers for training deep neural networks.
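
The update just described (exponential moving averages of the gradient and squared gradient, bias correction, and a per-parameter step) can be written in a few lines. Below is a minimal sketch of one Adam step in Python with NumPy, using the default hyperparameters reported in the paper (step size 0.001, β1 = 0.9, β2 = 0.999, ε = 1e-8); the function name adam_update and the toy quadratic objective are illustrative and not part of the original paper.

```python
import numpy as np

def adam_update(params, grads, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam step: update the moment estimates, correct their bias,
    and take a per-parameter adaptive gradient step."""
    m = beta1 * m + (1 - beta1) * grads        # first moment (running mean of gradients)
    v = beta2 * v + (1 - beta2) * grads ** 2   # second moment (running mean of squared gradients)
    m_hat = m / (1 - beta1 ** t)               # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)               # bias-corrected second moment
    params = params - lr * m_hat / (np.sqrt(v_hat) + eps)
    return params, m, v

# Illustrative usage: minimize f(x) = x^2 starting from x = 5.
x = np.array([5.0])
m = np.zeros_like(x)
v = np.zeros_like(x)
for t in range(1, 1001):       # t starts at 1 so the bias correction is well defined
    grad = 2 * x               # gradient of x^2
    x, m, v = adam_update(x, grad, m, v, t)
print(x)                       # approaches 0
```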