Adam: A Method for Stochastic Optimization is a 2015 paper by Kingma and Ba that introduced the Adam optimizer, an algorithm for first-order gradient-based optimization of stochastic objective functions. Adam combines the advantages of two earlier methods, AdaGrad (which adapts learning rates per parameter) and RMSProp (which uses a running average of squared gradients), into a single algorithm with bias-corrected moment estimates. Adam has become a default choice of optimizer for training neural networks across many domains.
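
The update rule can be summarized in a short sketch: keep exponential moving averages of the gradient and of its element-wise square, correct both for initialization bias, then step each parameter by the ratio of the two. The NumPy code below is a minimal illustration of that rule, not the paper's reference implementation; the function name adam_update and the toy quadratic objective are illustrative choices, while the default hyperparameters (alpha=0.001, beta1=0.9, beta2=0.999, eps=1e-8) are the values suggested in the paper.

<syntaxhighlight lang="python">
import numpy as np

def adam_update(theta, grad, m, v, t,
                alpha=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam step for parameters theta given gradient grad.

    m and v are the running first- and second-moment estimates;
    t is the 1-based timestep used for bias correction.
    """
    m = beta1 * m + (1 - beta1) * grad        # moving average of gradients
    v = beta2 * v + (1 - beta2) * grad ** 2   # moving average of squared gradients
    m_hat = m / (1 - beta1 ** t)              # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)              # bias-corrected second moment
    theta = theta - alpha * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Toy example: minimize f(theta) = theta^2 starting from theta = 5.0
theta = np.array([5.0])
m = np.zeros_like(theta)
v = np.zeros_like(theta)
for t in range(1, 501):
    grad = 2 * theta                          # gradient of theta^2
    theta, m, v = adam_update(theta, grad, m, v, t, alpha=0.1)
print(theta)                                  # converges toward 0
</syntaxhighlight>

Because the step is scaled by the ratio of the two moment estimates, its magnitude stays roughly bounded by alpha regardless of the raw gradient scale, which is the per-parameter adaptivity that distinguishes Adam from plain SGD. In practice one would use a framework implementation such as torch.optim.Adam rather than hand-rolling the update.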