- Logistic regression on MNIST: Adam converged faster than SGD with momentum, AdaGrad, and RMSProp (see the sketch after this list).
- Multi-layer neural networks on MNIST: Adam achieved the lowest training cost, with convergence speed comparable to or better than competing methods.
- Convolutional neural networks on CIFAR-10: Adam performed comparably to SGD with carefully tuned momentum and learning rate schedules.
- Variational autoencoders (VAEs): Adam was used successfully to optimize the variational lower bound, demonstrating its applicability to generative models.
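
A minimal sketch of the kind of comparison described in the first bullet, written here in PyTorch with torchvision for MNIST. This is not the paper's original setup: the hyperparameters (Adam at lr=1e-3, SGD at lr=0.01 with momentum 0.9), the batch size, and the single-epoch loop are illustrative assumptions.

<syntaxhighlight lang="python">
# Illustrative comparison of Adam vs. SGD with momentum on MNIST
# logistic regression (a single linear layer trained with cross-entropy).
# Hyperparameters below are assumptions, not values from the paper.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

def train_one_epoch(optimizer_name):
    # Logistic regression: one linear map from 784 pixels to 10 class logits.
    model = nn.Linear(28 * 28, 10)
    if optimizer_name == "adam":
        opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    else:
        opt = torch.optim.SGD(model.parameters(), lr=1e-2, momentum=0.9)

    loader = DataLoader(
        datasets.MNIST("data", train=True, download=True,
                       transform=transforms.ToTensor()),
        batch_size=128, shuffle=True)

    loss_fn = nn.CrossEntropyLoss()
    total, count = 0.0, 0
    for images, labels in loader:
        logits = model(images.view(images.size(0), -1))
        loss = loss_fn(logits, labels)
        opt.zero_grad()
        loss.backward()
        opt.step()
        total += loss.item()
        count += 1
    return total / count  # mean training cost over the epoch

if __name__ == "__main__":
    for name in ("adam", "sgd_momentum"):
        print(name, train_one_epoch(name))
</syntaxhighlight>

Plotting the per-iteration training cost from such a run, rather than only the epoch average, is what produces the convergence curves the comparisons above refer to.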