- Logistic regression on MNIST: Adam converged faster than SGD with momentum, AdaGrad, and RMSProp (see the sketch after this list).
- Multi-layer neural networks on MNIST: Adam achieved the lowest training cost, with convergence speed comparable to or better than competing methods.
- Convolutional neural networks on CIFAR-10: Adam performed comparably to SGD with carefully tuned momentum and learning rate schedules.
- Variational autoencoders (VAEs): Adam was used successfully to optimize the variational lower bound, demonstrating its applicability to generative models.
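
A minimal sketch of the kind of comparison described in the first bullet, written here in PyTorch with torchvision for MNIST. This is not the paper's original setup: the hyperparameters (Adam at lr=1e-3, SGD at lr=0.01 with momentum 0.9), the batch size, and the single-epoch loop are illustrative assumptions.

<syntaxhighlight lang="python">
# Illustrative comparison of Adam vs. SGD with momentum on MNIST
# logistic regression (a single linear layer trained with cross-entropy).
# Hyperparameters below are assumptions, not values from the paper.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

def train_one_epoch(optimizer_name):
    # Logistic regression: one linear map from 784 pixels to 10 class logits.
    model = nn.Linear(28 * 28, 10)
    if optimizer_name == "adam":
        opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    else:
        opt = torch.optim.SGD(model.parameters(), lr=1e-2, momentum=0.9)

    loader = DataLoader(
        datasets.MNIST("data", train=True, download=True,
                       transform=transforms.ToTensor()),
        batch_size=128, shuffle=True)

    loss_fn = nn.CrossEntropyLoss()
    total, count = 0.0, 0
    for images, labels in loader:
        logits = model(images.view(images.size(0), -1))
        loss = loss_fn(logits, labels)
        opt.zero_grad()
        loss.backward()
        opt.step()
        total += loss.item()
        count += 1
    return total / count  # mean training cost over the epoch

if __name__ == "__main__":
    for name in ("adam", "sgd_momentum"):
        print(name, train_one_epoch(name))
</syntaxhighlight>

Plotting the per-iteration training cost from such a run, rather than only the epoch average, is what produces the convergence curves the comparisons above refer to.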