where $ \alpha $ is the step size (learning rate) and $ \epsilon $ is a small constant for numerical stability.
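As a concrete illustration, the full Adam update this sentence refers to can be sketched in NumPy. This is a minimal, stateless version of the algorithm from the paper, using its suggested defaults ($ \alpha = 0.001 $, $ \beta_1 = 0.9 $, $ \beta_2 = 0.999 $, $ \epsilon = 10^{-8} $); the function and variable names here are illustrative, not part of any library API.

```python
import numpy as np

def adam_step(theta, grad, m, v, t,
              alpha=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for parameters `theta` given gradient `grad`.

    m, v are the running first/second moment estimates; t is the
    1-based timestep used for bias correction.
    """
    m = beta1 * m + (1 - beta1) * grad        # update biased first moment
    v = beta2 * v + (1 - beta2) * grad ** 2   # update biased second moment
    m_hat = m / (1 - beta1 ** t)              # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)              # bias-corrected second moment
    # alpha is the step size; eps keeps the division numerically stable
    theta = theta - alpha * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v
```

For example, minimizing $ f(\theta) = \theta^2 $ (gradient $ 2\theta $) by calling `adam_step` in a loop drives `theta` toward zero, with the per-step movement bounded by roughly `alpha` regardless of the gradient's raw magnitude.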