
For a convex function with Lipschitz-continuous gradients (constant $$ L $$), gradient descent with a fixed learning rate $\eta \leq 1/L$ converges at a rate of $$ O(1/t) $$: the gap between the current function value and the minimum value shrinks in proportion to $1/t$, where $t$ is the number of iterations. If the function is additionally strongly convex with parameter $\mu > 0$, convergence accelerates to a linear (exponential) rate:
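
One standard way to write these two bounds is sketched here. It assumes the step size $\eta = 1/L$, and the symbols $x_t$ (the iterate after $t$ steps), $x_0$ (the starting point), and $x^*$ (a minimizer of $f$) are notation introduced for illustration only. In the convex case:

$$
f(x_t) - f(x^*) \leq \frac{L \, \lVert x_0 - x^* \rVert^2}{2t}
$$

In the strongly convex case:

$$
f(x_t) - f(x^*) \leq \left(1 - \frac{\mu}{L}\right)^{t} \bigl(f(x_0) - f(x^*)\bigr)
$$

Since $0 < \mu \leq L$, the factor $1 - \mu/L$ lies in $[0, 1)$, so the error shrinks by at least a constant fraction per iteration. For example, with $\mu/L = 0.1$ the bound falls by a factor of roughly $0.9^{22} \approx 0.1$ every 22 iterations, whereas the $O(1/t)$ bound needs ten times as many iterations for the same tenfold reduction.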