Translations:Gradient Descent/20/en

    For a convex function whose gradient is Lipschitz continuous with constant $ L $, gradient descent with a fixed learning rate $ \eta \leq 1/L $ converges sublinearly: after $ t $ iterations, the gap between the objective value and its minimum is $ O(1/t) $. If the function is additionally strongly convex with parameter $ \mu > 0 $, convergence accelerates to a linear (exponential) rate:
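As a minimal numerical sketch of the strongly convex case (not taken from the source), the code below runs fixed-step gradient descent on a hypothetical quadratic $ f(x) = \tfrac{1}{2}(\mu x_1^2 + L x_2^2) $, whose minimum value is $ 0 $. It assumes the step size $ \eta = 1/L $ and checks one standard form of the linear-rate bound, namely that the gap to the minimum after $ t $ steps is at most $ (1 - \mu/L)^t $ times the initial gap. The function name `gradient_descent_gaps`, the test function, and the constants are illustrative assumptions.

```python
def gradient_descent_gaps(mu=1.0, L=10.0, x0=(1.0, 1.0), steps=50):
    """Run fixed-step gradient descent on f(x) = 0.5 * (mu*x1^2 + L*x2^2).

    Returns the list of suboptimality gaps f(x_t) - f*, where f* = 0.
    The quadratic and its constants are illustrative assumptions.
    """
    eta = 1.0 / L  # fixed learning rate at the upper limit eta <= 1/L
    x1, x2 = x0
    gaps = []
    for _ in range(steps + 1):
        # Record the current gap to the minimum value (which is 0 here).
        gaps.append(0.5 * (mu * x1 ** 2 + L * x2 ** 2))
        # The gradient of the quadratic is (mu * x1, L * x2).
        x1 -= eta * mu * x1
        x2 -= eta * L * x2
    return gaps


mu, L = 1.0, 10.0
gaps = gradient_descent_gaps(mu, L)
rate = 1.0 - mu / L  # per-step contraction factor in the linear-rate bound

# Check f(x_t) - f* <= (1 - mu/L)^t * (f(x_0) - f*) at every iteration;
# the small tolerance only guards against floating-point rounding.
bound_holds = all(
    gap <= rate ** t * gaps[0] + 1e-12 for t, gap in enumerate(gaps)
)
print(bound_holds)
```

On this example the gap falls geometrically, so plotting `gaps` on a logarithmic vertical axis would give a roughly straight line, which is why the rate is called linear. Choosing $ \mu $ much smaller than $ L $ pushes the factor $ 1 - \mu/L $ toward 1 and slows the decay.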