
The gradient of the regularization term is $ \lambda \theta $, so a gradient-descent step takes the form $ \theta \leftarrow (1 - \eta \lambda)\theta - \eta \nabla_\theta L(\theta) $, where $ \eta $ is the learning rate: each weight is multiplicatively shrunk toward zero by the factor $ (1 - \eta \lambda) $ at every update, hence the name weight decay. The hyperparameter $ \lambda $ controls the regularization strength.
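
A minimal sketch of this update, assuming an L2 penalty of $ \frac{\lambda}{2} \lVert \theta \rVert^2 $ added to the loss (which yields the gradient $ \lambda \theta $ stated above); the function name, learning rate, and $ \lambda $ value are illustrative choices, not taken from the original text:

```python
import numpy as np

def sgd_step_with_weight_decay(theta, grad_loss, lr=0.01, lam=1e-4):
    """One gradient-descent step on loss + (lam/2) * ||theta||^2.

    The penalty contributes lam * theta to the gradient, so the update
    factors into a multiplicative shrink by (1 - lr * lam) followed by
    the usual step along the loss gradient.
    """
    return (1.0 - lr * lam) * theta - lr * grad_loss

# Illustrative usage: with a zero loss gradient, decay alone shrinks
# every weight geometrically by (1 - 0.1 * 0.5) = 0.95 per step.
theta = np.array([1.0, -2.0, 0.5])
zero_grad = np.zeros_like(theta)
for _ in range(3):
    theta = sgd_step_with_weight_decay(theta, zero_grad, lr=0.1, lam=0.5)
print(theta)  # [0.857375, -1.71475, 0.4286875]
```

Note that the shrink factor depends on both $ \eta $ and $ \lambda $, which is why the effective amount of decay changes whenever the learning rate is rescheduled during training.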