Translations:Overfitting and Regularization/14/en

    The gradient of the regularization term is <math>\lambda \theta</math>, so each weight is multiplicatively shrunk toward zero at every update — hence the name '''{{Term|weight decay}}'''. The {{Term|hyperparameter}} <math>\lambda</math> controls the regularization strength.

