The gradient of the regularization term is <math>\lambda \theta</math>, so each weight is multiplicatively shrunk toward zero at every update — hence the name '''{{Term|weight decay}}'''. The {{Term|hyperparameter}} <math>\lambda</math> controls the regularization strength.
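To see why the shrinkage is multiplicative, it helps to write out one gradient-descent step with learning rate <math>\eta</math>; this is a sketch assuming the penalty has the common form <math>\tfrac{\lambda}{2}\lVert\theta\rVert^2</math>, which is consistent with the stated gradient <math>\lambda \theta</math>:

<math>\theta \leftarrow \theta - \eta\bigl(\nabla_\theta L(\theta) + \lambda\theta\bigr) = (1 - \eta\lambda)\,\theta - \eta\,\nabla_\theta L(\theta).</math>

Under these assumptions, every update first rescales the weights by the constant factor <math>(1 - \eta\lambda) < 1</math> and only then applies the data-driven gradient, which is exactly the decay the name describes.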