Translations:Overfitting and Regularization/14/en: Difference between revisions

    The gradient of the regularization term is <math>\lambda \theta</math>, so each weight is multiplicatively shrunk toward zero at every update — hence the name '''{{Term|weight decay}}'''. The {{Term|hyperparameter}} <math>\lambda</math> controls the regularization strength.
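The multiplicative shrinkage can be seen directly in the gradient-descent update <math>\theta \leftarrow \theta - \eta(\nabla L + \lambda\theta) = (1 - \eta\lambda)\theta - \eta\nabla L</math>. A minimal NumPy sketch of one such step (the learning rate and <math>\lambda</math> values are illustrative, not from the original text):

```python
import numpy as np

def sgd_step_with_weight_decay(theta, grad_loss, lr=0.1, lam=0.01):
    """One SGD step with L2 weight decay.

    The regularization term lambda/2 * ||theta||^2 contributes the
    gradient lam * theta, so the update is
        theta <- theta - lr * (grad_loss + lam * theta)
                = (1 - lr * lam) * theta - lr * grad_loss,
    i.e. every step scales theta by the factor (1 - lr * lam).
    """
    return theta - lr * (grad_loss + lam * theta)

theta = np.array([1.0, -2.0])
# With a zero loss gradient, the step reduces to pure decay:
theta_next = sgd_step_with_weight_decay(theta, np.zeros_like(theta))
# theta_next == (1 - 0.1 * 0.01) * theta == 0.999 * theta
```

Isolating the loss gradient to zero makes the decay factor <math>1 - \eta\lambda</math> visible: each parameter is scaled by 0.999 per step, which is exactly the "multiplicative shrink toward zero" the sentence describes.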

    Revision as of 19:42, 27 April 2026

