Translations:Batch Normalization Accelerating Deep Network Training/15/en: Difference between revisions
Latest revision as of 21:40, 27 April 2026
During training, the mean and variance are computed per mini-batch. During inference, these batch statistics are replaced with population statistics (running averages accumulated during training), so that the output for a given sample is deterministic and does not depend on the other samples in the batch.
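
The difference between the two modes can be illustrated with a minimal pure-Python sketch for a single scalar feature. The class name `BatchNorm1D`, the `momentum` decay of 0.9, the `eps` value, and the exponential-moving-average update rule are assumptions chosen for illustration; they are not taken from the text above, and real frameworks differ in such details.

```python
import math


class BatchNorm1D:
    """Batch normalization for one scalar feature (illustrative sketch only)."""

    def __init__(self, momentum=0.9, eps=1e-5):
        self.momentum = momentum   # assumed decay rate for the running averages
        self.eps = eps             # small constant for numerical stability
        self.gamma = 1.0           # learned scale (kept fixed in this sketch)
        self.beta = 0.0            # learned shift (kept fixed in this sketch)
        self.running_mean = 0.0    # population-mean estimate
        self.running_var = 1.0     # population-variance estimate

    def forward(self, xs, training):
        if training:
            # Statistics come from the current mini-batch.
            mean = sum(xs) / len(xs)
            var = sum((x - mean) ** 2 for x in xs) / len(xs)
            # Accumulate running averages for later use at inference.
            m = self.momentum
            self.running_mean = m * self.running_mean + (1 - m) * mean
            self.running_var = m * self.running_var + (1 - m) * var
        else:
            # Statistics are fixed, so each output depends only on its own input.
            mean, var = self.running_mean, self.running_var
        scale = self.gamma / math.sqrt(var + self.eps)
        return [scale * (x - mean) + self.beta for x in xs]


bn = BatchNorm1D()
for _ in range(200):
    bn.forward([1.0, 2.0, 3.0, 4.0], training=True)

# At inference, the same sample gives the same output regardless of its batch.
alone = bn.forward([2.0], training=False)[0]
in_batch = bn.forward([2.0, 100.0, -50.0], training=False)[0]
print(alone == in_batch)
```

In training mode the same input value would generally be normalized differently depending on which other samples share its mini-batch, which is why inference switches to the accumulated statistics.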