BatchNorm is typically applied before the activation function (as in the original paper), though some practitioners place it after the activation. For convolutional layers, normalization is performed per-channel across the spatial dimensions and the batch dimension.
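The per-channel normalization for convolutional inputs can be sketched as follows; this is an illustrative NumPy implementation (the function name `batchnorm_conv` and the NCHW layout are assumptions for the example), where statistics are computed over the batch and spatial axes so each channel gets one mean and one variance:

```python
import numpy as np

def batchnorm_conv(x, gamma, beta, eps=1e-5):
    """Per-channel batch normalization for a conv feature map.

    x: array of shape (N, C, H, W); gamma, beta: shape (C,).
    Statistics are taken over the batch (N) and spatial (H, W) axes,
    yielding one mean/variance per channel.
    """
    mean = x.mean(axis=(0, 2, 3), keepdims=True)  # shape (1, C, 1, 1)
    var = x.var(axis=(0, 2, 3), keepdims=True)    # shape (1, C, 1, 1)
    x_hat = (x - mean) / np.sqrt(var + eps)       # normalized activations
    # Learnable scale and shift, broadcast per channel
    return gamma.reshape(1, -1, 1, 1) * x_hat + beta.reshape(1, -1, 1, 1)

rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=3.0, size=(8, 3, 4, 4))
y = batchnorm_conv(x, gamma=np.ones(3), beta=np.zeros(3))
```

With `gamma = 1` and `beta = 0`, each channel of the output has mean approximately 0 and variance approximately 1 across the batch and spatial positions.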