Translations:Batch Normalization Accelerating Deep Network Training/16/en
Batch normalization is typically applied immediately before the activation function, after the linear or convolutional transformation. When used with convolutional layers, the normalization is performed per feature map (channel) rather than per individual activation: the mean and variance for each feature map are computed jointly over the mini-batch and all spatial locations, so every location within a feature map is normalized with the same statistics.
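The per-channel scheme can be illustrated with a minimal NumPy sketch. It assumes a training-mode forward pass only, an `(N, C, H, W)` tensor layout, and ReLU as the activation. The function name `batch_norm_conv`, the `eps` value, and the example shapes are illustrative choices, not taken from the source.

```python
import numpy as np

def batch_norm_conv(x, gamma, beta, eps=1e-5):
    """Training-mode batch norm for a conv output x of shape (N, C, H, W).

    Hypothetical helper for illustration; layout and eps are assumptions.
    """
    # One mean and one variance per channel, pooled over the batch axis (0)
    # and both spatial axes (2, 3).
    mean = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)
    x_hat = (x - mean) / np.sqrt(var + eps)
    # gamma and beta hold one learned scale and shift per channel, shape (C,).
    return gamma.reshape(1, -1, 1, 1) * x_hat + beta.reshape(1, -1, 1, 1)

rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=2.0, size=(8, 4, 5, 5))  # stand-in for a conv output
gamma = np.ones(4)
beta = np.zeros(4)

y = batch_norm_conv(x, gamma, beta)  # normalize first...
z = np.maximum(y, 0.0)               # ...then apply the activation (ReLU here)
```

With `gamma` set to ones and `beta` to zeros, each channel of `y` has approximately zero mean and unit variance when measured over the batch and spatial axes together. Individual spatial positions are not normalized separately.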