Translations:Backpropagation/30/en
| Problem | Symptom | Common mitigations |
|---|---|---|
| Vanishing gradients | Early layers learn extremely slowly | ReLU activations, residual connections, batch normalisation, careful initialisation |
| Exploding gradients | Loss diverges or produces NaN values | gradient clipping, weight regularisation, lower learning rate |