Translations:Cross-Entropy Loss/20/en

    From Marovi AI

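    With a sigmoid output (or a softmax output in the multi-class case), the gradient of the cross-entropy loss with respect to the logit $ z $ reduces to $ \hat{y} - y $, the difference between the predicted probability and the target. This form is intuitive, because the update is proportional to the prediction error, and computationally efficient, because it needs nothing beyond the forward-pass output $ \hat{y} $.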
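
    The $ \hat{y} - y $ form can be checked numerically. The sketch below assumes the binary case, where $ \hat{y} $ is the sigmoid of $ z $ and the loss is binary cross-entropy; the function names are illustrative and not taken from the article. It compares the closed-form gradient with a central finite-difference estimate.

```python
import math


def sigmoid(z: float) -> float:
    """Map a logit z to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def bce_from_logit(z: float, y: float) -> float:
    """Binary cross-entropy for target y, computed from the logit z.

    Assumption: y_hat = sigmoid(z). This naive form is fine for moderate z
    but is not numerically stable for very large |z|.
    """
    y_hat = sigmoid(z)
    return -(y * math.log(y_hat) + (1.0 - y) * math.log(1.0 - y_hat))


def analytic_grad(z: float, y: float) -> float:
    """Closed-form gradient dL/dz = y_hat - y."""
    return sigmoid(z) - y


def numeric_grad(z: float, y: float, eps: float = 1e-6) -> float:
    """Central finite-difference estimate of dL/dz."""
    return (bce_from_logit(z + eps, y) - bce_from_logit(z - eps, y)) / (2.0 * eps)


if __name__ == "__main__":
    for z, y in [(0.7, 1.0), (-1.3, 0.0), (2.0, 1.0)]:
        print(z, y, analytic_grad(z, y), numeric_grad(z, y))
```

    For each pair the two values should agree to several decimal places, which is consistent with the closed form above. The softmax case is analogous, with $ \hat{y} $ and $ y $ read as vectors.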