The gradient with respect to the logit $ z $ takes the elegantly simple form $ \hat{y} - y $, which is both intuitive and computationally efficient.