Translations:Softmax Function/23/en

    From Marovi AI

    This is why binary classifiers typically use a single output neuron with a sigmoid activation rather than two neurons with softmax — they are mathematically equivalent.