Translations:Softmax Function/31/en: Difference between revisions
Revision as of 19:42, 27 April 2026
In practice, softmax and cross-entropy are computed jointly for numerical stability (the log-softmax formulation). At inference time, the argmax can be applied directly to the logits without computing softmax at all, because softmax is strictly increasing in each logit and therefore preserves their ordering.
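
As an illustration of the log-softmax formulation, here is a minimal sketch in plain Python. The function names (`log_softmax`, `cross_entropy_from_logits`, `predict`) and the example logits are assumptions chosen for illustration, not part of any particular library. The sketch subtracts the maximum logit before exponentiating, which is one common way to keep the computation stable, and it picks the predicted class straight from the logits.

```python
import math

def log_softmax(logits):
    # Shift by the maximum logit so the largest exponent is exp(0) = 1.
    # This leaves the result unchanged but avoids overflow.
    m = max(logits)
    shifted = [z - m for z in logits]
    log_norm = math.log(sum(math.exp(s) for s in shifted))
    return [s - log_norm for s in shifted]

def cross_entropy_from_logits(logits, target):
    # Cross-entropy for a single example whose true class index is `target`.
    # It reads the log-probability directly, so no probability is ever
    # formed and then passed through log().
    return -log_softmax(logits)[target]

def predict(logits):
    # Argmax over the raw logits; softmax would not change the winner.
    return max(range(len(logits)), key=lambda i: logits[i])

# Hypothetical logits large enough that math.exp(1000.0) alone would overflow.
logits = [1000.0, 1001.0, 999.0]
loss = cross_entropy_from_logits(logits, target=1)
label = predict(logits)
```

With these inputs the loss stays finite even though exponentiating the raw logits would raise an `OverflowError`, and `label` is the index of the largest logit.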