In softmax classifier, why use exp function to do normalization?
cella 2024
  

https://datascience.stackexchange.com/questions/23159/in-softmax-classifier-why-use-exp-function-to-do-normalization

The second answer by MachineLearner is great.

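Softmax has the same form as the Boltzmann distribution: each class probability is proportional to exp(z_i), just as each state probability is proportional to exp(-E_i / kT), with the logits z_i playing the role of negative energies at unit temperature.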
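
To make the Boltzmann connection concrete, here is a minimal sketch in plain Python, assuming the standard softmax definition. The function name `softmax` and the `temperature` parameter are illustrative choices, not taken from the linked answer.

```python
import math

def softmax(logits, temperature=1.0):
    """Return probabilities p_i proportional to exp(z_i / T)."""
    # Subtract the largest logit before exponentiating. This avoids overflow
    # and does not change the result, because the common factor cancels
    # when we divide by the sum.
    m = max(logits)
    weights = [math.exp((z - m) / temperature) for z in logits]
    total = sum(weights)
    return [w / total for w in weights]

# Logits may be negative; exp maps them to positive weights.
probs = softmax([2.0, 1.0, -1.0])

# A higher temperature flattens the distribution toward uniform,
# as it does for a Boltzmann distribution.
flat = softmax([2.0, 1.0, -1.0], temperature=100.0)
```

Because exp is positive and strictly increasing, every class gets a nonzero probability and the ranking of the logits is preserved. Adding the same constant to all logits leaves the output unchanged.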