Hi all,
To clarify, in ex12 we assume the cross-entropy loss, which already includes the softmax function.
That is, the loss you should implement is this:
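For reference, a standard way to write the softmax cross-entropy for a single example is sketched below. The notation is an assumption made here, not taken from the exercise sheet: z = (z_1, ..., z_K) denotes the K logits (pre-softmax outputs) and y is the index of the correct class.

```latex
\ell(z, y) \;=\; -\log \frac{\exp(z_y)}{\sum_{k=1}^{K} \exp(z_k)}
          \;=\; -z_y + \log \sum_{k=1}^{K} \exp(z_k)
```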
In this particular formulation, y is a scalar in {1, 2, ..., K}. However, it might be a bit easier to work with the one-hot representation of the labels. It's up to you.
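A minimal NumPy sketch of both label representations, in case it helps to see that they give the same value. The function names, the 0-based labels (subtract 1 from labels in {1, ..., K}), and averaging over the batch are all assumptions made for illustration, not requirements of the exercise.

```python
import numpy as np

def log_softmax(logits):
    # Subtracting the row-wise max leaves the result unchanged
    # but avoids overflow in exp().
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

def cross_entropy_scalar_labels(logits, y):
    # logits: (N, K) array; y: (N,) integer labels in {0, ..., K-1} (assumed 0-based).
    log_probs = log_softmax(logits)
    # Pick the log-probability of the correct class in each row.
    return -log_probs[np.arange(len(y)), y].mean()

def cross_entropy_one_hot(logits, Y):
    # logits: (N, K) array; Y: (N, K) one-hot label matrix.
    log_probs = log_softmax(logits)
    return -(Y * log_probs).sum(axis=1).mean()

def to_one_hot(y, K):
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(K)[y]

logits = np.array([[1.0, 2.0, 3.0], [0.0, 0.0, 0.0]])
y = np.array([2, 0])
loss_scalar = cross_entropy_scalar_labels(logits, y)
loss_one_hot = cross_entropy_one_hot(logits, to_one_hot(y, 3))
```

Both calls compute the same quantity: with one-hot labels, the sum over classes keeps only the term for the correct class, which is exactly what the indexing does in the scalar version.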
Best wishes,
Maksym