"""Computes the gradients for softmax.
Args:
  op: The softmax Operation that we are differentiating.
  grad: Gradient with respect to the output of the softmax op.
Returns:
  Gradients with respect to the input of softmax.
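
Example:
  A minimal pure-Python sketch of the standard softmax vector-Jacobian
  product for a single 1-D vector, included only to illustrate what
  "gradients with respect to the input" means here. The helper names
  `softmax` and `softmax_grad` are hypothetical and are not part of this
  module; the real op works on tensors and may be implemented differently.

```python
import math


def softmax(logits):
    # Subtract the max before exponentiating for numerical stability.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_grad(softmax_out, grad):
    # Hypothetical helper (assumption, not this module's API).
    # Given y = softmax(x) and grad = dL/dy, returns dL/dx using
    # dL/dx_j = (grad_j - sum_i(grad_i * y_i)) * y_j.
    dot = sum(g * y for g, y in zip(grad, softmax_out))
    return [(g - dot) * y for g, y in zip(grad, softmax_out)]


y = softmax([1.0, 2.0, 3.0])
dx = softmax_grad(y, [0.1, -0.2, 0.3])
```

  Because the softmax outputs always sum to one, the entries of `dx` sum
  to zero (up to floating-point rounding).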