Hello Lance,
Thank you for your question.
That is indeed a nasty function to differentiate. I started working it out, but it would take quite a bit of time to do this completely. Also, the result will depend on the expressions for p(x) and q(x), and on their derivatives.
I would advise you to start by working out the gradient for this function:
f(W) = ( sqrt(T(W'x)) - sqrt(1-T(W'x)) )^2
Indeed, (1) is a linear combination of such functions, so getting the full gradient from that of f is easy.
It's not too difficult to get the differential of f:
Df(W)[Wdot] = \frac{2T-1}{sqrt(T*(1-T))} * DT(W'x)[Wdot'x],
where T = T(W'x) for short, and DT(W'x) is the differential of T at the point W'x, applied here to the direction Wdot'x.
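In case it helps, here is how the scalar factor comes out. This is only a sketch of the chain rule step, assuming 0 < T < 1 so that the square roots are differentiable:

```latex
f = \left(\sqrt{T} - \sqrt{1-T}\right)^2 = 1 - 2\sqrt{T(1-T)}
\frac{df}{dT} = -2 \cdot \frac{1 - 2T}{2\sqrt{T(1-T)}} = \frac{2T-1}{\sqrt{T(1-T)}}
Df(W)[\dot W] = \frac{df}{dT} \cdot DT(W'x)[\dot W' x]
```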
Now, to get DT, you'll need to look into p, q, Dp and Dq.
Once you get that, you want to figure out the adjoint of DT (it's a linear operator). Call the adjoint DT*. Then, the gradient is:
nabla f(W) = \frac{2T-1}{sqrt(T*(1-T))} x * (DT*(W'x)[1])'
Once you have the expression for DT, use the definition to get DT*.
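To check the formula numerically before plugging in your own T, here is a minimal sketch in plain Python. Since I don't have your p and q, it uses a hypothetical stand-in T(y) = sigmoid(a'y), with a made-up vector a; this is an assumption for illustration only, not your actual T. For this stand-in, DT(y)[ydot] = s(1-s) a'ydot, so the definition of the adjoint, <DT(y)[ydot], 1> = <ydot, DT*(y)[1]>, gives DT*(y)[1] = s(1-s) a. The code compares the gradient formula against central finite differences.

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def T(y, a):
    # Hypothetical stand-in for T (assumption): T(y) = sigmoid(<a, y>).
    return sigmoid(sum(ai * yi for ai, yi in zip(a, y)))

def adjoint_DT_of_1(y, a):
    # For the stand-in, DT(y)[ydot] = s(1-s) <a, ydot>, hence DT*(y)[1] = s(1-s) a.
    s = T(y, a)
    return [s * (1.0 - s) * ai for ai in a]

def Wt_x(W, x):
    # W is n-by-k (list of rows), x has length n; returns W'x (length k).
    n, k = len(W), len(W[0])
    return [sum(W[i][j] * x[i] for i in range(n)) for j in range(k)]

def f(W, x, a):
    t = T(Wt_x(W, x), a)
    return (math.sqrt(t) - math.sqrt(1.0 - t)) ** 2

def grad_f(W, x, a):
    # nabla f(W) = (2T-1)/sqrt(T(1-T)) * x * (DT*(W'x)[1])'
    y = Wt_x(W, x)
    t = T(y, a)
    c = (2.0 * t - 1.0) / math.sqrt(t * (1.0 - t))
    g = adjoint_DT_of_1(y, a)
    return [[c * xi * gj for gj in g] for xi in x]

def fd_grad(W, x, a, h=1e-6):
    # Central finite differences, one entry of W at a time.
    n, k = len(W), len(W[0])
    G = [[0.0] * k for _ in range(n)]
    for i in range(n):
        for j in range(k):
            Wp = [row[:] for row in W]
            Wm = [row[:] for row in W]
            Wp[i][j] += h
            Wm[i][j] -= h
            G[i][j] = (f(Wp, x, a) - f(Wm, x, a)) / (2.0 * h)
    return G

# Arbitrary test data (made up for the check).
x = [0.5, -1.0, 2.0]
a = [0.3, -0.7]
W = [[0.2, -0.1], [0.4, 0.3], [-0.5, 0.6]]

G = grad_f(W, x, a)
G_fd = fd_grad(W, x, a)
max_err = max(abs(G[i][j] - G_fd[i][j]) for i in range(3) for j in range(2))
print("max abs difference:", max_err)
```

Once you have your own DT*, you would replace T and adjoint_DT_of_1 with your expressions and rerun the same comparison; a small difference suggests the gradient is consistent with f.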
I'm sorry if this looks just as complicated to finish as it did at first.