Hi Mike,
To expand to what Sophie wrote:
Observation 1: that's exactly right.
That's more elegant implementation-wise, so the spec could be re-written in those terms, although I won't comment on which formula is more or less elegant mathematically.
Observation 2: I got the same result, using the fact that c has \tau coefficients in {-1, 1} and that t0 has coefficients <= 2^{d-1}.
So |ct0| is bounded by 2^{d-1} \tau, which is smaller than gamma2 for Dilithium-3 and 5.
It could be good indeed to adjust the spec to remove this check for the parameter sets where it doesn't matter. Although having the algorithm change based on the parameters may lead to confusion and implementation mistakes (e.g. if the check is removed when it shouldn't be) - hence the need for good test vectors.
At least, the spec could comment that this check is unnecessary for the two larger parameter sets, and that implementations are allowed to remove it as an optimization.
We haven't implemented test vectors for Dilithium-2 so far, so my estimation for the rejection sampling rate of 2^-33 (per coefficient) is again based on the fact that each coefficient of ct0 is a sum of \tau values each (approximately) uniformly distributed in [-4095, 4096].
Approximating this sum by a normal distribution (central limit theorem) gave a probability of 2^-33 to exceed gamma2 (which looks quite a bit larger than 7.12e-12), but I haven't confirmed that in practice.
I don't have much to say about observation 3, but perhaps 3b isn't ideal as calculating UseHint(h, r + ct0) == w1 is more computation for the signer than just |ct0|.
Best,
Guillaume