Chris, it's an interesting question, but I don't think it's really an issue: (I'm pretty sure anyway)
Here's why:
say if y is a vector (let's say of length v), then you'd simply be solving
Transpose(X) X Theta = Transpose(X) Y
where Y would be the _matrix_ of y's (so, it's a matrix of (is n the number of samples? that's how I'm going to use it anyway) n x v)
This is really no different from solving it when v happens to be 1. It's just now your Theta is going to be a matrix of size ((m+1), v)
You'd solve it the same way...
(Transpose(X) X)^-1 Transpose(X) Y = Theta again, and now,
your theta would be matrix of size ((m+1), v)..
I'll try to code up a demo in Octave.
Binesh