On Fri, Oct 5, 2012 at 5:33 PM, Jeremiah Lowin <
jlo...@gmail.com> wrote:
> There are many opportunities to optimize the new tensordot function by
> skipping operations that are unnecessary or redundant, like if the input is
> already a matrix (no reshape) or already has its dimensions in the required
> order (no transpose). Relatedly, there are a number of cases where I could
> choose to use vector/matrix multiplication or matrix/matrix multiplication.
> Is there a good reason to prefer one over the other? I believe Theano
> has an optimization for matrix/matrix products, but I don't know about
> matrix/vector.
>
> As an example of when this choice comes up, if all the axes of an input
> array are being summed over, then its reshaped form could be either a vector
> or a row/column matrix (depending on whether it's the first or second
> input). Alternatively, if a vector is passed in as an input, it can either
> be left as a vector or reshaped to a row/column matrix. In the first
> example, reshaping must take place and I just need to decide whether to
> reshape it as (N) or (1xN). In the second example, the reshape is optional
> and probably only worth it if there is a significant speedup from some
> Theano-specific optimization. Basically this is a question of reshape
> overhead vs optimization speedup.
>
> Right now, when reshaping is required I use row/column matrices instead of
> vectors; when reshaping is optional, I don't do it at all. I'm curious if
> there's a better option?
>
> Thanks!
>
If you implement tensordot in terms of reshapes, transposes, and dot,
then several optimizations should already kick in. Do they? Time has
already been spent on optimizing such subgraphs, but
I'm sure there are still badly-handled cases. I'd recommend holding
off on such optimizations for this PR, and then putting together some
badly-handled cases to motivate a new PR for any missing
optimizations. Theano doesn't currently have support for
profile-driven optimizations, so sometimes we just have to make a
choice without knowing all the information we'd like.
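For concreteness, here is a rough sketch of what "tensordot in terms of
reshapes, transposes, and dot" could look like. It uses NumPy rather than
Theano so it runs on its own, and the helper name `tensordot_via_dot` and its
argument names are made up for illustration; this is not the code in the PR.

```python
import numpy as np

def tensordot_via_dot(a, b, axes_a, axes_b):
    """Hypothetical helper: contract axes_a of `a` with axes_b of `b`."""
    axes_a, axes_b = list(axes_a), list(axes_b)
    # Axes that are kept (not summed over), in their original order.
    free_a = [i for i in range(a.ndim) if i not in axes_a]
    free_b = [i for i in range(b.ndim) if i not in axes_b]
    # Move the summed axes to the end of `a` and to the front of `b`.
    a_t = a.transpose(free_a + axes_a)
    b_t = b.transpose(axes_b + free_b)
    free_shape_a = [a.shape[i] for i in free_a]
    free_shape_b = [b.shape[i] for i in free_b]
    # Total number of elements being summed over (1 if no axes are summed).
    n_sum = int(np.prod([a.shape[i] for i in axes_a]))
    # Flatten both operands to 2-D so one matrix product does the contraction.
    a_2d = a_t.reshape(-1, n_sum)
    b_2d = b_t.reshape(n_sum, -1)
    return np.dot(a_2d, b_2d).reshape(free_shape_a + free_shape_b)

a = np.arange(24.0).reshape(2, 3, 4)
b = np.arange(60.0).reshape(4, 3, 5)
out = tensordot_via_dot(a, b, [1, 2], [1, 0])
```

When every axis of `a` is summed over, `a_2d` in this sketch comes out as a
1xN row matrix, which is the "reshape as (1xN)" choice from the question.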
I don't think it will make any difference whether you make your
vectors rows or columns; they should be put into the same canonical
form during the optimization pass regardless.
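As a small sanity check on the numerical side only, the sketch below (NumPy
again, standing in for Theano) shows that leaving an operand as a vector,
reshaping it to a column, or reshaping it to a row all give the same product.
It says nothing about what Theano's optimizer does with each graph; the
variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((3, 4))
v = rng.standard_normal(4)

# Operand left as a plain vector of shape (4,).
as_vector = np.dot(M, v)
# Operand reshaped to a 4x1 column matrix; drop the unit axis afterwards.
as_column = np.dot(M, v.reshape(4, 1))[:, 0]
# Operand reshaped to a 1x4 row matrix, multiplying from the left.
as_row = np.dot(v.reshape(1, 4), M.T)[0]

same = np.allclose(as_vector, as_column) and np.allclose(as_vector, as_row)
```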
HTH,
- James