On Thu, May 17, 2018 at 10:55 AM 'Bjarke Roune' via XLA development <
xla...@googlegroups.com> wrote:
> I'll give the XLA-level answer. If the weights are passed in as
> parameters, then the compiler does not know what the weights are, only
> their shape (i.e. the type of the array including array bounds). When you
> run the XLA module (i.e. program), the parameters are passed in at runtime
> and they will be the values of the Parameter nodes in the graph. The
> parameters need to already be on the device, or otherwise they would need
> to be transferred to the device first. If you want to transfer data in and
> out of the program while it runs, you'd need support for infeed and outfeed
> nodes. The only drawback here is that the compiler does not know the exact
> values of the weights, though for large weight arrays this has so far for
> us not been useful information to have in the compiler anyway.
On the CPU backend we will sometimes flip the layout of large weight
matrices if we think that can make some operations faster (see
ShouldMakeOperandColumnMajor in cpu_layout_assignment.cc). We can't do
this for parameters since, from XLA's perspective, parameters have a
fixed layout. I'm working on some more optimizations (currently targeting
only the CPU backend, though in principle that restriction can be lifted)
that will be more effective with the weights expressed as constants in
XLA IR.
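To make the layout point concrete, here is a rough plain-Python analogy. It
is not XLA code: compile_with_parameter, compile_with_constant and
to_column_major are hypothetical names made up for this sketch, and a flat
list stands in for a device buffer. The only point is that a constant can be
re-laid out once at "compile" time, while a parameter has to be read in
whatever layout the caller supplies.

```python
def to_column_major(w_row_major, rows, cols):
    """Re-lay out a flat row-major matrix as a flat column-major one."""
    return [w_row_major[r * cols + c] for c in range(cols) for r in range(rows)]


def compile_with_parameter(rows, cols):
    """Weights arrive at run time: only the shape is known up front,
    and the caller's row-major layout has to be taken as given."""
    def run(x, w_row_major):
        assert len(x) == rows and len(w_row_major) == rows * cols
        # The inner loop over r walks the weights with stride `cols`.
        return [sum(x[r] * w_row_major[r * cols + c] for r in range(rows))
                for c in range(cols)]
    return run


def compile_with_constant(w_row_major, rows, cols):
    """Weights are known at compile time, so the layout can be chosen once."""
    w_col_major = to_column_major(w_row_major, rows, cols)  # "compile"-time work

    def run(x):
        assert len(x) == rows
        # The inner loop over r now walks the weights contiguously (stride 1).
        return [sum(x[r] * w_col_major[c * rows + r] for r in range(rows))
                for c in range(cols)]
    return run


# 2x3 weight matrix in row-major order, multiplied as y = x . W
w = [1.0, 2.0, 3.0,
     4.0, 5.0, 6.0]
x = [10.0, 1.0]

by_param = compile_with_parameter(2, 3)
by_const = compile_with_constant(w, 2, 3)
assert by_param(x, w) == by_const(x) == [14.0, 25.0, 36.0]
```

Both versions compute the same result; the difference is only in who gets to
decide how the weights are stored.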
LLVM does have trouble with large constants, but this problem is already
addressed for the XLA JIT: when JITting, we don't emit large constants in
the XLA IR as LLVM constants but instead use an "external constant pool"
for them. I also have a better fix on the horizon: LLVM's
ConstantDataArray is much better at handling large constants, and I plan
to move both the JIT and the AOT backends to use it in the near future.
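For anyone unfamiliar with the constant-pool idea, a toy sketch in plain
Python follows. It does not reflect the actual XLA or LLVM code: the
ConstantPool class, emit_constant, the INLINE_LIMIT cutoff and the IR text
it prints are all invented for illustration. The idea is just that small
constants are written inline into the generated IR, while large ones live in
a side table and the IR refers to them by name.

```python
from array import array

INLINE_LIMIT = 4  # hypothetical cutoff, in elements


class ConstantPool:
    """Side table that owns large constant buffers, keyed by symbol name."""

    def __init__(self):
        self._buffers = {}

    def insert(self, name, values):
        self._buffers[name] = array("f", values)

    def find(self, name):
        # Returns None when no buffer was registered under this name.
        return self._buffers.get(name)


def emit_constant(name, values, pool):
    """Return one line of toy IR text for a constant."""
    if len(values) <= INLINE_LIMIT:
        # Small constant: spell out every element in the IR itself.
        literal = ", ".join(repr(float(v)) for v in values)
        return f"@{name} = constant [{literal}]"
    # Large constant: keep the data out of the IR and emit only a reference.
    pool.insert(name, values)
    return f"@{name} = external global [{len(values)} x float]"


pool = ConstantPool()
small_ir = emit_constant("bias", [0.5, 1.5], pool)
large_ir = emit_constant("weights", range(1000), pool)

assert small_ir == "@bias = constant [0.5, 1.5]"
assert large_ir == "@weights = external global [1000 x float]"
assert pool.find("bias") is None
assert len(pool.find("weights")) == 1000
```

The generated code then resolves the name to the pooled buffer at load time,
so the IR never has to carry the large literal.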
-- Sanjoy