This sounds right.
I think people often make too big a deal about language semantics in
these cases; GPU support can be, will be, or already has been made
available in MATLAB, Python, etc. People seem content to write
begin_GPU ... end and get whatever semantics are necessary within
that block, for example.
I am hoping that the type infrastructure and metaprogramming will be
enough, and that from there a compiler for a GPU sublanguage can be
built.
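The begin_GPU ... end idea can be pictured as block-scoped dispatch:
inside the block, operations are routed to a device backend. A toy
Python sketch (every name here is invented for illustration; the
"backend" just records where each operation ran, nothing touches a
real GPU):

```python
# Toy sketch of block-scoped "begin_GPU ... end" semantics.
# All names are hypothetical; the backend only records dispatch.

executed_on = []  # which backend handled each operation

class GPUBlock:
    """Context manager that switches the active backend for its block."""
    active = "cpu"  # the currently active backend

    def __enter__(self):
        self._saved = GPUBlock.active
        GPUBlock.active = "gpu"
        return self

    def __exit__(self, *exc):
        GPUBlock.active = self._saved  # restore on block exit
        return False

def vec_add(a, b):
    """An operation that dispatches on whichever backend is active."""
    executed_on.append(GPUBlock.active)
    return [x + y for x, y in zip(a, b)]

vec_add([1], [2])            # outside the block: "cpu"
with GPUBlock():             # the begin_GPU ... end analogue
    vec_add([1, 2], [3, 4])  # inside the block: "gpu"

print(executed_on)  # ['cpu', 'gpu']
```

The point is only that the user-visible semantics are unchanged;
which backend runs the code is decided by the enclosing block.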
On Wed, May 30, 2012 at 4:57 PM, Andrei de Araújo Formiga wrote:
> 2012/5/30 Krzysztof Kamieniecki <kr...@kamieniecki.com>:
>> I am currently walking a fine line between wanting to get the GPU
>> integrated and not wanting to rebuild LLVM in Julia (and also not
>> wanting to replicate what NVidia is doing with PTX generation in
>> LLVM). I've been busy with other things, but I have been thinking
>> about this, and I think I have finally figured out a simple/quick
>> way to get basic PTX code generated, which can be replaced when the
>> LLVM/PTX backend comes up to speed. Even when the GPU code is
>> generated, there is still the issue of deciding how, when, and how
>> much data is moved to the GPU and back, and when to execute the
>> kernels. I want that interface to match @parallel and DArray;
>> although I may change how it behaves with the GPU, I hope that
>> eventually the behavior will converge once everyone has code to
>> experiment with.
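The "how, when, how much data is moved" question is essentially a
cost trade-off: a transfer over the bus only pays off when the kernel
does enough arithmetic per byte moved. A toy cost model, with every
constant invented for illustration (a real runtime would measure
these per device):

```python
# Toy cost model for deciding whether to offload work to the GPU.
# All bandwidth/throughput constants are invented for illustration.

def should_offload(nbytes, flops,
                   pcie_bw=12e9,     # bus bandwidth, bytes/s (assumed)
                   cpu_flops=50e9,   # CPU throughput, flop/s (assumed)
                   gpu_flops=1e12):  # GPU throughput, flop/s (assumed)
    """Offload only if transfer + GPU compute beats CPU compute."""
    transfer = 2 * nbytes / pcie_bw      # copy to the device and back
    gpu_time = transfer + flops / gpu_flops
    cpu_time = flops / cpu_flops
    return gpu_time < cpu_time

# A tiny elementwise op: too little work to pay for the transfer.
print(should_offload(nbytes=8 * 1000, flops=1000))                # False
# A large matrix multiply: lots of arithmetic per byte moved.
print(should_offload(nbytes=8 * 3 * 2048**2, flops=2 * 2048**3))  # True
```

This is why elementwise operations tend to stay on the CPU unless
they can be fused, while dense linear algebra offloads well.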
> Based on my experience with GPUs, this will be hard to do if the goal
> is good performance, but you noted some of the issues in your message.
> I think a first step is to provide ways for Julia users to send data
> to the GPU if they need it, even if it means changing/restructuring
> the functions that will be executed on the GPU. That is, create a
> lower-level layer for people who want to use it, and then build the
> higher-level layers on top of it.
> Regarding OpenCL, as far as I know it is designed precisely to give
> the programmer control over the GPU; I don't know how OpenCL "takes
> out" the problematic elements of GPU programming. Programming OpenCL
> is similar to programming CUDA at the driver API level, which is
> considerably lower-level than CUDA C. The main differences are that
> OpenCL has vector programming support, and the Intel OpenCL SDK has
> an automatic vectoriser, though I don't think it does everything
> magically (I haven't used it, however).
> s, Andrei Formiga