--
You received this message because you are subscribed to the Google Groups "WebGL Dev List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to webgl-dev-lis...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
A quick answer (I'll try to test timings later):
1. A question for the Intel team: does ANGLE or the driver copy buffers (within the same shared CPU+GPU memory), or does it just pass a pointer to the buffer (zero copy), somewhere in
gl.bufferData(gl.ARRAY_BUFFER, data, gl.STATIC_DRAW);
2. Does "petamoriken/float16" use WASM, SIMD, or native functions, or is it pure JS?
I think TensorFlowJS may have a lot of f16 buffers. Is it possible to accelerate "+, -, x Const, convolution" operations on buffers (not necessarily in WebGL, sorry :)? E.g. should one generate the data in JS and then transfer it into a Uint16Array, or work directly on the buffers?
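For illustration, here is a minimal pure-JS sketch of the "generate data in JS, then transfer into a Uint16Array" route. It is not taken from petamoriken/float16 or TensorFlowJS; the names `floatToHalfBits`, `halfBitsToFloat` and `packHalf` are hypothetical, and the rounding mode (round to nearest, ties to even) is an assumption about what is wanted.

```javascript
// Scratch views used to read the raw bits of a float32 value.
const f32 = new Float32Array(1);
const u32 = new Uint32Array(f32.buffer);

// Convert a JS number to IEEE 754 binary16 bits (hypothetical helper).
function floatToHalfBits(value) {
  f32[0] = value;
  const x = u32[0];
  const sign = (x >>> 16) & 0x8000;
  const exp = (x >>> 23) & 0xff;
  const mant = x & 0x7fffff;

  if (exp === 0xff) return sign | 0x7c00 | (mant ? 0x200 : 0); // Inf or NaN
  const e = exp - 127 + 15;              // re-bias the exponent for binary16
  if (e >= 0x1f) return sign | 0x7c00;   // too large: overflow to Inf
  if (e <= 0) {
    if (e < -10) return sign;            // too small: signed zero
    const m = mant | 0x800000;           // restore the implicit leading 1
    const shift = 14 - e;                // 14..24
    let sub = m >>> shift;               // subnormal half mantissa
    const rem = m & ((1 << shift) - 1);
    const halfway = 1 << (shift - 1);
    if (rem > halfway || (rem === halfway && (sub & 1))) sub++;
    return sign | sub;
  }
  let half = (e << 10) | (mant >>> 13);
  const rem = mant & 0x1fff;             // the 13 bits that are dropped
  if (rem > 0x1000 || (rem === 0x1000 && (half & 1))) half++; // may carry into the exponent
  return sign | half;
}

// Convert binary16 bits back to a JS number (hypothetical helper).
function halfBitsToFloat(h) {
  const sign = (h & 0x8000) ? -1 : 1;
  const exp = (h >>> 10) & 0x1f;
  const mant = h & 0x3ff;
  if (exp === 0) return sign * mant * Math.pow(2, -24);       // zero or subnormal
  if (exp === 0x1f) return mant ? NaN : sign * Infinity;
  return sign * (1 + mant / 1024) * Math.pow(2, exp - 15);
}

// Pack an array of floats into a Uint16Array of half-float bits.
function packHalf(src) {
  const out = new Uint16Array(src.length);
  for (let i = 0; i < src.length; i++) out[i] = floatToHalfBits(src[i]);
  return out;
}
```

The Uint16Array returned by `packHalf` is an ordinary typed array, so it can be handed to a call such as the `gl.bufferData` one above; whether that upload is zero copy is exactly the open question in point 1.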
These are fairly vague questions (and may be a bit off the WebGL topic).
Evgeny
On Thursday, February 21, 2019 at 10:44:07 PM UTC+3, Kenneth Russell wrote:
> On Wed, Feb 20, 2019 at 11:46 PM Evgeny Demidov <demidov...@gmail.com> wrote:
>> There are NxNxN FLOP (mul + add) in matrix multiplication, while the data is only ~NxN, so very high GFLOPS are possible. We can exclude data preparation from the timing, and that is enough to demonstrate the advantage of compute shaders.
>> For the TensorFlowJS library, full performance is important. The performance of matrix-vector multiplication and convolution operations is limited by bandwidth (only NxN FLOP). Therefore:
>> 1. Can we use zero copy for SSBOs on embedded GPUs with common CPU+GPU memory (Intel, mobile)?
>> 2. Is it possible to make (temporarily) an ANGLEFloat16Array, a small WASM (native) function to convert JS float <-> Uint16Array, or something else?
> Does https://github.com/petamoriken/float16 do what you want?
> -Ken
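As a rough illustration of the bandwidth argument quoted above, the sketch below compares FLOP per byte for an N x N matrix-matrix product and a matrix-vector product. It assumes each operand is moved through memory exactly once, counts a multiply and an add as two FLOP, and uses 4-byte floats by default; the function name `arithmeticIntensity` is hypothetical.

```javascript
// FLOP per byte of memory traffic, under the idealized assumptions stated above.
function arithmeticIntensity(n, bytesPerElement = 4) {
  const matMatFlop = 2 * n * n * n;                      // N^3 multiply-add pairs
  const matMatBytes = 3 * n * n * bytesPerElement;       // read A and B, write C
  const matVecFlop = 2 * n * n;                          // N^2 multiply-add pairs
  const matVecBytes = (n * n + 2 * n) * bytesPerElement; // read A and x, write y
  return {
    matMat: matMatFlop / matMatBytes,
    matVec: matVecFlop / matVecBytes,
  };
}

const f32Case = arithmeticIntensity(1024);    // float32 data
const f16Case = arithmeticIntensity(1024, 2); // float16 data
console.log(f32Case, f16Case);
```

Under these assumptions the matrix-matrix ratio grows linearly with N, while the matrix-vector ratio stays below 0.5 FLOP per byte for float32, which is why halving the element size (f16) or avoiding a copy matters for the bandwidth-bound case.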
--
Sorry for the late response. There are some updates on https://bugs.chromium.org/p/angleproject/issues/detail?id=3160. Thanks.
Regards,
Jiajia
I never got an error greater than 0.000011.
I got an error of 1.4 in Shader2 and 0.7 in Shader3 with a GTX 780M (Kepler, same as the GT 710), then an error of 0.7 in Shader2 and 0.05 in Shader3 with a GTX 1080 Ti (Pascal). I checked both under the OpenGL backend. Sorry, I don't have any low-end environment. Is the error related to the GPU grade...?
I can also reproduce this issue on Intel Kaby Lake. I agree that ‘barrier(CLK_LOCAL_MEM_FENCE)’ is equivalent to ‘barrier()’, not ‘groupMemoryBarrier()’ ☺
>And ALL_BARRIER_BITS is not defined in WebGL2ComputeRenderingContext yet. (Is there any plan to implement it?)
A PR has been sent out. Thanks.
Regards,
Jiajia
From: webgl-d...@googlegroups.com [mailto:webgl-d...@googlegroups.com] On Behalf Of Kentaro Kawakatsu
Sent: Friday, March 1, 2019 1:22 PM
To: WebGL Dev List <webgl-d...@googlegroups.com>
--