Compute shaders


Evgeny

Jan 15, 2019, 8:39:05 AM
to WebGL Dev List

The "Boids simulation" in "WebGL 2.0 Compute shader Demos" at

https://github.com/9ballsyndrome/WebGL_Compute_shader

doesn't work on my Lenovo G505 with an AMD A6-5200 APU + Chrome Canary + Win 10 :(

I'll check it later.

I'd like to add more examples: 3D excitation waves, 3D Game of Life and 3D Diffusion Limited Aggregation

https://www.ibiblio.org/e-notes/webgl/gpu/contents.htm

2D demos (e.g. simple waves) could be useful too.

Evgeny

Cătălin George Feștilă

Jan 15, 2019, 10:50:54 AM
to webgl-d...@googlegroups.com
For 2D demos and math, ShaderToy can be a solution.
I tried to recreate the flat wave from
https://www.ibiblio.org/e-notes/webgl/gpu/flat_wave.htm
but I don't know what is wrong.
See: https://www.shadertoy.com/view/wds3W2

Evgeny

Jan 15, 2019, 12:07:50 PM
to WebGL Dev List


On Tuesday, January 15, 2019 at 6:50:54 PM UTC+3, catafest wrote:
> about 2D demos and math the ShaderToy can be a solution.
> I try to recreate the flat wave from
> https://www.ibiblio.org/e-notes/webgl/gpu/flat_wave.htm
> I don't know what is wrong ?
> see: https://www.shadertoy.com/view/wds3W2

In current Firefox I get a black screen and
TypeError: hWnd is null openwebmail-main.pl:273:7

I meant WebGL2 + compute shader demos.

Evgeny

Jan 15, 2019, 12:46:34 PM
to WebGL Dev List
For an equation of second order in time (e.g. an oscillator) you calculate y(t+1) as
    y(t+1) = F(y(t), y(t-1))
so for the wave equation you need two NxN matrices to calculate the new NxN matrix.
In the old WebGL1 demo I use only the Red F32 "color" channel and ping-pong 3 textures (1, 2, 3) 3 times.
In the WebGL2 demo an RG32F texture holds the t and t-1 values in its R and G channels; you calculate t+1
and store the new t+1 and t values again in an RG32F texture (FBO).
You need one more "show" shader to visualize the values of this F32 texture on screen.

I think I should explain this more clearly on my pages.
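
As a rough CPU-side illustration of the two-channel scheme described above (a sketch, not the demo's shader code): each cell stores the values at t and t-1 side by side, like the R and G channels of one texel, and a step writes the pair (t+1, t) into a new array. The function name `waveStep`, the coefficient `c2` and the "zero outside the grid" boundary are assumptions made for this example.

```javascript
// One time step of the discrete 2D wave equation on an n x n grid.
// src holds two floats per cell: src[2*i] = u(t), src[2*i + 1] = u(t-1).
// c2 is (wave speed * dt / dx)^2, an assumed parameter of this sketch.
function waveStep(src, n, c2) {
  const dst = new Float32Array(src.length);
  for (let y = 0; y < n; y++) {
    for (let x = 0; x < n; x++) {
      const i = y * n + x;
      const u = src[2 * i];         // "R channel": value at time t
      const uPrev = src[2 * i + 1]; // "G channel": value at time t-1
      // Neighbours outside the grid are read as 0 (fixed boundary).
      const left = x > 0 ? src[2 * (i - 1)] : 0;
      const right = x < n - 1 ? src[2 * (i + 1)] : 0;
      const down = y > 0 ? src[2 * (i - n)] : 0;
      const up = y < n - 1 ? src[2 * (i + n)] : 0;
      // y(t+1) = F(y(t), y(t-1)): leapfrog update with a 5-point Laplacian.
      dst[2 * i] = 2 * u - uPrev + c2 * (left + right + down + up - 4 * u);
      dst[2 * i + 1] = u;           // the old t becomes the new t-1
    }
  }
  return dst;
}

// Ping-pong usage: the output of one step is the input of the next.
let field = new Float32Array(2 * 3 * 3);
field[8] = 1; field[9] = 1;         // centre cell displaced and at rest
field = waveStep(field, 3, 0.25);
```

On the GPU the two arrays correspond to the two textures that are swapped after every pass.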

Evgeny


Ken Russell

Jan 16, 2019, 8:45:38 PM
to WebGL Dev List
Hi Evgeny,

I see the same problem on my Windows 10 workstation with NVIDIA Quadro K2200 GPU when navigating to  https://9ballsyndrome.github.io/WebGL_Compute_shader/   and specifically https://9ballsyndrome.github.io/WebGL_Compute_shader/webgl-compute-boids/dist/ . The GPU process crashes while compiling the compute shader and causes the WebGL context to be lost.

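However, there is a working version of this demo here:

https://www.khronos.org/registry/webgl/sdk/demos/intel/ComputeBoids.html
https://github.com/KhronosGroup/WebGL/blob/master/sdk/demos/intel/ComputeBoids.html
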
Running fine on Chrome 73.0.3669.0 (Official Build) canary (64-bit)  with the following command line:

"AppData\Local\Google\Chrome SxS\Application\chrome.exe" --enable-webgl2-compute-context

Does that version work for you too?

I'll ask the folks from Intel who are working on WebGL 2.0 compute to take a look and try to follow up here.

-Ken



Gu, Yang

Jan 16, 2019, 10:30:37 PM
to webgl-d...@googlegroups.com

We have already noticed this and will take a detailed look next week (sorry, this week we're fully occupied preparing for the upcoming WebGL/WebGPU F2F).

To give an update on the current status of WebGL 2.0 Compute: almost all the features have been implemented in ANGLE, and we get close to a 100% pass rate on the native dEQP tests. There are still some open issues in the spec, and we're going to discuss them with the community. In addition, we will expose all the APIs in Chrome in the following weeks (the effort is not big).

Evgeny, the samples for Compute Shader are awesome. Can we get your authorization to use some of them to present our work during the upcoming WebGL F2F? Thanks!

Regards,

-Yang

Evgeny

Jan 17, 2019, 4:06:26 AM
to WebGL Dev List
> However, there is a working version of this demo here:
> Running fine on Chrome 73.0.3669.0 (Official Build) canary (64-bit) with the following command line:
> "AppData\Local\Google\Chrome SxS\Application\chrome.exe" --enable-webgl2-compute-context
> Does that version work for you too?

No, I get a black screen and
171[.WebGL-000001331D4C6A60] GL_INVALID_VALUE: Stride must be within [0, MAX_VERTEX_ATTRIB_STRIDE).
85[.WebGL-000001331D4C6A60] GL_INVALID_OPERATION: An enabled vertex array has no buffer.

I'll try to make the 2D waves as soon as possible. Boids later.

By the way (not urgent, but interesting):
is it possible/easy (for the Intel people) to add double precision (F64) as a compute shader extension for
1. math (e.g. for deep fractal zoom)
2. buffers for particle systems (I can't resolve the fine structure of e.g. the Lorenz strange attractor with F32 math/buffers)?
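
To illustrate the F32 limitation mentioned above, here is a small sketch using JavaScript's `Math.fround`, which rounds a double to the nearest 32-bit float. The helper name `distinctInF32` is made up for this example.

```javascript
// Returns true if a and b are still different numbers after rounding to F32.
function distinctInF32(a, b) {
  return Math.fround(a) !== Math.fround(b);
}

// Near 1.0 adjacent F32 values are about 1.19e-7 apart, so an offset of 1e-8
// (a modest zoom level for a fractal) is lost in F32 but kept in F64.
const base = 1.0;
const offset = 1e-8;
console.log(base + offset !== base);             // true  (F64 keeps the offset)
console.log(distinctInF32(base + offset, base)); // false (F32 collapses it)
```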

Evgeny 

jiaji...@intel.com

Jan 17, 2019, 12:13:36 PM
to WebGL Dev List
Hi Kbr/Evgeny,
I notice that you assign structure data from an SSBO directly to another structure (see https://github.com/9ballsyndrome/WebGL_Compute_shader/blob/master/webgl-compute-boids/src/Main.ts#L182). Currently this kind of usage is not supported in the D3D backend, so you will get a compile error if you are rendering with D3D.

For now, SSBOs in D3D only support direct assignment of scalars/vectors/matrices. If you want to assign structure data in an SSBO to another structure, you can copy it member by member, as below:
struct Boids {
    vec3 position;
    vec3 velocity;
};

layout(std140, binding=0) buffer SSBOIn {
   Boids data[];
} ssboIn;

...
void main() {
...
    uint index = gl_GlobalInvocationID.x;  // element handled by this invocation
    Boids boids;
    boids.position = ssboIn.data[index].position;
    boids.velocity = ssboIn.data[index].velocity;
...
}

You can follow https://bugs.chromium.org/p/angleproject/issues/detail?id=1951 for the latest SSBO status. Direct structure/array assignment is in our upcoming plan.

Evgeny

Jan 17, 2019, 1:02:16 PM
to WebGL Dev List
On Thursday, January 17, 2019 at 8:13:36 PM UTC+3, jiaji...@intel.com wrote:
> Hi Kbr/Evgeny,
> I notice that you are using ssbo's structure data directly assign to another structure

Not me :)  Thank you.

For wave PDEs I'd like to read from and write to 2D/3D textures (because of Z-ordering, nearest-neighbor access and so on). I have a working WebGL 2 demo (next I'd like to replace VS+FS with a compute shader),
but it doesn't work with the webgl2-compute context (no CS yet):
[.WebGL-000001D5B212F8D0] GL_INVALID_VALUE: Stride must be within [0, MAX_VERTEX_ATTRIB_STRIDE).
254[.WebGL-000001D5B212F8D0] GL_INVALID_OPERATION: An enabled vertex array has no buffer.

Meanwhile I'd like to write into a texture from a CS in an "Image 2D test" demo.

Evgeny

Ken Russell

Jan 17, 2019, 1:06:13 PM
to WebGL Dev List
On Thu, Jan 17, 2019 at 1:06 AM Evgeny <dem...@ipm.sci-nnov.ru> wrote:
> However, there is a working version of this demo here:
>
> Running fine on Chrome 73.0.3669.0 (Official Build) canary (64-bit) with the following command line:
>
> "AppData\Local\Google\Chrome SxS\Application\chrome.exe" --enable-webgl2-compute-context
>
> Does that version work for you too?
>
> no black screen and
> 171[.WebGL-000001331D4C6A60] GL_INVALID_VALUE: Stride must be within [0, MAX_VERTEX_ATTRIB_STRIDE).
> 85[.WebGL-000001331D4C6A60] GL_INVALID_OPERATION: An enabled vertex array has no buffer.

That's strange - https://www.khronos.org/registry/webgl/sdk/demos/intel/ComputeBoids.html works for me with both ANGLE's D3D11 and OpenGL backends.

Could you provide plaintext copy/paste of your about:gpu?

-Ken

 


Демидов Евгений Валентинович

Jan 17, 2019, 7:34:25 PM
to webgl-d...@googlegroups.com
On Thu, 17 Jan 2019 10:05:54 -0800, Ken Russell wrote
> On Thu, Jan 17, 2019 at 1:06 AM Evgeny <dem...@ipm.sci-nnov.ru> wrote:
>
>

>
> However, there is a working version of this demo here:
>
> https://www.khronos.org/registry/webgl/sdk/demos/intel/ComputeBoids.html 
> https://github.com/KhronosGroup/WebGL/blob/master/sdk/demos/intel/ComputeBoids.html
>
> Running fine on Chrome 73.0.3669.0 (Official Build) canary (64-bit)  with the following command line:
>
> "AppData\Local\Google\Chrome SxS\Application\chrome.exe" --enable-webgl2-compute-context
>
> Does that version work for you too?

> no black screen and
> 171[.WebGL-000001331D4C6A60] GL_INVALID_VALUE: Stride must be within [0, MAX_VERTEX_ATTRIB_STRIDE).
> 85[.WebGL-000001331D4C6A60] GL_INVALID_OPERATION: An enabled vertex array has no buffer.

>
> That's strange - https://www.khronos.org/registry/webgl/sdk/demos/intel/ComputeBoids.html works for me with both ANGLE's D3D11 and OpenGL backends.
>
> Could you provide plaintext copy/paste of your about:gpu?
>
> -Ken
>

Here is my about:gpu with "...chrome.exe" --enable-webgl2-compute-context --use-angle=gl --use-cmd-decoder=passthrough
Evgeny

Graphics Feature Status

  • Canvas: Hardware accelerated
  • Flash: Hardware accelerated
  • Flash Stage3D: Hardware accelerated
  • Flash Stage3D Baseline profile: Hardware accelerated
  • Compositing: Hardware accelerated
  • Multiple Raster Threads: Enabled
  • Native GpuMemoryBuffers: Software only. Hardware acceleration disabled
  • Out-of-process Rasterization: Unavailable
  • Hardware Protected Video Decode: Unavailable
  • Rasterization: Hardware accelerated
  • Skia Renderer: Disabled
  • Surface Control: Disabled
  • Surface Synchronization: Enabled
  • Video Decode: Hardware accelerated
  • Viz Service Display Compositor: Disabled
  • WebGL: Hardware accelerated
  • WebGL2: Hardware accelerated

Driver Bug Workarounds

  • clear_uniforms_before_first_program_use
  • decode_encode_srgb_for_generatemipmap
  • disable_discard_framebuffer
  • disable_dxgi_zero_copy_video
  • disable_framebuffer_cmaa
  • disable_nv12_dxgi_video
  • exit_on_context_lost
  • force_cube_complete
  • scalarize_vec_and_mat_constructor_args
  • disabled_extension_GL_KHR_blend_equation_advanced
  • disabled_extension_GL_KHR_blend_equation_advanced_coherent

Problems Detected

Version Information

Data exported: 2019-01-18T00:14:39.910Z
Chrome version: Chrome/73.0.3674.0
Operating system: Windows NT 10.0.17134
Software rendering list URL: https://chromium.googlesource.com/chromium/src/+/69010a60db037ed7b973da72c8e4171434473102/gpu/config/software_rendering_list.json
Driver bug list URL: https://chromium.googlesource.com/chromium/src/+/69010a60db037ed7b973da72c8e4171434473102/gpu/config/gpu_driver_bug_list.json
ANGLE commit id: dd34b3b9b707
2D graphics backend: Skia/73 a30c755f95a388f7f8fbc93a5f6f62188a55ba93-
Command Line: "C:\Users\guest\AppData\Local\Google\Chrome SxS\Application\chrome.exe" --enable-webgl2-compute-context --use-angle=gl --use-cmd-decoder=passthrough --flag-switches-begin --use-angle=gl --flag-switches-end

Driver Information

Initialization time: 1417
In-process GPU: false
Passthrough Command Decoder: true
Sandboxed: true
GPU0: VENDOR = 0x1002 [Google Inc.], DEVICE= 0x9830 [ANGLE (ATI Technologies Inc., AMD Radeon HD 8400, OpenGL 4.5 core)] *ACTIVE*
Optimus: false
AMD switchable: false
Desktop compositing: Aero Glass
Direct composition: false
Supports overlays: false
Overlay capabilities:
Diagonal Monitor Size of \\.\DISPLAY1: 15.5"
Driver D3D12 feature level: D3D 12.0
Driver Vulkan API version: Not supported
Driver vendor: Advanced Micro Devices, Inc.
Driver version: 15.200.1055.0
Driver date: 7-6-2015
GPU CUDA compute capability major version: 0
Pixel shader version: 1.00
Vertex shader version: 1.00
Max. MSAA samples: 8
Machine model name:
Machine model version:
GL_VENDOR: Google Inc.
GL_RENDERER: ANGLE (ATI Technologies Inc., AMD Radeon HD 8400, OpenGL 4.5 core)
GL_VERSION: OpenGL ES 2.0 (ANGLE 2.1.0.dd34b3b9b707)
GL_EXTENSIONS: GL_ANGLE_client_arrays GL_ANGLE_depth_texture GL_ANGLE_explicit_context GL_ANGLE_explicit_context_gles1 GL_ANGLE_framebuffer_blit GL_ANGLE_framebuffer_multisample GL_ANGLE_instanced_arrays GL_ANGLE_memory_size GL_ANGLE_multi_draw GL_ANGLE_program_cache_control GL_ANGLE_request_extension GL_ANGLE_robust_client_memory GL_ANGLE_texture_compression_dxt3 GL_ANGLE_texture_compression_dxt5 GL_ANGLE_texture_rectangle GL_ANGLE_translated_shader_source GL_CHROMIUM_bind_generates_resource GL_CHROMIUM_bind_uniform_location GL_CHROMIUM_color_buffer_float_rgb GL_CHROMIUM_color_buffer_float_rgba GL_CHROMIUM_copy_texture GL_CHROMIUM_sync_query GL_EXT_blend_minmax GL_EXT_color_buffer_half_float GL_EXT_debug_marker GL_EXT_discard_framebuffer GL_EXT_disjoint_timer_query GL_EXT_draw_buffers GL_EXT_frag_depth GL_EXT_map_buffer_range GL_EXT_multisample_compatibility GL_EXT_occlusion_query_boolean GL_EXT_read_format_bgra GL_EXT_robustness GL_EXT_sRGB GL_EXT_sRGB_write_control GL_EXT_shader_texture_lod GL_EXT_texture_compression_bptc GL_EXT_texture_compression_dxt1 GL_EXT_texture_filter_anisotropic GL_EXT_texture_format_BGRA8888 GL_EXT_texture_rg GL_EXT_texture_sRGB_decode GL_EXT_texture_storage GL_EXT_unpack_subimage GL_KHR_debug GL_KHR_parallel_shader_compile GL_NV_fence GL_NV_pack_subimage GL_OES_compressed_ETC1_RGB8_texture GL_OES_depth32 GL_OES_element_index_uint GL_OES_fbo_render_mipmap GL_OES_get_program_binary GL_OES_mapbuffer GL_OES_packed_depth_stencil GL_OES_rgb8_rgba8 GL_OES_standard_derivatives GL_OES_surfaceless_context GL_OES_texture_border_clamp GL_OES_texture_float GL_OES_texture_float_linear GL_OES_texture_half_float GL_OES_texture_half_float_linear GL_OES_texture_npot GL_OES_vertex_array_object OES_compressed_EAC_R11_signed_texture OES_compressed_EAC_R11_unsigned_texture OES_compressed_EAC_RG11_signed_texture OES_compressed_EAC_RG11_unsigned_texture OES_compressed_ETC2_RGB8_texture OES_compressed_ETC2_RGBA8_texture OES_compressed_ETC2_punchthroughA_RGBA8_texture OES_compressed_ETC2_punchthroughA_sRGB8_alpha_texture OES_compressed_ETC2_sRGB8_alpha8_texture OES_compressed_ETC2_sRGB8_texture
Disabled Extensions: GL_KHR_blend_equation_advanced GL_KHR_blend_equation_advanced_coherent
Disabled WebGL Extensions:
Window system binding vendor:
Window system binding version: 1.4 (ANGLE 2.1.0.dd34b3b9b707)
Window system binding extensions: EGL_EXT_create_context_robustness EGL_ANGLE_d3d_share_handle_client_buffer EGL_ANGLE_d3d_texture_client_buffer EGL_ANGLE_surface_d3d_texture_2d_share_handle EGL_ANGLE_query_surface_pointer EGL_ANGLE_keyed_mutex EGL_KHR_create_context EGL_KHR_get_all_proc_addresses EGL_ANGLE_create_context_webgl_compatibility EGL_CHROMIUM_create_context_bind_generates_resource EGL_EXT_pixel_format_float EGL_KHR_surfaceless_context EGL_ANGLE_display_texture_share_group EGL_ANGLE_create_context_client_arrays EGL_ANGLE_program_cache_control EGL_ANGLE_robust_resource_initialization EGL_ANGLE_create_context_extensions_enabled EGL_ANDROID_blob_cache
Direct rendering: Yes
Reset notification strategy: 0x8252
GPU process crash count: 0

Compositor Information

Tile Update Mode: One-copy
Partial Raster: Enabled

GpuMemoryBuffers Status

R_8: Software only
R_16: Software only
RG_88: Software only
BGR_565: Software only
RGBA_4444: Software only
RGBX_8888: GPU_READ, SCANOUT
RGBA_8888: GPU_READ, SCANOUT
BGRX_8888: Software only
BGRX_1010102: Software only
RGBX_1010102: Software only
BGRA_8888: Software only
RGBA_F16: Software only
YVU_420: Software only
YUV_420_BIPLANAR: Software only
UYVY_422: Software only

Display(s) Information

Info: Display[2528732444] bounds=[0,0 1093x615], workarea=[0,0 1093x615], scale=1.25, external.
Color space information: {primaries:BT709, transfer:IEC61966_2_1, matrix:RGB, range:FULL}
Bits per color component: 8
Bits per pixel: 24

Video Acceleration Information

Decode h264 baseline: up to 1920x1088 pixels
Decode h264 main: up to 1920x1088 pixels
Decode h264 high: up to 1920x1088 pixels
Encode h264 baseline: up to 3840x2176 pixels and/or 30.000 fps
Encode h264 main: up to 3840x2176 pixels and/or 30.000 fps
Encode h264 high: up to 3840x2176 pixels and/or 30.000 fps

Diagnostics

0
b3DAccelerationEnabled: true
b3DAccelerationExists: true
bAGPEnabled: true
bAGPExistenceValid: true
bAGPExists: true
bCanRenderWindow: true
bDDAccelerationEnabled: true
bDriverBeta: false
bDriverDebug: false
bDriverSigned: false
bDriverSignedValid: false
bNoHardware: false
dwBpp: 32
dwDDIVersion: 12
dwHeight: 768
dwRefreshRate: 60
dwWHQLLevel: 0
dwWidth: 1366
iAdapter: 0
lDriverSize: 1507024
lMiniVddSize: 0
szAGPStatusEnglish: Enabled
szAGPStatusLocalized: On
szChipType: AMD Radeon Graphics Processor (0x9830)
szD3DStatusEnglish: Enabled
szD3DStatusLocalized: On
szDACType: Internal DAC(400MHz)
szDDIVersionEnglish: 12
szDDIVersionLocalized: 12
szDDStatusEnglish: Enabled
szDDStatusLocalized: On
szDXVAHDEnglish: Not Supported
szDXVAModes: ModeMPEG2_A ModeMPEG2_C ModeVC1_C ModeWMV9_C
szDescription: AMD Radeon HD 8400
szDeviceId: 0x9830
szDeviceIdentifier: {D7B71EE2-DB70-11CF-6371-0818BEC2C535}
szDeviceName: \\.\DISPLAY1
szDisplayMemoryEnglish: 2262 MB
szDisplayMemoryLocalized: 2262 MB
szDisplayModeEnglish: 1366 x 768 (32 bit) (60Hz)
szDisplayModeLocalized: 1366 x 768 (32 bit) (60Hz)
szDriverAssemblyVersion: 15.200.1055.0
szDriverAttributes: Final Retail
szDriverDateEnglish: 06.07.2015 3:00:00
szDriverDateLocalized: 7/6/2015 03:00:00
szDriverLanguageEnglish: English
szDriverLanguageLocalized: English
szDriverModelEnglish: WDDM 2.0
szDriverModelLocalized: WDDM 2.0
szDriverName: aticfx64.dll,aticfx64.dll,aticfx64.dll,amdxc64.dll
szDriverNodeStrongName: oem1.inf:cb0ae4148c08a993:ati2mtag_Kabini_Mobile:15.200.1055.0:pci\ven_1002&dev_9830&subsys_380217aa
szDriverSignDate: Unknown
szDriverVersion: 8.17.0010.1401
szKeyDeviceID: Enum\PCI\VEN_1002&DEV_9830&SUBSYS_380217AA&REV_00
szKeyDeviceKey: \Registry\Machine\System\CurrentControlSet\Control\Video\{4D4ED69A-4FA4-11E8-B8E4-BCC88AD4C3D1}\0000
szManufacturer: Advanced Micro Devices, Inc.
szMiniVdd: No data
szMiniVddDateEnglish: Unknown
szMiniVddDateLocalized: No data
szMonitorMaxRes: Unknown
szMonitorName: Generic PnP Monitor
szNotesEnglish: No problems found.
szNotesLocalized: No problems found.
szOverlayEnglish: Not Supported
szRankOfInstalledDriver: 00D10001
szRegHelpText: Unknown
szRevision: Unknown
szRevisionId: 0x0000
szSubSysId: 0x380217AA
szTestResultD3D7English: Not run
szTestResultD3D7Localized: Not run
szTestResultD3D8English: Not run
szTestResultD3D8Localized: Not run
szTestResultD3D9English: Not run
szTestResultD3D9Localized: Not run
szTestResultDDEnglish: Not run
szTestResultDDLocalized: Not run
szVdd: No data
szVendorId: 0x1002

Log Messages

  • [1372:2204:0118/030835.644:WARNING:angle_platform_impl.cc(52)] : initializeImpl(256): WGL_ARB_create_context_robustness exists but unable to OpenGL context with robustness.
  • GpuProcessHostUIShim:
  • GpuProcessHostUIShim:
  • [1372:2204:0118/031438.910:ERROR:gles2_cmd_decoder_passthrough_doers.cc(4365)] : NOT IMPLEMENTED

Evgeny

Jan 18, 2019, 2:01:12 AM
to WebGL Dev List
Boids work fine on my desktop with an NVIDIA GPU + Win10 64-bit with
"...chrome.exe" --use-cmd-decoder=passthrough --enable-webgl2-compute-context  --use-angle=gl

I'll try to update the old Lenovo/Windows video drivers.

Evgeny



Evgeny

Jan 18, 2019, 5:29:52 AM
to WebGL Dev List
Sorry, boids (and my draft demos) work with the new drivers that I installed manually from AMD. Windows was using the old (2015) Lenovo drivers.

Evgeny


Cătălin George Feștilă

Jan 18, 2019, 8:36:07 AM
to webgl-d...@googlegroups.com
About ComputeBoids in the Opera browser: for
https://www.khronos.org/registry/webgl/sdk/demos/intel/ComputeBoids.html
Opera tells me:

It doesn't appear your computer can support WebGL 2.0 Compute.
Click here for more information.

The "Click here" link sends me to the next link, "Click Here for WebGL
troubleshooting info for Mozilla on Windows",
and that link sends me to:
https://get.webgl.org/troubleshooting/null

Cătălin George Feștilă

Jan 18, 2019, 8:37:24 AM
to webgl-d...@googlegroups.com
The demo from the web page
https://github.com/9ballsyndrome/WebGL_Compute_shader gives this
output in Opera:

WebGL 2.0 Compute not available

Make sure you are on a system with WebGL 2.0 Compute enabled. Windows
Chrome Canary with Command Line Switches:
"--enable-webgl2-compute-context" and "--use-angle=gl".

Gu, Yang

Jan 18, 2019, 9:49:51 AM
to webgl-d...@googlegroups.com
Currently only the Chrome browser supports WebGL 2.0 Compute.

Regards,
-Yang

Ken Russell

Jan 18, 2019, 11:48:28 AM
to WebGL Dev List
Great, glad it's working for you!



an tro

Jan 21, 2019, 1:23:56 PM
to WebGL Dev List
I've added a 2D wave demo with CS at https://www.ibiblio.org/e-notes/webgl/gpu/contents.htm

Surprisingly, I can't find an RG32F format layout qualifier for compute shader images (I can use RG32F textures in an FBO). Therefore RGBA32F textures are used.

The 3D Barkley model is still broken (only 2D works). It is likely that dispatch along the Z coordinate is not working. I'll make a simple test tomorrow.
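
For reference, the image format layout qualifiers listed below are the ones I understand GLSL ES 3.10 to allow; there is no two-channel float format among them, which matches the observation above. Treat the list as a sketch to be checked against the specification; `isEs31ImageFormat` is a hypothetical helper, not part of any API.

```javascript
// Image format layout qualifiers as listed (to my understanding) in GLSL ES 3.10.
const ES31_IMAGE_FORMATS = new Set([
  "rgba32f", "rgba16f", "r32f", "rgba8", "rgba8_snorm", // float / normalized
  "rgba32i", "rgba16i", "rgba8i", "r32i",               // signed integer
  "rgba32ui", "rgba16ui", "rgba8ui", "r32ui",           // unsigned integer
]);

// True if the qualifier can appear in layout(<format>) on an image uniform.
function isEs31ImageFormat(qualifier) {
  return ES31_IMAGE_FORMATS.has(qualifier.toLowerCase());
}

console.log(isEs31ImageFormat("rg32f"));   // false
console.log(isEs31ImageFormat("rgba32f")); // true
```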

Evgeny

Ken Russell

Jan 21, 2019, 1:25:20 PM
to WebGL Dev List
Super work Evgeny! Thank you for putting these together! Excited to see what you'll do next with compute shaders!

-Ken


Evgeny Demidov

Jan 22, 2019, 2:36:07 AM
to WebGL Dev List
A few years ago someone from Mozilla suggested that I use compute shaders for spline rendering. https://www.ibiblio.org/e-notes/webgl/gpu/contents.htm

1D and 2D Bezier, extrusions, lathe...

2D spline patches are used in the procedural "Lathe flowers 2". Almost all meshes are stored in one big vertex array. It is possible to:
  just accelerate mesh generation on the GPU
  generate spline patches on the fly
I'm not sure whether transforming control points and generating splines on the fly will be faster than transforming one vertex array. But one can effectively generate patches on the fly for interactive LOD.
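
For context, generating curve points from control points comes down to a few linear interpolations per point. A minimal sketch of cubic Bezier evaluation by de Casteljau's algorithm, with the segment count as the LOD knob (the function names are illustrative and not taken from the demos):

```javascript
// Linear interpolation between two 2D points.
function lerp(a, b, t) {
  return [a[0] + (b[0] - a[0]) * t, a[1] + (b[1] - a[1]) * t];
}

// Point on a cubic Bezier curve at parameter t in [0, 1] (de Casteljau).
function cubicBezier(p0, p1, p2, p3, t) {
  const a = lerp(p0, p1, t), b = lerp(p1, p2, t), c = lerp(p2, p3, t);
  return lerp(lerp(a, b, t), lerp(b, c, t), t);
}

// Tessellate the curve into `segments` pieces; fewer segments = coarser LOD.
function tessellate(p0, p1, p2, p3, segments) {
  const points = [];
  for (let i = 0; i <= segments; i++) {
    points.push(cubicBezier(p0, p1, p2, p3, i / segments));
  }
  return points;
}

console.log(tessellate([0, 0], [0, 1], [1, 1], [1, 0], 4));
```

In a compute shader each invocation would evaluate one parameter value, so only the control points have to be uploaded.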


Any suggestions?


Evgeny

Evgeny Demidov

Jan 25, 2019, 12:01:45 PM
to WebGL Dev List
Does the texture wrap mode (gl.TEXTURE_WRAP_S/T, e.g. periodic REPEAT) work in a compute shader?
I'm using
   ivec2 p = ivec2(gl_GlobalInvocationID.xy);
   ... imageLoad(inTex, ivec2(p.x, p.y - 1) ).r ...
in the 2D waves. Should I wrap images by hand?
I see that WebGL2 (FS) and CS waves are reflected differently from the boundaries.

Note: Load operations from any texel that is outside of the boundaries of the bound image will return all zeros.
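
If the coordinates do have to be wrapped by hand, the index arithmetic is small. A sketch in JavaScript, with `wrapIndex` and `clampIndex` as names chosen for this example; the same expressions can be applied to each component of the ivec2 before calling imageLoad. The wrap is written as (i + n) % n so that the operand of % is never negative for neighbour offsets smaller than the grid size.

```javascript
// Periodic (REPEAT-like) index: valid for -n <= i, e.g. nearest neighbours.
function wrapIndex(i, n) {
  return (i + n) % n;
}

// CLAMP_TO_EDGE-like index: out-of-range reads use the border texel.
function clampIndex(i, n) {
  return Math.min(Math.max(i, 0), n - 1);
}

// Neighbour below row 0 on an 8-row grid:
console.log(wrapIndex(0 - 1, 8));  // 7 (periodic)
console.log(clampIndex(0 - 1, 8)); // 0 (clamped)
```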

Evgeny

Evgeny Demidov

Jan 25, 2019, 1:49:51 PM
to WebGL Dev List
Thanks to a guy from Intel, I've updated the 3D Barkley model (clamped by hand).

The compute shader has many "1" (or "0") constants. As I remember, they may be stored separately as different constants (or inlined into the GPU program code). Should I use "const one=1" and so on? Local or global? Or do compilers optimize constant usage? It may also depend on the WebGL backend (GL or D3D11).

I'm not sure whether I should use
 layout (local_size_x = 4, local_size_y = 4, local_size_z = 4) in;
or an 8x8x8 workgroup (both work on my notebook).
alert(gl.MAX_COMPUTE_WORK_GROUP_INVOCATIONS) gives me "undefined".

Evgeny

Qin, Jiajia

Jan 25, 2019, 2:40:31 PM
to webgl-d...@googlegroups.com

I think that texParameter only applies to texture image units, which are used for sampling. imageLoad operates on an image unit, not a texture image unit. According to the spec, 'If the individual texel identified for an image load or store operation doesn't exist, the access is treated as invalid. Invalid image loads will return a vector where the value of R, G, and B components is 0 and the value of the A component is undefined. Invalid image stores will have no effect'.

So I presume that in the fragment shader you are using a texture sampler, not an image load. Maybe that's why you see the difference.

If you want to use gl.TEXTURE_WRAP_S/T, try to use texture() instead of imageLoad().

 

Regards,

Jiajia

 


Qin, Jiajia

Jan 25, 2019, 3:12:44 PM
to webgl-d...@googlegroups.com

Hi Evgeny,

Sorry, I haven't added gl.MAX_COMPUTE_WORK_GROUP_INVOCATIONS to the implementation IDL yet. You can check https://cs.chromium.org/chromium/src/third_party/blink/renderer/modules/webgl/webgl2_compute_rendering_context_base.idl?sq=package:chromium&g=0 for the supported enums and APIs.

The missing compute shader enums will be added soon. Please stay tuned.

Currently, in most drivers, if the result of local_size_x * local_size_y * local_size_z is less than 1024, I think it's safe to use. But that doesn't mean that the bigger the number, the better. You need to tune it. Sorry for the inconvenience.
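
The rule of thumb above can be checked with a few lines of JavaScript while the enum is not exposed. This is only a sketch: `planDispatch` is a made-up helper, and its default limit of 1024 is the figure from this message, not a value queried from the driver.

```javascript
// Check a workgroup size against an invocation limit and compute the number
// of workgroups to pass to gl.dispatchCompute(x, y, z) for a given grid.
// maxInvocations = 1024 is an assumed default taken from the text above.
function planDispatch(localSize, gridSize, maxInvocations = 1024) {
  const invocations = localSize[0] * localSize[1] * localSize[2];
  // Round up so that grids not divisible by the local size are still covered.
  const groups = gridSize.map((n, i) => Math.ceil(n / localSize[i]));
  return { invocations, fits: invocations <= maxInvocations, groups };
}

console.log(planDispatch([4, 4, 4], [128, 128, 128])); // 64 invocations per group
console.log(planDispatch([8, 8, 8], [128, 128, 128])); // 512 invocations per group
```

Both sizes from the question pass the check, so the choice between them is a matter of measuring on the target hardware.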

 

Regards,

Jiajia

 


Ken Russell

Jan 28, 2019, 7:43:00 PM
to WebGL Dev List
Great work Evgeny prototyping these and thanks Jiajia for your help!

Evgeny, seeing "Uncaught ReferenceError: initEvents is not defined" while trying your compute shader based 3D Barkley's model.

-Ken



Evgeny Demidov

Jan 29, 2019, 2:29:13 AM
to WebGL Dev List
I think your browser is using an old cached version of https://www.ibiblio.org/e-notes/webgl/Controls.js .
initEvents() is at the bottom.

This simulation may render up to 128x128x128 "point" sprites. I'm trying to make a faster simulation at https://www.ibiblio.org/e-notes/webgl/gpu/barkley3D2cs2.htm. How can I scale up the WebGL viewport on the canvas from HTML (e.g. 3 times), or should I render into an FBO and draw a textured quad?

Evgeny

Evgeny Demidov

Jan 29, 2019, 2:37:43 AM
to WebGL Dev List
I'd like to use an explicit uniform location (recommended e.g. by ARM)
  layout (location = 0) uniform vec4 col;
...
   const colLoc = 0
   gl.uniform4f(colLoc, 1,0,0,1 );
but I get "'WebGL2ComputeRenderingContext': parameter 1 is not of type 'WebGLUniformLocation'." I think it is not implemented yet.

Evgeny

Evgeny Demidov

Jan 29, 2019, 3:11:51 AM
to WebGL Dev List
The code below is used (I don't think I'm using HandlingHighDPI :)
   c_w = Math.floor(window.innerWidth*.9);   c_h = window.innerHeight - 10;
   canvas.width = c_w;   canvas.height = c_h;
...
  gl.viewport(0, 0, c_w, c_h);

Qin, Jiajia

Jan 29, 2019, 3:39:17 AM
to webgl-d...@googlegroups.com

You should use it like below (a constant value is not supported):

var colLoc = gl.getUniformLocation(program, 'col');

gl.uniform4f(colLoc, 1, 0, 0, 1);

 


Evgeny Demidov

Jan 29, 2019, 8:06:53 AM
to WebGL Dev List
https://www.khronos.org/opengl/wiki/Layout_Qualifier_(GLSL)#Explicit_uniform_location
"For example, uniform location 2 represents the array `some_thingies[0].an_array`. As such, you can upload an array of vec4s to this array with glUniform4fv(2, 3, ...);."

"Calling glGetUniformLocation(prog, "modelToWorldMatrix") is guaranteed to return 2. "
The location should be an int, but in WebGL2+ it is a WebGLUniformLocation object.

I think uniform locations should be similar to attribute locations.

"OpenGL [ES] Optimizations" (+Shanee Nishry, @Lunarsong) says that
 GLint handleWVP = glGetUniformLocation( shaderProgram, "g_matWorldViewProj" );
is slow and non-deterministic (p. 39).
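
Since WebGL only hands out opaque WebGLUniformLocation objects, one common way to avoid repeated lookups is to call gl.getUniformLocation once per uniform after linking and keep the results. A minimal sketch, with `cacheUniformLocations` as a hypothetical helper name:

```javascript
// Look up each uniform location once, right after the program is linked, and
// reuse the returned WebGLUniformLocation objects on every frame.
function cacheUniformLocations(gl, program, names) {
  const locations = {};
  for (const name of names) {
    locations[name] = gl.getUniformLocation(program, name);
  }
  return locations;
}

// Usage sketch (gl and program come from the application):
//   const loc = cacheUniformLocations(gl, program, ["col"]);
//   gl.uniform4f(loc.col, 1, 0, 0, 1);   // per frame, no string lookup
```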

Evgeny

Ken Russell

Jan 29, 2019, 4:17:36 PM
to WebGL Dev List
On Mon, Jan 28, 2019 at 11:29 PM Evgeny Demidov <demidov...@gmail.com> wrote:
> I think your browser uses old cached version of https://www.ibiblio.org/e-notes/webgl/Controls.js .
> initEvents() at the bottom.

Thanks. That was the problem. Manually refreshing that file worked.

> This simulation may render up to 128x128x128 "point" sprites. I'm trying to make faster simulation at https://www.ibiblio.org/e-notes/webgl/gpu/barkley3D2cs2.htm. How can I expand by html WebGL viewport on canvas (e.g. 3 times) or shall I render into FBO and textured quad?

You need to use JavaScript code to set the canvas's width and height to the devicePixelRatio * the desired width and height, while setting the canvas's "style" (CSS) width and height to the desired width and height *not* multiplied by devicePixelRatio. See https://www.khronos.org/webgl/wiki/HandlingHighDPI .
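
A sketch of the sizing rule just described, written so that it can be tried without a browser: the function takes any object with `width`, `height` and a `style` property. The name `resizeCanvasForDisplay` and the explicit `devicePixelRatio` argument are choices made for this example.

```javascript
// Set the CSS size to the desired size and the drawing-buffer size to the
// desired size multiplied by devicePixelRatio.
function resizeCanvasForDisplay(canvas, cssWidth, cssHeight, devicePixelRatio) {
  canvas.style.width = cssWidth + "px";
  canvas.style.height = cssHeight + "px";
  canvas.width = Math.round(cssWidth * devicePixelRatio);
  canvas.height = Math.round(cssHeight * devicePixelRatio);
  return { width: canvas.width, height: canvas.height };
}

// In a page (assuming `canvas` and `gl` already exist):
//   resizeCanvasForDisplay(canvas, 640, 480, window.devicePixelRatio || 1);
//   gl.viewport(0, 0, canvas.width, canvas.height);
```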

-Ken
 


Ken Russell

Jan 29, 2019, 4:24:04 PM
to WebGL Dev List
On Tue, Jan 29, 2019 at 5:06 AM Evgeny Demidov <demidov...@gmail.com> wrote:
> https://www.khronos.org/opengl/wiki/Layout_Qualifier_(GLSL)#Explicit_uniform_location
> "For example, uniform location 2 represents the array `some_thingies[0].an_array`. As such, you can upload an array of vec4s to this array with glUniform4fv(2, 3, ...);."
>
> "Calling glGetUniformLocation(prog, "modelToWorldMatrix") is guaranteed to return 2. "
> location should be int but in WebGL2+ it is WebGLUniformLocation object.
>
> I think uniform locations should be similar to attribute locations.

WebGL's uniform locations are opaque by design. It's too easy for applications to perform incorrect arithmetic on uniform locations.

I'm not sure how this impacts compute shaders. Up to WebGL 2.0 and OpenGL ES 3.0, it wasn't possible to specify explicit uniform locations using layout qualifiers, but that functionality was introduced in OpenGL ES 3.1 (?).

It would violate abstraction barriers to allow explicit uniform locations to be specified (sidestepping the handling of other uniforms), so perhaps WebGL 2.0 Compute shaders should forbid the location qualifier with uniform variables.

-Ken

Qin, Jiajia

Jan 29, 2019, 10:12:31 PM
to webgl-d...@googlegroups.com

Just to clarify: gl.uniform4fv is a WebGL 1.0 API; it was not introduced by WebGL 2.0 Compute. Currently, every uniform location in WebGL 1.0 and later versions is a WebGLUniformLocation object. This is also a difference from the native OpenGL ES API.

 


Qin, Jiajia

unread,
Jan 30, 2019, 2:34:42 AM1/30/19
to webgl-d...@googlegroups.com

Sorry Ken. I misunderstood your meaning. I think you are right. In OpenGL ES 3.1, we can specify explicit uniform locations using layout qualifiers. I agree with you that we should forbid it in WebGL 2.0 Compute.

 


Evgeny Demidov

unread,
Jan 31, 2019, 12:17:20 AM1/31/19
to WebGL Dev List
But there is gl.bindImageTexture(0, ...). So is the image uniform ("sampler") explicit?

Evgeny

Evgeny Demidov

unread,
Jan 31, 2019, 1:19:18 AM1/31/19
to WebGL Dev List
I forgot about all the other uniforms: float, vec, matrices. Can their locations be explicit?

Evgeny 

Evgeny Demidov

unread,
Feb 3, 2019, 5:46:07 AM2/3/19
to WebGL Dev List
3D Game of Life at
The compute shader uses 3D r32i textures. The texture(samp3d, pos3d).r lookup is used to get the REPEAT wrap mode.
The script works fine on an NVIDIA GPU. Unfortunately it works only on an 8x8x8 grid on my AMD A6-5200 APU with 25.20.15011.1004 (09.01.2019) drivers.
  gl.memoryBarrier( gl.SHADER_STORAGE_BARRIER_BIT | gl.TEXTURE_FETCH_BARRIER_BIT)
  gl.finish()
doesn't help me.
Should I use gl.memoryBarrier() to synchronize compute shaders in simulations?

Evgeny
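
For reference, the wrap-around behaviour that REPEAT gives at the grid borders can be sketched on the CPU as follows. This is an illustration only, not the code from the demo: the function names and the flat-array grid layout are assumptions made for the example.

```javascript
// Wrap an integer coordinate into [0, n), like REPEAT wrapping on a texture axis.
function wrap(i, n) {
  return ((i % n) + n) % n;
}

// Count live cells among the 26 neighbours of (x, y, z) in an n*n*n grid
// stored as a flat array (index = x + n*(y + n*z)), with wrap-around borders.
function countNeighbours(grid, n, x, y, z) {
  let count = 0;
  for (let dz = -1; dz <= 1; dz++) {
    for (let dy = -1; dy <= 1; dy++) {
      for (let dx = -1; dx <= 1; dx++) {
        if (dx === 0 && dy === 0 && dz === 0) continue; // skip the cell itself
        const i = wrap(x + dx, n) + n * (wrap(y + dy, n) + n * wrap(z + dz, n));
        if (grid[i] !== 0) count++;
      }
    }
  }
  return count;
}
```

A CPU reference like this can be handy for checking a small grid against the GPU result when a glider dies near the borders.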

Evgeny Demidov

unread,
Feb 3, 2019, 8:13:12 AM2/3/19
to WebGL Dev List
sorry, NVIDIA GPU fails on larger than 16x16x16 grids

Evgeny

Ken Russell

unread,
Feb 4, 2019, 7:55:40 PM2/4/19
to WebGL Dev List
Only works for me on ANGLE's GL backend. There are some restrictions enforced on compute shaders for better compatibility with the D3D backend. I haven't checked your source code; are you overwriting the same buffer or ping-ponging between two buffers? I think the ping-pong approach will be most compatible. I also suspect that if you get this working with the D3D backend it'll also start working for larger grid sizes.

The Intel folks are mostly away for Chinese New Year, so it'll probably be several days before they look at your example.

Keep up your good work though!

-Ken


Evgeny Demidov

unread,
Feb 5, 2019, 2:05:31 AM2/5/19
to WebGL Dev List
GeForce GT 710, updated (manually by Windows) drivers 24.21.13.9924 (05.09.2018), 74.0.3693.4 (Official Build) canary (64-bit), ANGLE's GL backend.
The glider dies on a 64x64x64 grid. I'll wait. I'm working on splines.

Evgeny

Evgeny Demidov

unread,
Feb 5, 2019, 12:51:24 PM2/5/19
to WebGL Dev List
Sorry, I now get that bindings, not locations, are explicit in WebGL compute shaders (i.e. "Uniform and Shader Storage Block Layout Qualifiers" in the GL ES 3.10 spec):
 layout (std430, binding = 0) buffer SSBO {
...
 context.bindBufferBase(context.SHADER_STORAGE_BUFFER, 0, ssbo);

Evgeny
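
A slightly fuller sketch of the JavaScript side of such an explicit binding is shown below. It assumes a 'webgl2-compute' context in `gl`; createSsbo is a hypothetical helper name, and the DYNAMIC_COPY usage hint is just one plausible choice.

```javascript
// Sketch: create a shader storage buffer and attach it to an explicit binding
// point, matching `layout (std430, binding = 0) buffer SSBO { ... }` in the shader.
// `gl` is assumed to be a 'webgl2-compute' context; createSsbo is a made-up name.
function createSsbo(gl, bindingPoint, data) {
  const ssbo = gl.createBuffer();
  gl.bindBuffer(gl.SHADER_STORAGE_BUFFER, ssbo);
  gl.bufferData(gl.SHADER_STORAGE_BUFFER, data, gl.DYNAMIC_COPY);
  // The binding point is a plain integer index, unlike an opaque uniform location.
  gl.bindBufferBase(gl.SHADER_STORAGE_BUFFER, bindingPoint, ssbo);
  return ssbo;
}

// Usage, assuming `gl` exists:
//   const ssbo = createSsbo(gl, 0, new Float32Array(16));
```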

Evgeny Demidov

unread,
Feb 6, 2019, 6:06:38 AM2/6/19
to WebGL Dev List
"5 trending open source machine learning JavaScript frameworks"
What is known about machine learning and compute shaders? E.g.
should one transfer JS code from CPU to GPU?
or is it better to port a C + compute shaders (CUDA) library to WebGL via WASM (by hand)?
what can machine learning be used for in the browser?

Any suggestions? Sorry, I'm not an expert.

Evgeny

Gu, Yang

unread,
Feb 6, 2019, 8:31:43 AM2/6/19
to webgl-d...@googlegroups.com

We just started a task to create a Compute Shader backend for TF.js. This is not a trivial task, and we don’t have enough expertise, so we’re not sure how long it will take. Meanwhile, the effort to support WebGL 2.0 Compute in Emscripten is a work in progress; however, the upstream review is slower than expected.

Another parallel effort from our (Intel) web team is W3C “Machine Learning for the Web” CG, and you may get latest update from regular meetings, Chromium fork and polyfill.

Machine learning is still in high speed evolution, so we hope to provide multiple choices here so that these techs can co-work or compete with each other to make a better solution for web.

 

Regards,

-Yang

 


Evgeny Demidov

unread,
Feb 7, 2019, 11:43:00 PM2/7/19
to WebGL Dev List
Thank you, I get the global picture.
1. It is a very good test case for WebGL2 compute.
2. The CUDA TensorFlow backend is 1.5-2 times faster than the WebGL (1?) one. TF.js will compare CUDA and compute shaders (performance, bottlenecks ...).
3. a few fresh links from Google: tensorFlow.js "compute shaders"
Investigate compute shaders · Issue #716 · tensorflow/tfjs · GitHub 20 Sep - 23 Dec 2018
TensorFlow Lite Now Faster with Mobile GPUs (Developer Preview)

I think AI evaluation is simpler than learning. Is it possible (useful) to replace the WebGL backend step by step (e.g. just matrix multiplication, neural network evaluation...) together with the TensorFlow (or Google) community? (curiosity mainly)

Evgeny

Evgeny Demidov

unread,
Feb 8, 2019, 12:10:34 AM2/8/19
to WebGL Dev List
sorry continuation
"TensorFlow Lite delegate APIs on Android (requires OpenGL ES 3.1 or higher)"
glGenBuffers(id.length, id, 0);
glBindBuffer(GL_SHADER_STORAGE_BUFFER, id[0]);
...
so one can just make TF.lite.js. Where can we find Andrei Kulik, Juhyun Lee, Nikolay Chirkov, Ekaterina Ignasheva, Raman Sarokin, Yury Pisarchyk ? :)

Evgeny

Evgeny Demidov

unread,
Feb 8, 2019, 12:21:43 AM2/8/19
to WebGL Dev List
Andrei Kulik - Staff Software Engineer at Google, 10 years; CEO at AIMatter, acquired by Google ("small world" :)

Gu, Yang

unread,
Feb 8, 2019, 9:50:02 AM2/8/19
to webgl-d...@googlegroups.com



Evgeny Demidov

unread,
Feb 8, 2019, 11:52:15 AM2/8/19
to WebGL Dev List
Maybe Andrei Kulik's team can make TF Lite.JS (for Google ;) ?
It would be very interesting to compare CUDA and compute shader performance on real applications.

Evgeny

Evgeny Demidov

unread,
Feb 8, 2019, 12:39:02 PM2/8/19
to WebGL Dev List
for this simple shader
const CSs = `#version 310 es
  layout (local_size_x = 2, local_size_y = 2, local_size_z = 1) in;
  layout (std430, binding = 0) buffer SSBO {
    vec3 data[];
  } ssbo;
void main() {
  ssbo.data[gl_LocalInvocationIndex] = vec3(gl_LocalInvocationID);
}
`;
I get strange result - output: [0,0,0, 0,1,0, 0,0,0, 1,0,0] (I inserted spaces by hand)

for vec2() data[]
I see correct - output: [0,0, 1,0, 0,1, 1,1]
what is wrong? (the script is very short)

Evgeny

Kentaro Kawakatsu

unread,
Feb 11, 2019, 8:17:26 PM2/11/19
to WebGL Dev List
Hi Evgeny,

I suppose this is because of the memory layout. An array of vec3 is rounded up to 4N even with std430.
If you allocate and read back the result with new Float32Array(4 * 4) via bufferData()/getBufferSubData(), you will get the correct result - output: [0,0,0,(0), 1,0,0,(0), 0,1,0,(0), 1,1,0,(0)], where (0) means automatically inserted padding. The result you showed is the first 12 components of this.

And std140 is worse: an array of vec2 is rounded up to 4N too. So if you change the memory layout of the SSBO in CDbuffer2x2x2.htm to std140, you will get a strange (but not wrong) result.

-Kentaro
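
The padding described above can be handled on the JavaScript side with two small helpers. This is a minimal sketch, assuming each vec3 array element occupies four floats with the fourth one unused; packVec3Array and unpackVec3Array are hypothetical names for this example.

```javascript
// Pack an array of [x, y, z] triples into a Float32Array with a 4-float stride,
// i.e. one padding float after every vec3 element.
function packVec3Array(vectors) {
  const out = new Float32Array(vectors.length * 4);
  vectors.forEach((v, i) => out.set(v, i * 4)); // 4th float stays 0 (padding)
  return out;
}

// Read the triples back from a padded buffer, skipping the padding float.
function unpackVec3Array(padded) {
  const vectors = [];
  for (let i = 0; i + 3 <= padded.length; i += 4) {
    vectors.push([padded[i], padded[i + 1], padded[i + 2]]);
  }
  return vectors;
}
```

For the 2x2 work group in the shader above, reading 16 floats and unpacking them this way gives [0,0,0], [1,0,0], [0,1,0], [1,1,0], while reading only 12 floats and grouping them by three reproduces the "strange" output.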

Evgeny Demidov

unread,
Feb 12, 2019, 12:30:21 AM2/12/19
to WebGL Dev List
Hi Kentaro,

Thank you.
But "the base alignment and stride of arrays of scalars and vectors in rule 4 and of structures in rule 9 are not rounded up a multiple of the base alignment of a vec4." OpenGL ES Version 3.1 (November 3, 2016)
Can (shall :) one use vec3 attributes?

I'm really using vec4 attributes (x, y, z, brightness) https://www.ibiblio.org/e-notes/webgl/gpu/CSpetal.htm . I can store 2x f16 texture coordinates too. I'd like to store 10_10_10_2 normals, but they would need to be converted in vertex shaders. How (and where) to store normals?

Evgeny

Qin, Jiajia

unread,
Feb 12, 2019, 12:42:28 AM2/12/19
to webgl-d...@googlegroups.com

Hi Evgeny,

I find that the vertex shader source ‘VSs’ in life3Dcs64.htm differs from the one in life3Dcs16.htm and life3Dcs8.htm. If I manually change it as below:

const VSs = `#version 310 es
  uniform mat4 mvpMatrix;
  uniform lowp isampler3D uTS;
void main(void) {
   ivec3 pos = ivec3(gl_InstanceID & 63,
     (gl_InstanceID >> 6) & 63, gl_InstanceID >> 12);
   if (texelFetch(uTS, pos, 0 ).r == 0) gl_PointSize = 1.0;
   else gl_PointSize = 5.;
   gl_Position = mvpMatrix * vec4(pos, 1);
}
`;

then I get a result. So could you please check whether the vertex shader is as expected?

 

Regards,

Jiajia
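
The index decoding in that vertex shader can be mirrored on the CPU to check which grid cell a given instance maps to. The sketch below is only an illustration: instanceToPos is a hypothetical name, and the grid side is assumed to be a power of two, N = 2^bits (bits = 6 for the 64x64x64 grid).

```javascript
// CPU-side mirror of the shader's index decoding for an N*N*N grid, N = 2^bits.
// For the 64x64x64 grid, bits = 6 and the mask is 63, as in the shader above.
function instanceToPos(instanceId, bits) {
  const mask = (1 << bits) - 1;
  return [
    instanceId & mask,            // x: lowest `bits` bits
    (instanceId >> bits) & mask,  // y: next `bits` bits
    instanceId >> (2 * bits),     // z: remaining high bits
  ];
}
```

With bits = 6, instance 12417 (= 1 + 2*64 + 3*4096) decodes to [1, 2, 3]. A grid of a different size needs a different mask and different shifts, which is one thing to compare between the 8, 16 and 64 versions of the script.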

 

 


Evgeny Demidov

unread,
Feb 12, 2019, 11:52:03 AM2/12/19
to WebGL Dev List
On Tuesday, February 12, 2019 at 8:42:28 AM UTC+3, Qin, Jiajia wrote:

Hi Evgeny,

I find that vertex shader source ‘VSs’ in life3Dcs64.htm is different with life3Dcs16.htm and life3Dcs8.htm. If I

I've tried many scripts with texture() and texelFetch() functions https://www.ibiblio.org/e-notes/webgl/ca/ 
For small N all points of the grid are rendered (with different point sizes). For big N only the glider is shown (non-zero grid values). For N=32 the glider dies near the grid borders (Nvidia, GL backend).
I think it is not very important if the scripts fail only on my old Core 2 Duo with GT 710 or AMD A6-5200 APU.

I think it is not difficult and very interesting to make big matrix summation, multiplication, inversion demos (and compare them with CUDA in TensorFlow :)

Evgeny

Qin, Jiajia

unread,
Feb 13, 2019, 1:41:50 AM2/13/19
to webgl-d...@googlegroups.com

Sorry. The machine I tried before never showed the glider. I thought your problem was that the grid couldn’t be rendered. I can reproduce your issue now.

We found a problem in your draw() method.

The original code is as below:

‘var t = tex;  tex = tex1;  tex1 = tex’

We suppose that you want to exchange the values of tex and tex1. So you should change the last assignment to ‘tex1 = t’, not ‘tex1 = tex’.

 

Regards,

Jiajia
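
One way to make this kind of slip harder is to swap with a destructuring assignment, so there is no temporary variable to mistype. A minimal sketch follows; makePingPong is a hypothetical helper, and `a` and `b` stand in for the two textures.

```javascript
// Minimal ping-pong holder; `a` and `b` stand in for the two textures (tex, tex1).
function makePingPong(a, b) {
  let src = a;
  let dst = b;
  return {
    get src() { return src; },
    get dst() { return dst; },
    // Destructuring swap: no temporary variable to mistype.
    swap() { [src, dst] = [dst, src]; },
  };
}

// Usage inside a draw loop (sketch): read from pp.src, write to pp.dst,
// then call pp.swap() once per simulation step.
```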

 


Evgeny Demidov

unread,
Feb 13, 2019, 12:05:03 PM2/13/19
to WebGL Dev List
On Wednesday, February 13, 2019 at 9:41:50 AM UTC+3, Qin, Jiajia wrote:

We find a problem is your draw() method.

The original code is like below:

‘var t = tex;  tex = tex1;  tex1 = tex’

Thank you very much Jiajia. Annoying misprint. I've corrected my scripts.

I've found Tutorial: OpenCL SGEMM tuning for Kepler by Cedric Nugteren and made
Matrix multiplication SGEMM v.1 demo at https://www.ibiblio.org/e-notes/webgl/gpu/sgemm.htm

I hope I can make all the remaining optimizations from CLBlast.

Evgeny

Qin, Jiajia

unread,
Feb 14, 2019, 4:12:50 AM2/14/19
to webgl-d...@googlegroups.com

It’s really good work to use WebGL 2.0 Compute to optimize matrix multiplication. Looking forward to your follow-up work.

 

FYI gl.MAX_COMPUTE_WORK_GROUP_SIZE and gl.MAX_COMPUTE_WORK_GROUP_INVOCATIONS are already supported in the latest Chrome Canary.

https://chromium-review.googlesource.com/c/chromium/src/+/1445795/

 

Regards,

Jiajia

 


Evgeny Demidov

unread,
Feb 14, 2019, 7:22:17 AM2/14/19
to WebGL Dev List
Shader v.2 "Tiling in the local memory".
shared float Asub[TS][TS], groupMemoryBarrier() at last :)
I get max WG invocations and size (but don't use yet :)

Is WebGL2 compute supported on Android?

Evgeny

Ken Russell

unread,
Feb 15, 2019, 6:28:00 PM2/15/19
to WebGL Dev List
On Thu, Feb 14, 2019 at 4:22 AM Evgeny Demidov <demidov...@gmail.com> wrote:
Shader v.2 "Tiling in the local memory".
shared float Asub[TS][TS], groupMemoryBarrier() at last :)
I get max WG invocations and size (but don't use yet :)

Is WebGL2 compute supported on Android?

Not yet. WebGL 2.0 compute only works on platforms that use ANGLE, and right now Chrome doesn't use ANGLE on Android. However, the team is switching to use ANGLE on all platforms, so it's a work-in-progress.

-Ken

 


Evgeny Demidov

unread,
Feb 16, 2019, 12:59:37 PM2/16/19
to WebGL Dev List
Naive matrix multiplication shader works on my tablet with Intel Atom Z3735F (win10 32bit Chromium 74, default D3D11 backend). But I get gl.getParameter(gl.MAX_COMPUTE_SHARED_MEMORY_SIZE) = 0 and shader 2 crashes. Isn't it my fault (again :) ?

I'm looking for an appropriate small GPU (~mobile) for matrix multiplication test and optimization. 128 or 192 GPU cores seems too big for mobile and too small for desktop :)

Evgeny

Evgeny Demidov

unread,
Feb 16, 2019, 2:28:02 PM2/16/19
to WebGL Dev List
it may be not implemented yet in D3D11 backend. But I can't get webgl2-compute context with GL backend. I have rather old drivers 10.18.10.4358 12-21-2015. Where can I find new drivers (I can't install 64bit version).

Evgeny


Evgeny Demidov

unread,
Feb 17, 2019, 4:21:38 AM2/17/19
to WebGL Dev List
I've updated drivers Version: 15.33.47.5059 (Latest) Date: 9/18/2018. Still no GL context.
ERROR:angle_platform_impl.cc(47) : initialize(530): ANGLE Display::initialize error 12289: WGL_NV_DX_interop2 is required but not present 
Chrome/74.0.3709.0

Next optimization step - half float in compute shader on Intel GPU (f16 are supported in OpenCL on Intel GPU). I'll try mediump float and so on... What will be as a result on Z3735F? :)


Evgeny

Ken Russell

unread,
Feb 19, 2019, 1:56:12 PM2/19/19
to WebGL Dev List
Keep trying Evgeny. Hopefully the folks from Intel can chime in here regarding the unavailability of shared memory for compute shaders with the D3D11 backend on this device. I think the D3D11 backend is your best bet on this device.

-Ken


--

Qin, Jiajia

unread,
Feb 20, 2019, 5:14:50 AM2/20/19
to webgl-d...@googlegroups.com

MAX_COMPUTE_SHARED_MEMORY_SIZE hasn’t been implemented in the D3D backend yet. I am adding it to ANGLE. HLSL limits it to 32 KB in D3D11, so it’s safe as long as the shared memory is not larger than 32 KB.

You mentioned that shader 2 crashes. What do you mean by ‘crash’? Is the GL context lost, or does shader compilation fail? Is there any error information? (I don’t have that device at hand, so I can’t reproduce it.)

 

 


Evgeny Demidov

unread,
Feb 20, 2019, 5:59:20 AM2/20/19
to WebGL Dev List
The context is lost too on AMD A6-5200, Win10 64-bit, D3D11 backend in
Tiling in the local memory v.2
It uses 16x16 x 2 floats = 2 KB of shared memory (no more than 8 KB for 32x32 tiles).

The really old Atom Z3735F is not very important, if someone tests the script on a modern GPU at
and gets a 2-3 times acceleration with respect to the naive shader ;)

Evgeny
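
The shared-memory arithmetic above can be written as a small check against whatever limit the backend reports. This is only a sketch: the function names are hypothetical, and the default of two tiles assumes one tile each for the two input matrices (Asub plus a second tile of the same size).

```javascript
// Bytes of shared memory used by `tiles` square float tiles of side `ts`.
function sharedBytes(ts, tiles = 2) {
  const BYTES_PER_FLOAT = 4;
  return tiles * ts * ts * BYTES_PER_FLOAT;
}

// True if the tiles fit under a given limit, e.g. the 32 KB D3D11 limit
// mentioned above or gl.getParameter(gl.MAX_COMPUTE_SHARED_MEMORY_SIZE).
function fitsInSharedMemory(ts, limitBytes, tiles = 2) {
  return sharedBytes(ts, tiles) <= limitBytes;
}
```

With these assumptions, 16x16 tiles need 2048 bytes and 32x32 tiles need 8192 bytes, both well under 32768. Note that a reported limit of 0 would make the check fail for every tile size.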

Hoda Naghibi

unread,
Feb 20, 2019, 2:12:01 PM2/20/19
to WebGL Dev List
Hi Evgeny,

Thank you so much for sharing your code samples on compute shaders. I was wondering if we can use OffscreenCanvas to run the compute shader off the main thread in parallel and get the results on the main thread (using a shared buffer or gl.readPixels) once they are required?

Thanks.

Evgeny Demidov

unread,
Feb 21, 2019, 12:09:25 AM2/21/19
to WebGL Dev List
Not sure I understand you correctly. As I understand it, a compute dispatch is executed on the GPU independently of the main thread. If only gl.finish() is used (see below) I get ti-ti0 = 0 ms execution time
    var ti0 = new Date().getTime()
    gl.dispatchCompute(N/TS, N/TS, 1)
    gl.memoryBarrier(gl.ALL_BARRIER_BITS)   // ?
//  gl.finish()
    gl.getBufferSubData(gl.SHADER_STORAGE_BUFFER, 0, result)
    var ti = new Date().getTime()
therefore a dummy buffer read is used to get the timing.

For security reasons the GPU initializes shared arrays with 0. Could that slow down execution noticeably?

Evgeny
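
The timing approach above can be wrapped in a helper, here using performance.now() instead of Date for finer resolution. This is a hedged sketch, not the demo's code: timeDispatch is a made-up name, `gl` is assumed to be a 'webgl2-compute' context with the program and SSBO already bound, and the read-back is assumed to block until the GPU has finished writing the buffer.

```javascript
// Sketch: time one compute dispatch by forcing a read-back, as described above.
// `result` is a typed array that receives the buffer contents.
function timeDispatch(gl, groupsX, groupsY, result) {
  const t0 = performance.now();
  gl.dispatchCompute(groupsX, groupsY, 1);
  gl.memoryBarrier(gl.ALL_BARRIER_BITS);
  // The read-back is what makes the CPU wait for the GPU here.
  gl.getBufferSubData(gl.SHADER_STORAGE_BUFFER, 0, result);
  return performance.now() - t0; // elapsed milliseconds, including the copy
}

// Usage (sketch): const ms = timeDispatch(gl, N / TS, N / TS, result);
```

The measured time includes the buffer copy itself, so it is an upper bound on the shader's execution time rather than an exact figure.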

Hoda Naghibi

unread,
Feb 21, 2019, 1:27:03 AM2/21/19
to WebGL Dev List
Thank you so much for the reply. I get your point about the compute shader executing in parallel with the main thread. Actually I am not concerned about execution time. I was wondering if there is a way for realtime fine-grained tracking of what is happening in the compute shader. For example, suppose that we have a very long loop in the compute shader that updates a buffer rapidly, and in parallel we have our own code running on a CPU thread that can read the buffer at some points (while the compute shader is executing, not finished). Thanks.

Evgeny Demidov

unread,
Feb 21, 2019, 2:15:27 AM2/21/19
to WebGL Dev List
On Thursday, February 21, 2019 at 9:27:03 AM UTC+3, Hoda Naghibi wrote:
I was wondering if there is a way for realtime fine-grained tracking of what is happening in the compute shader. For example, suppose that we have a very long loop in the compute shader that updates a buffer rapidly, and in parallel we have our own code running on a CPU thread that can read the buffer at some points (while the compute shader is executing, not finished). Thanks.
There is the barrier() function and many gl BARRIER_BITS, but I'm not sure they help you. I think we need more detailed information about your shader.

Evgeny

Kentaro Kawakatsu

unread,
Feb 21, 2019, 1:05:03 PM2/21/19
to WebGL Dev List
Hi Evgeny,
Sorry for the late reply.

It seems that rule 3 is applied to vec3 before rule 4. The base alignment of vec3 is already rounded up to that of vec4 by rule 3. So in std140, an array of vec3 is 4N because of rule 3, not rule 4. And in std430, the round-up of rule 4 is not applied, but this has no effect on an array of vec3 because it is already 4N. That's my understanding.



Sorry, I couldn't quite understand what you really want to do.
Do you simply want to use a vec3 attribute (data that has been written in a compute shader as an array of vec3 in an SSBO) in a vertex shader?

In my demo, I use vec3 in the compute shader (though with the std140 layout).

And in JavaScript, I add dummy padding when defining the data that will be sent to the GPU through bufferData() (position and velocity are in the same array because I'm using "interleaving").

For the vertex shader, I pass byteStride and bufferOffset to vertexAttribPointer() as for vec4 (16 bytes), though I want to treat position and velocity as vec3.
Maybe this is an important point.

Then I can use them normally as vec3.

Does my answer make sense?

-Kentaro

On Tuesday, February 12, 2019 at 14:30:21 UTC+9, Evgeny Demidov wrote:

Evgeny Demidov

unread,
Feb 21, 2019, 1:57:03 PM2/21/19
to WebGL Dev List
Thank you Kentaro,

really, my question was if "vec4 alignment"-rule corresponds to std430 spec. But now they have enough work with half_float alignment, then structures (exotic one like POS+TC = 3 vec3 + 2 half_vec2 :) ...

Evgeny



Hoda Naghibi

unread,
Feb 25, 2019, 5:43:00 PM2/25/19
to webgl-d...@googlegroups.com
As far as I know, the barrier() function makes sure that shader execution has finished before the result is read. But I need to get the result before the shader terminates. For example, my shader is counting in a very long loop and storing the counter in a shared buffer, and the CPU needs to query the counter and get its value at any point it wants, without waiting for the compute shader to terminate. Is there any way to implement this scenario, or to send a signal from the CPU to the compute shader to force it to terminate (before its completion) on demand?

--

Evgeny Demidov

unread,
Mar 6, 2019, 7:34:24 AM3/6/19
to WebGL Dev List
An erroneous script reads 32x32 float data from a 16x16 shared array. The result (in the first string) is slightly different after the page is reloaded (Win10 64-bit, OpenGL, AMD, Nvidia). What does it read? :)

Evgeny

Qin, Jiajia

unread,
Mar 6, 2019, 10:01:20 PM3/6/19
to webgl-d...@googlegroups.com

I find that your local work group size is still 32x32, not 16x16. But the shared array is 16x16, which results in out-of-bounds accesses to shared memory.

Maybe you want a 16x16 local work group. Then you need the changes below:

1. layout (local_size_x = 32, local_size_y = 32, local_size_z = 1) -> layout (local_size_x = 16, local_size_y = 16, local_size_z = 1)

2. gl.dispatchCompute(M/32, M/32, 1) -> gl.dispatchCompute(M/16, M/16, 1)

 

Regards,

Jiajia
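
Since the local size in the shader and the divisor in dispatchCompute have to change together, both can be derived from one tile-size constant. The sketch below is only an illustration under that assumption: workGroupCount and localSizeLayout are hypothetical helpers, and M is assumed to be a multiple of TS.

```javascript
// Number of work groups per axis for an M x M problem with TS x TS work groups.
// The same TS must also be used for the shared array dimensions in the shader.
function workGroupCount(M, TS) {
  if (M % TS !== 0) {
    throw new Error(`M (${M}) must be a multiple of TS (${TS}) in this sketch`);
  }
  return M / TS;
}

// Build the layout line from the same TS so the two values cannot drift apart.
function localSizeLayout(TS) {
  return `layout (local_size_x = ${TS}, local_size_y = ${TS}, local_size_z = 1) in;`;
}

// Usage (sketch), assuming `gl`, `M` and `TS` exist:
//   const groups = workGroupCount(M, TS);
//   gl.dispatchCompute(groups, groups, 1);
```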

 


Evgeny Demidov

unread,
Mar 21, 2019, 2:14:57 PM3/21/19
to WebGL Dev List
In the compute shader at https://www.ibiblio.org/e-notes/webgl/gpu/mul/sgemm3b.htm
I get the following errors with the D3D11 backend:

sgemm3b.htm:91 max WG invoc=1024 size=1024
sgemm3b.htm:94 Shared mem=32768
sgemm3b.htm:102 C:\fakepath(77,2-33): error X3504: literal loop terminated early due to out of bounds array access
C:\fakepath(70,27-36): error X3696: infinite loop detected - loop never exits

C:\fakepath(77,2-33): error X3504: literal loop terminated early due to out of bounds array access
C:\fakepath(70,27-36): error X3696: infinite loop detected - loop never exits

Warning: D3D shader compilation failed with default flags. (cs_5_0)
Retrying with skip validation
C:\fakepath(77,2-33): error X3504: literal loop terminated early due to out of bounds array access
C:\fakepath(70,27-36): error X3696: infinite loop detected - loop never exits

Warning: D3D shader compilation failed with skip validation flags. (cs_5_0)
Retrying with skip optimization
C:\fakepath(77,2-33): error X3504: literal loop terminated early due to out of bounds array access
C:\fakepath(70,27-36): error X3696: infinite loop detected - loop never exits

Warning: D3D shader compilation failed with skip optimization flags. (cs_5_0)
Failed to create D3D compute shader.

both on AMD and Nvidia HW. The script works with the OpenGL backend. A similar script
https://www.ibiblio.org/e-notes/webgl/gpu/mul/sgemm3.htm
and all the other shaders work fine with D3D11.

Evgeny

Qin, Jiajia

unread,
Mar 21, 2019, 10:09:21 PM3/21/19
to webgl-d...@googlegroups.com

You should change the RTS value together with your TS value. However, your current RTS value is always 4u, which is not TS/WPT. Try changing it as below:

const TS8 =`#version 310 es
#define TS 8u
#define RTS 1u  // TS/WPT`

const TS16 =`#version 310 es
#define TS 16u
#define RTS 2u  // TS/WPT`

const TS32 =`#version 310 es
#define TS 32u
#define RTS 4u  // TS/WPT`

const CSs = `
#define WPT 8u
… …

The above code works well on my machine. We don’t do out-of-bounds checking in webgl2-compute, but I think the HLSL compiler does. That’s why you see the error on the D3D backend but not on the GL backend.

 

Regards,

Jiajia
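
To keep RTS equal to TS/WPT automatically, the preamble strings can be generated instead of written by hand. This is a sketch under the assumption that WPT stays at 8 as in the shader source; makePreamble is a hypothetical helper, not part of the demo.

```javascript
// Build the shader preamble so that RTS is always TS / WPT.
// WPT (work per thread) defaults to 8, matching `#define WPT 8u` in the shader body.
function makePreamble(TS, WPT = 8) {
  if (TS % WPT !== 0) {
    throw new Error(`TS (${TS}) must be a multiple of WPT (${WPT})`);
  }
  const RTS = TS / WPT;
  return `#version 310 es\n#define TS ${TS}u\n#define RTS ${RTS}u  // TS/WPT`;
}

// Usage (sketch): const source = makePreamble(16) + CSs;
```

For TS = 8, 16 and 32 this produces RTS values of 1u, 2u and 4u, the same as the three hand-written preambles.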

 
