Vittorio Romeo
Sep 14, 2024, 9:35:45 PM
to angleproject
I need help understanding some weird behavior/performance discrepancy between desktop OpenGL and OpenGL ES 3.x (ANGLE) on Windows x64.
I am batch-drawing 500k sprites via `glDrawElements`, using these two techniques:
```cpp
// technique 1: orphaning + subdata
glBufferData(GL_ARRAY_BUFFER, sizeof(Vertex) * vertexCount, nullptr, GL_STREAM_DRAW);
glBufferSubData(GL_ARRAY_BUFFER, 0u, sizeof(Vertex) * vertexCount, vertices);
// ...same for index buffer...
glDrawElements(...);
// technique 2: orphaning + buffer mapping
glBufferData(GL_ARRAY_BUFFER, sizeof(Vertex) * vertexCount, nullptr, GL_STREAM_DRAW);
void* ptr0 = glMapBufferRange(
    GL_ARRAY_BUFFER, 0u, sizeof(Vertex) * vertexCount,
    GL_MAP_WRITE_BIT | GL_MAP_INVALIDATE_BUFFER_BIT | GL_MAP_UNSYNCHRONIZED_BIT);
memcpy(ptr0, vertices, sizeof(Vertex) * vertexCount);
glUnmapBuffer(GL_ARRAY_BUFFER); // the buffer must be unmapped before drawing from it
// ...same for index buffer...
glDrawElements(...);
```
On desktop OpenGL, the two techniques perform about the same (roughly 15 ms draw time); buffer mapping may have a slight edge, but it's hard to tell. This holds regardless of whether the usage hint is `GL_STATIC_DRAW`, `GL_DYNAMIC_DRAW`, or `GL_STREAM_DRAW`.
On OpenGL ES 3.x (via ANGLE), it's weird. Buffer mapping is consistently slower than subdata. Subdata with `GL_DYNAMIC_DRAW` or `GL_STREAM_DRAW` takes roughly 23 ms of draw time, which is much slower than desktop OpenGL.
If I use the subdata technique with `GL_STATIC_DRAW` instead, I get roughly 15 ms draw time (good), but there's a weird bug: if I increase the number of sprites to something massive (e.g. 2 million), the performance obviously tanks (~300 ms), and after going back to 500k sprites the performance stays significantly worse than before (~70 ms).
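To give a sense of the data volumes involved, here is a rough sketch of the per-frame upload sizes. The `Vertex` layout, the 4-vertices/6-indices-per-sprite topology, and the 32-bit index type are all assumptions made for illustration; the real numbers depend on the actual vertex format.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical vertex layout (an assumption, not necessarily the real one):
// 2D position, texture coordinates, and an RGBA8 color.
struct Vertex
{
    float x, y;
    float u, v;
    std::uint8_t r, g, b, a;
};

static_assert(sizeof(Vertex) == 20, "sketch assumes a tightly packed 20-byte vertex");

// Assumed sprite topology: one quad = 4 vertices + 6 indices (two triangles).
constexpr std::size_t verticesPerSprite = 4;
constexpr std::size_t indicesPerSprite  = 6;

// 32-bit indices assumed: 500k sprites already need 2M vertices,
// which is beyond what 16-bit indices can address.
using Index = std::uint32_t;

// Bytes uploaded to the vertex buffer for a given number of sprites.
constexpr std::size_t vertexBufferBytes(std::size_t spriteCount)
{
    return spriteCount * verticesPerSprite * sizeof(Vertex);
}

// Bytes uploaded to the index buffer for a given number of sprites.
constexpr std::size_t indexBufferBytes(std::size_t spriteCount)
{
    return spriteCount * indicesPerSprite * sizeof(Index);
}
```

Under these assumptions, the 500k case uploads about 40 MB of vertices and 12 MB of indices per frame, while the 2 million case uploads about 160 MB and 48 MB.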
I've looked online, tried to profile both CPU and GPU, and made sure that I'm sending the right number of vertices/indices when drawing the sprites... so I'm quite confident that I'm doing something the ES driver doesn't like.
What's more, the draw time stays at roughly 70 ms even if I set the number of sprites to something like 2 or 3! I'm really confused.
I think it's GPU-related, both because of the discrepancy with desktop OpenGL and because removing the `glDrawElements` call completely makes the problem go away.
Without `glDrawElements`, the "draw time" is always as expected: around 16 ms after reverting to 500k sprites, and close to 0 ms with 2-3 sprites.
So the problem seems to happen only when all of the following hold: (1) ANGLE ES 3.x is used, (2) `glDrawElements` is invoked, (3) the VBO/EBO are created with `GL_STATIC_DRAW`, and (4) there was a spike in the amount of data sent to the GPU that has since gone away.
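For anyone trying to reproduce the numbers, here is a minimal sketch of one way to take a CPU-side "draw time" measurement. The helper name `averageMs` and the `drawSprites` function in the usage comment are hypothetical, and this is not necessarily how the figures above were obtained. Since GL commands execute asynchronously, a CPU timer only reflects GPU work if the measured region ends with a sync point such as `glFinish()`.

```cpp
#include <cassert>
#include <chrono>

// Runs `fn` `iterations` times and returns the mean wall-clock time
// per run, in milliseconds.
template <typename Fn>
double averageMs(Fn&& fn, int iterations)
{
    assert(iterations > 0);

    using Clock = std::chrono::steady_clock;

    const auto start = Clock::now();
    for (int i = 0; i < iterations; ++i)
        fn();

    const std::chrono::duration<double, std::milli> elapsed = Clock::now() - start;
    return elapsed.count() / iterations;
}

// Hypothetical usage (requires a current GL context; `drawSprites` is a
// placeholder for the upload + glDrawElements code):
//
//     const double ms = averageMs([&] { drawSprites(); glFinish(); }, 100);
```

Comparing this figure with and without the `glFinish()` call should help separate CPU submission cost from the time the GPU or driver spends on the draw itself.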
Any ideas?