Graphics Performance Optimization: Point Sprites, GLSL, and Kivy


Kovak

Jan 5, 2014, 4:19:44 AM1/5/14
to kivy-...@googlegroups.com
Hi Everyone,

See the Example Code Here

I have been looking into using GL ES 2.0 point sprites to improve performance, and after some testing I think I can confirm that this is a viable way to speed up certain types of rendering! In general this technique can also be used with quads (sets of 4 vertices) and you still gain most of the performance (basic profiling suggests speed gains in the ~1000x to ~2000x range for large sets of quads); using gl_Points (1 vertex) you gain an additional 1.5x to 3x over the quad method.

When to use this type of rendering?
Point Sprites are good for rendering large numbers of fairly small 2d textures. The typical use case is creating Particle Systems, but I have also used points to render the bullets in my games and small background decorations. When you expect to have many small objects with the same graphic, possibly 100+ on screen at once, this is a good method to consider. It could also work well for drawing nice-looking lines that follow the user's touches (similar to the way touchtracer works).

Potential Downsides:
Not all hardware, particularly older hardware, will support these shaders. My Transformer TF101 does not correctly link the shader and crashes when attempting to use it, while my Droid 4 has no problem and sees a nice performance boost.
Additionally, different GPUs have different minimum and maximum point-size limits. On most hardware this is a fairly large range, ~1.0 to ~500.0, but some GPUs may not be able to achieve such a range. I believe the standard implementation simply clamps the size to its max or min if passed a value outside the range, so it should fail gracefully, but your visuals could end up looking different depending on hardware.
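As a rough illustration of that clamping behavior (a sketch only; the real clamping happens inside the driver, and the range values here are made up):

```python
def clamp_point_size(requested, supported_min, supported_max):
    """Mirror of how a GL driver is believed to clamp gl_PointSize
    to the hardware's supported range."""
    return max(supported_min, min(requested, supported_max))

# Hypothetical device range of 1.0 to 500.0:
print(clamp_point_size(800.0, 1.0, 500.0))  # too-large request is clamped to 500.0
print(clamp_point_size(64.0, 1.0, 500.0))   # in-range request passes through: 64.0
```

So a sprite you asked to be 800px wide simply renders at the device maximum rather than failing, which is why the visuals (not the stability) are what varies per GPU.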
Grab the apk here and test your devices to help me compile a list of problematic hardware.

This is GLSL right? What's the Catch?
GLSL can't be easy, right? Correct! One shader will not work everywhere because of the differences between desktop OpenGL and OpenGL ES 2.0!
On a typical ES 2.0 device, point sprites are built in by default. This means that in the vertex shader you can set the special value 'gl_PointSize', and in the fragment shader you can read the special built-in coordinate 'gl_PointCoord'. This is available in version #100, likely the only version your phone will have access to.
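For reference, a minimal ES 2.0 shader pair using these built-ins might look like the following (a sketch held as Python strings; the attribute/uniform names are my own illustration, not Kivy's defaults, and Kivy prepends its own shader header, so in practice you may need to drop the #version lines):

```python
# Minimal GLSL ES 2.0 (#version 100) point-sprite shaders as Python strings.
# Attribute/uniform names here are illustrative assumptions, not Kivy's defaults.
POINT_SPRITE_VS = """
#version 100
attribute vec2 vPosition;
uniform mat4 modelview_mat;
uniform mat4 projection_mat;
void main() {
    gl_Position = projection_mat * modelview_mat * vec4(vPosition, 0.0, 1.0);
    gl_PointSize = 32.0;  /* size in pixels; clamped to the GPU's supported range */
}
"""

POINT_SPRITE_FS = """
#version 100
precision mediump float;
uniform sampler2D texture0;
void main() {
    /* gl_PointCoord runs 0..1 across the point, so we can sample the texture */
    gl_FragColor = texture2D(texture0, gl_PointCoord);
}
"""
```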

These built-ins were not added until desktop shader version #120. Using an #ifdef to check the GL version and then set the version directive works on my desktop, but it crashes my Android devices when loading the shader, meaning you will pretty much have to use 2 separate shader files just to set this properly.
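One way to handle the two-file split (a sketch; the file names and the ES-detection flag are my own assumptions, and you would plug in however your app detects the platform, e.g. kivy.utils.platform):

```python
def pick_shader_file(is_gles2):
    """Return which shader file to load: ES 2.0 devices get the
    '#version 100' variant, desktops get a '#version 120' variant
    that uses the same gl_PointSize / gl_PointCoord built-ins."""
    if is_gles2:
        return "pointsprite_es2.glsl"      # hypothetical file using '#version 100'
    return "pointsprite_desktop.glsl"      # hypothetical file using '#version 120'

# On Android/iOS you would pass True; on a desktop GL context, False.
print(pick_shader_file(True))   # prints pointsprite_es2.glsl
```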

In addition, on the desktop these values are not enabled by default; we must tell GL to use them via GL_VERTEX_PROGRAM_POINT_SIZE and GL_POINT_SPRITE. However, Kivy does not define the required GL constants, probably because they are no longer documented in the GL spec. You can still enable them using their hex values. So we import glEnable from kivy.graphics.opengl and:
            from kivy.graphics.opengl import glEnable
            glEnable(0x8642)  # GL_VERTEX_PROGRAM_POINT_SIZE
            glEnable(0x8861)  # GL_POINT_SPRITE

In later versions of GL and GLSL, the job of the point sprite has been taken over by the more flexible geometry shader; in ES 2.0 land, however, this is the best solution we have. You should be fine enabling GL_POINT_SPRITE and GL_VERTEX_PROGRAM_POINT_SIZE and leaving them enabled if you want to use point sprites: they only affect draw calls using points, where they make the gl_PointCoord built-in available and cause the shader to heed gl_PointSize.
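Putting the desktop side together (a sketch; the helper name is mine, and the import is guarded so the snippet is harmless where Kivy or a GL context is absent):

```python
# Hex values for the two enums Kivy does not define (old desktop GL names).
GL_VERTEX_PROGRAM_POINT_SIZE = 0x8642
GL_POINT_SPRITE = 0x8861

def enable_point_sprites():
    """Enable point-sprite rendering on desktop GL. On ES 2.0 this is
    unnecessary (point sprites are always on). Safe to call once at
    startup and leave enabled: these states only affect point-mode draws."""
    try:
        from kivy.graphics.opengl import glEnable
    except ImportError:
        return False  # no Kivy available here (e.g. while testing this sketch)
    glEnable(GL_VERTEX_PROGRAM_POINT_SIZE)
    glEnable(GL_POINT_SPRITE)
    return True
```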

Please let me know if this works on your hardware, desktop and phone. I am not certain I have the versioning right for when to load each shader; it is possible some versions above 2.0 should still use the ES 2.0 shader. I also want to get an idea of whether there is a significant set of devices for which this is not a useful option.

Potential Optimizations:
1. Right now, when we create a Mesh in Python, the vertex list is passed to Cython, where it is copied again before being sent to the GPU. This could possibly be improved with the array module, whose buffer should be able to be sent directly to the GPU.
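For example (a sketch of building point-sprite vertex data with the stdlib array module; the 2-floats-per-vertex layout assumes a vertex format of just a vec2 position, which is my assumption, not Kivy's default format):

```python
from array import array
import random

# Interleaved vertex data for 1000 points: x, y per vertex.
# array('f', ...) stores C floats contiguously, so its buffer could be
# handed to the GPU without the per-element copying a Python list needs.
verts = array('f')
for _ in range(1000):
    verts.append(random.uniform(0.0, 800.0))  # x
    verts.append(random.uniform(0.0, 600.0))  # y

# One index per point, for a mesh drawn in 'points' mode.
indices = array('H', range(1000))

print(len(verts), verts.itemsize)  # prints 2000 4
```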


Let me know if you have any questions or critique!

Thanks,
Kovak
