SkCanvas flush painfully slow when using GPU-backed secondary SkSurface on armv7

Jona

May 3, 2021, 6:04:07 PM
to skia-discuss
I'm running into an interesting issue on Android devices with what appear to be armeabi-v7a CPUs, where the call to flush is incredibly slow. However, when the secondary SkSurface is raster-backed, the flush is noticeably faster. This issue does not happen on arm64 devices AFAIK.

My setup:
sk_sp<SkSurface> main; // GPU-backed render target
sk_sp<SkSurface> drawSurface = main->makeSurface(main->imageInfo()); // << SLOW
// sk_sp<SkSurface> drawSurface = SkSurface::MakeRaster(SkImageInfo::MakeN32Premul(size)); // << FASTER

I then do multiple rect drawings:
drawSurface->getCanvas()->drawRect(....);

Update the main surface:
SkCanvas* canvas = main->getCanvas();
drawSurface->draw(canvas, 0, 0, nullptr);
canvas->flush();
eglSwapBuffers(mEGLDisplay, mEGLSurface);
---------------

Here's what I was able to deduce from various tests.
1. Drawing into a GPU-backed drawSurface is noticeably faster than drawing into a raster one, as one would expect.
2. Calling flush on the main surface's canvas is much faster if drawSurface is raster-backed. Crazy?

What could be wrong? I built Skia m87, with target_os="android" and target_cpu="arm". I'm kinda running out of options. :( Thanks for any help.

Brian Salomon

May 4, 2021, 9:11:26 AM
to skia-d...@googlegroups.com
This seems odd. Perhaps you could take a CPU profile and see where the time is being spent?


Jona

May 4, 2021, 10:44:58 AM
to skia-discuss
Hey Brian, after more investigation I found out that drawing with an SkPaint that has antialiasing enabled and some opacity caused heavy performance issues. With the paint set to fully opaque, performance was noticeably better but still not great.

I read somewhere in this forum a while back that sampleCount was used by the GPU to do antialiasing, or something along those lines. Long story short, I had sampleCount=0, and when I changed it to 2 things worked fast! Now the question is, why? What does sampleCount really do? What value would be optimal?

GrBackendRenderTarget target(width, height, 2, 8, glInfo);

Brian Salomon

May 4, 2021, 11:56:52 AM
to skia-d...@googlegroups.com
Hi Jona,

The sample count refers to the number of samples for GPU multisampling (MSAA). When Skia can use MSAA to antialias, paths draw much faster. This is also an area of active improvement, so expect more significant performance advantages from MSAA as Skia continues to evolve.

If you're wrapping an existing FBO and just tell Skia the sample count is 2 without actually making a multisampled FBO, the result will be unantialiased, though. To have things be properly antialiased you'll have to modify the code outside of Skia where the backing store is created to set up MSAA. For the SkSurface factories where Skia is allocating the backing texture (i.e. SkSurface::MakeRenderTarget), the sample count will affect the buffer allocated and will work properly.

The actual relationship depends on the GPU, but in general more samples means more memory, less speed, higher quality. On mobile you probably don't want to go above 4.
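
To make the "more samples means more memory" point concrete, here is a minimal back-of-the-envelope sketch. It assumes every sample stores a full pixel and ignores driver compression and on-chip tile memory, so real numbers will differ by GPU. The function name `estimateColorBufferBytes` is hypothetical and not part of Skia.

```cpp
#include <cstddef>

// Rough upper-bound estimate of the color buffer size for a render target.
// Assumption (not from Skia): each sample stores a full pixel, with no
// compression and no on-chip tile storage. A sampleCount below 1 is treated
// as a single-sampled buffer.
std::size_t estimateColorBufferBytes(int width, int height,
                                     int bytesPerPixel, int sampleCount) {
    const std::size_t samples =
        sampleCount < 1 ? 1 : static_cast<std::size_t>(sampleCount);
    return static_cast<std::size_t>(width) *
           static_cast<std::size_t>(height) *
           static_cast<std::size_t>(bytesPerPixel) * samples;
}
```

Under these assumptions a 1280x800 RGBA8 target needs about 4 MB single-sampled and about 16 MB at 4 samples, which is one reason to stay at 4 or below on mobile.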

Brian

Jona

May 4, 2021, 12:16:33 PM
to skia-discuss
Thanks for the explanation. I am using an existing FBO from GLSurfaceView from Android. 

I'm a bit lost about the sample count being 2 and the result still being unantialiased. Do the SkPaint antialias settings not work? Or would I have to not use SkPaint antialiasing and instead make the changes you describe under the hood to antialias everything?

Below is my FBO hookup code. Hopefully it looks good.
mEGLDisplay = eglGetCurrentDisplay();
mEGLSurface = eglGetCurrentSurface(EGL_DRAW);

eglGetError(); // clear any stale EGL error before the next call
eglSurfaceAttrib(mEGLDisplay, mEGLSurface, EGL_SWAP_BEHAVIOR, EGL_BUFFER_PRESERVED);
EGLint error = eglGetError();
if (error != EGL_SUCCESS)
{
    LOGE("Could not enable buffer preserved swap behavior (%x)", error);
}

// Query the FBO currently bound by GLSurfaceView. glGetIntegerv reports
// failures through glGetError(), not eglGetError().
GrGLint frameBuffer = 0;
glGetIntegerv(GL_FRAMEBUFFER_BINDING, &frameBuffer);
GLenum glError = glGetError();
if (glError != GL_NO_ERROR)
{
    LOGE("Could not get frame buffer! (%x)", glError);
}

mWidth = width;
mHeight = height;

GrGLFramebufferInfo glInfo;
glInfo.fFBOID = (GrGLuint) frameBuffer;
glInfo.fFormat = 0x8058; // GR_GL_RGBA8: this definition lives in a header in Skia's src folder.

// SampleCount is nicely explained here:
// https://medium.com/@jlwu.will/vulkan-on-android-12-vr-stereo-rendering-part-3-implementation-f02ea16fe1a0

// Arguments: width, height, sample count, stencil bits, framebuffer info.
GrBackendRenderTarget target(width, height, 2, 8, glInfo);

// setup SkSurface
// To use distance field text, use commented out SkSurfaceProps instead
// SkSurfaceProps props(SkSurfaceProps::kUseDeviceIndependentFonts_Flag,
//                      SkSurfaceProps::kUnknown_SkPixelGeometry);
// SkSurfaceProps props;
mSurface = SkSurface::MakeFromBackendRenderTarget(mContext.get(), target,
                                                  kBottomLeft_GrSurfaceOrigin,
                                                  kRGBA_8888_SkColorType,
                                                  nullptr,
                                                  nullptr);


Brian Salomon

May 4, 2021, 2:05:29 PM
to skia-d...@googlegroups.com
Skia basically takes the sample count of a user provided FBO on faith. We'll assume that HW MSAA can be used to perform antialiasing if the SkPaint requires it. If the FBO doesn't actually have multiple samples then HW MSAA won't work and it will effectively be unantialiased. Miscommunicating the FBO's sample count might also cause other things to misbehave.

Here is some discussion of how to create a GLSurfaceView with multiple samples:


I have no experience working with GLSurfaceView so can't vouch for it.

You'll still want to set the antialias setting on the SkPaint when you want the path to be antialiased. Otherwise we may, depending on GPU capabilities, go out of our way (and possibly be slower) to work around the presence of multiple samples on the FBO and produce an unantialiased result for the draw.

Brian

Jona

May 6, 2021, 2:28:49 PM
to skia-discuss
What you're saying makes total sense.

Just to recap: passing 0 as the sample count for my GrBackendRenderTarget was turning OFF GPU-based MSAA and using HW-based MSAA instead.

Those changes have improved multiple areas, but I'm still struggling with sluggish performance on one particular device, the Samsung Tab A tablet. On other devices running 64-bit CPUs, everything flies! The part that drives me a bit crazy is that Android's native Java code draws things super fast on this Samsung Tab A. So I must have some settings set incorrectly?

I checked the link you provided and it was very helpful. I used it to specify sample counts and request a fixed desired config.
int[] attrib_list = {
    EGL10.EGL_LEVEL, 0,
    EGL10.EGL_RENDERABLE_TYPE, (mEGLContextClientVersion == 2) ? EGL14.EGL_OPENGL_ES2_BIT : EGLExt.EGL_OPENGL_ES3_BIT_KHR,
    EGL10.EGL_RED_SIZE, 8,
    EGL10.EGL_GREEN_SIZE, 8,
    EGL10.EGL_BLUE_SIZE, 8,
    EGL10.EGL_DEPTH_SIZE, 0,
    EGL10.EGL_SAMPLE_BUFFERS, 1,
    EGL10.EGL_SAMPLES, 2, // This is for 2x MSAA. Makes it go fast on slower devices. Weird!
    EGL10.EGL_NONE
};

I have two performance related questions:
1. Would having two SkSurfaces created from the main GPU backed SkSurface cause performance degradation?
2. Could there be some additional settings I could try on the build or GPU settings to optimize performance?

Thank you!

Jona

May 6, 2021, 3:53:27 PM
to skia-discuss
Well, another area I was able to uncover that is super slow is when using an SkPaint with SkBlendMode::kLighten. This mode makes it go crazy slow.

Brian Salomon

May 10, 2021, 9:06:10 AM
to skia-d...@googlegroups.com
Hey Jona,

Sorry for the slow response. WRT the recap, I think it's more accurate to say that passing 0 turns off GPU/HW antialiasing, and Skia instead does antialiasing in the fragment shader using alpha blending.

If you have multiple GPU-backed SkSurfaces you should try to make sure you minimize switching between the surfaces. We will soon be doing re-ordering across surfaces automatically at flush time but right now it's important not to switch often.

The performance of SkBlendModes will vary a lot across different GPUs. We use a variety of OpenGL features to implement blending but on some older GPUs for more sophisticated blends there is no way to implement it other than to make a copy of the destination buffer and feed it into the fragment shader to do shader-based blending.

Brian

Jona

May 11, 2021, 9:56:17 AM
to skia-discuss
Brian, I'm more than happy with a slow response! I'm super happy with how much you guys interact with all of us here! Super awesome!

1. About antialiasing explanation, makes sense and all clicks!
2. GPU backed SkSurfaces: We use them to help drawing layered content. It just makes it easier for our drawing implementation. In the end they all get composed into the main SkSurface and drawn onto the screen. This type of usage might be different than what you're talking about maybe?
3. SkBlendModes: Got it, makes sense. Would there be some API we can check if running on LOW, MED, HIGH performance GPU? I ask because internally we could work on an alternative way to do things for LOW performance GPUs. Wouldn't look optimal but it's better than nothing. :D

Thanks again for all the insight!

Brian Salomon

May 13, 2021, 9:14:18 AM
to skia-d...@googlegroups.com

Hey Jona,

Regarding #2, I would just try to make sure you draw to each offscreen surface in turn and then to the main SkSurface rather than drawing some to the main, switching to an offscreen, and then switching back to main, etc.
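
The ordering advice can be sketched without any Skia types. The example below models each draw as a record naming its target surface, counts how often consecutive draws change target, and regroups the draws so that every offscreen surface is finished before the main surface is touched. `DrawCommand`, `kMainSurfaceId`, and both functions are hypothetical. The regrouping is only valid under the assumption that draws to different surfaces are independent until the final composition into main.

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical record of one draw: which surface it targets and its original
// position in the frame. Surface id 0 stands for the main (on-screen) surface.
struct DrawCommand {
    int surfaceId;
    int drawIndex;
};

constexpr int kMainSurfaceId = 0;

// Counts how many times consecutive draws target a different surface.
std::size_t countSurfaceSwitches(const std::vector<DrawCommand>& commands) {
    std::size_t switches = 0;
    for (std::size_t i = 1; i < commands.size(); ++i) {
        if (commands[i].surfaceId != commands[i - 1].surfaceId) {
            ++switches;
        }
    }
    return switches;
}

// Groups draws per surface, offscreen surfaces first and main last, keeping
// the original order within each surface (stable sort).
std::vector<DrawCommand> groupBySurface(std::vector<DrawCommand> commands) {
    auto key = [](const DrawCommand& c) {
        return c.surfaceId == kMainSurfaceId ? std::numeric_limits<int>::max()
                                             : c.surfaceId;
    };
    std::stable_sort(commands.begin(), commands.end(),
                     [&](const DrawCommand& a, const DrawCommand& b) {
                         return key(a) < key(b);
                     });
    return commands;
}
```

For a frame whose draws target surfaces in the order 0, 1, 0, 2, 1, 0, the interleaved order switches targets five times, while the grouped order (1, 1, 2, 0, 0, 0) switches twice.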

Regarding #3, unfortunately, it's actually quite complicated. There are a bunch of different OpenGL features and extensions we can use; some affect some blend modes but not others, and the performance can also depend on the SkPaint used to draw (e.g. whether the paint is opaque or not, whether there is a shader or not, whether antialiasing is enabled, ...).

Some things you could check for in the GL_EXTENSIONS string, each of which would indicate that blending with the full variety of SkBlendModes is likely to be faster:

GL_KHR_blend_equation_advanced
GL_NV_texture_barrier (helps when the surface drawn to is a texture, but not if it's a wrapped FBO)
GL_EXT_shader_framebuffer_fetch
GL_ARM_shader_framebuffer_fetch
GL_EXT_blend_func_extended (more useful for blend modes up through kScreen but not beyond)

However, on some GPUs we may not actually take advantage of the features because we've found bugs in the driver. Also, some of these features are implicitly supported in later OpenGL versions. However, most vendors put them in the extension string anyway so that apps written against earlier GL versions can take advantage of them.
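
A minimal sketch of such a check is below. It takes the extension string as a plain std::string; on a live context that string would typically come from glGetString(GL_EXTENSIONS). The helper names `hasExtension` and `presentBlendHints` are hypothetical. A match is only a hint, since Skia may still avoid a feature on a given driver.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Returns true if `name` appears as a whole, space-separated token in the
// extension string. Whole-token matching avoids false positives such as
// matching GL_EXT_shader_framebuffer_fetch inside
// GL_EXT_shader_framebuffer_fetch_non_coherent.
bool hasExtension(const std::string& extensionString, const std::string& name) {
    std::istringstream tokens(extensionString);
    std::string token;
    while (tokens >> token) {
        if (token == name) {
            return true;
        }
    }
    return false;
}

// Returns which of the extensions listed in this thread are present.
std::vector<std::string> presentBlendHints(const std::string& extensionString) {
    static const char* const kHints[] = {
        "GL_KHR_blend_equation_advanced",
        "GL_NV_texture_barrier",
        "GL_EXT_shader_framebuffer_fetch",
        "GL_ARM_shader_framebuffer_fetch",
        "GL_EXT_blend_func_extended",
    };
    std::vector<std::string> found;
    for (const char* hint : kHints) {
        if (hasExtension(extensionString, hint)) {
            found.push_back(hint);
        }
    }
    return found;
}
```

An app could use an empty result as a signal to fall back to simpler blend modes on that device, keeping in mind the caveats above.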

Brian

Jona

May 26, 2021, 11:08:26 AM
to skia-discuss
Thank you for providing this information. Really helpful.

For those looking for more information on antialiasing and MSAA, the following link is awesome.

In the end, I decided that the quality of Skia's own antialiasing is superior to what I was getting with GPU MSAA at 4x or higher. So I turned OFF MSAA by setting sampleCount=0. Works fast!