Unable to recover from EGL_CONTEXT_LOST

Hamish Arblaster

Apr 14, 2022, 7:39:58 PM
to angleproject
I'm getting this error on Windows.
It happens when I call eglSwapBuffers in my Windows VM, running through Parallels, when the VM switches GPU. (This is not the only way to reproduce it, but I don't know what conditions triggered it for some of my users.) I'm on an Intel MacBook, and the process runs as x64.
When this occurs, I want to properly destroy the old context and create a new one, then use the new context after re-rendering and setting up textures, shaders, etc. again (or fix the current context, if that is possible).
I'm currently using the chromium/4692 branch. I can move to a newer one if that would definitely fix it, but compiling takes me a long time (I will move at some point in the future anyway).

This is a simplified version of how I'm setting up, rendering with, and destroying my EGL/GLES context (error handling omitted):

Setup:
_display = eglGetDisplay(<dc goes here>);
eglInitialize(_display, ...);
eglBindAPI(EGL_OPENGL_ES_API);
eglChooseConfig(_display, configAttribs, &__configs); //the config count is queried before this
_config = *__configs;
_surface = eglCreateWindowSurface(_display, _config, <window/control handle here>, surfaceAttribs);
_context = eglCreateContext(_display, _config, NULL, contextAttribs);
eglMakeCurrent(_display, _surface, _surface, _context);
eglSwapInterval(_display, 0 or 1);

Rendering:
//OpenGL ES rendering
eglSwapBuffers(_display, _surface);

Destroying:
eglDestroySurface(_display, _surface);
eglDestroyContext(_display, _context);
eglTerminate(_display);
//destroy dc


When I try to destroy the context I get errors as well (I cannot remember exactly which off the top of my head).
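
A minimal sketch, in C, of the decision this post is asking about: what to do once eglSwapBuffers has failed. The names FrameAction, GpuState and after_swap are hypothetical and do not come from EGL or ANGLE; only the value of EGL_CONTEXT_LOST (0x300E) is taken from the Khronos EGL/egl.h header. It assumes the caller passes in the eglSwapBuffers result and the eglGetError value, so it builds without an EGL implementation.

```c
#include <stdbool.h>

/* Value of EGL_CONTEXT_LOST in the Khronos EGL/egl.h header, repeated here
   so the sketch builds without the EGL headers. */
#define SKETCH_EGL_CONTEXT_LOST 0x300E

/* Hypothetical names, not part of EGL or ANGLE. */
typedef enum {
    FRAME_CONTINUE, /* swap succeeded, keep rendering              */
    FRAME_RECREATE, /* context lost: tear down and set up again    */
    FRAME_ABORT     /* some other error: report it, do not recover */
} FrameAction;

typedef struct {
    bool resources_valid; /* textures, shaders, buffers uploaded and usable */
} GpuState;

/* Decide what to do after a call to eglSwapBuffers.
   swap_ok   - the EGLBoolean returned by eglSwapBuffers
   egl_error - the value of eglGetError() read right after a failed swap */
static FrameAction after_swap(GpuState *state, bool swap_ok, unsigned int egl_error)
{
    if (swap_ok)
        return FRAME_CONTINUE;

    if (egl_error == SKETCH_EGL_CONTEXT_LOST) {
        /* Everything owned by the lost context must be created again. */
        state->resources_valid = false;
        return FRAME_RECREATE;
    }

    return FRAME_ABORT;
}
```

On FRAME_RECREATE the caller would run the destroy sequence, then the setup sequence, and re-upload textures and shaders before drawing the next frame. Whether the teardown itself succeeds on a lost display is a separate matter that this sketch does not address.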

Geoff Lang

Apr 19, 2022, 10:34:45 AM
to hama...@gmail.com, angleproject
I can't say what's wrong based on the code given. It looks like a regular flow. You should be able to tear down everything like you are and create a new display from scratch.

If you're having difficulty interpreting errors, try using EGL_KHR_debug to get error messages. If you have a debugger, you can also put a breakpoint in your debug callback to look at the code in ANGLE that generated the error.

Geoff

Hamish Arblaster

Apr 21, 2022, 6:57:47 AM
to angleproject
Hi, thanks for your suggestion.
I added callbacks for both EGL_KHR_debug and GL_KHR_debug; all the output is below. When eglSwapBuffers reports an error, I try to go through the following sequence: destroy, create, then re-render on the next frame. I made it ignore errors from the destroying phase to see what other errors happen; the last two functions below are the ones that failed in the destroying phase:
EGL: 12302, "eglSwapBuffers", 13241, 0, 0, "Context lost."
EGL: 12302, "eglDestroySurface", 13241, 0, 0, "display had a context loss"
EGL: 12302, "eglDestroyContext", 13241, 0, 0, "display had a context loss"
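
The numeric fields in these lines can be decoded by hand. The output appears to follow the parameter order of the EGL_KHR_debug callback (error, command, message type, thread label, object label, message), so 12302 is 0x300E (EGL_CONTEXT_LOST) and 13241 is 0x33B9 (EGL_DEBUG_MSG_CRITICAL_KHR). The C sketch below does the same decoding in code. The helper names are hypothetical, the constants are copied from the Khronos EGL/egl.h and EGL/eglext.h headers so that it builds without them, and only the handful of values relevant to this thread are listed.

```c
#include <stddef.h>
#include <stdio.h>

/* Values as defined in the Khronos EGL headers, repeated locally. */
#define SKETCH_EGL_SUCCESS                0x3000
#define SKETCH_EGL_BAD_ALLOC              0x3003
#define SKETCH_EGL_CONTEXT_LOST           0x300E
#define SKETCH_EGL_DEBUG_MSG_CRITICAL_KHR 0x33B9
#define SKETCH_EGL_DEBUG_MSG_ERROR_KHR    0x33BA
#define SKETCH_EGL_DEBUG_MSG_WARN_KHR     0x33BB
#define SKETCH_EGL_DEBUG_MSG_INFO_KHR     0x33BC

/* Hypothetical helper: name of an EGL error code (partial list). */
static const char *egl_error_name(unsigned int error)
{
    switch (error) {
    case SKETCH_EGL_SUCCESS:      return "EGL_SUCCESS";
    case SKETCH_EGL_BAD_ALLOC:    return "EGL_BAD_ALLOC";
    case SKETCH_EGL_CONTEXT_LOST: return "EGL_CONTEXT_LOST";
    default:                      return "unlisted EGL error";
    }
}

/* Hypothetical helper: severity of an EGL_KHR_debug message type. */
static const char *egl_message_type_name(int type)
{
    switch (type) {
    case SKETCH_EGL_DEBUG_MSG_CRITICAL_KHR: return "critical";
    case SKETCH_EGL_DEBUG_MSG_ERROR_KHR:    return "error";
    case SKETCH_EGL_DEBUG_MSG_WARN_KHR:     return "warning";
    case SKETCH_EGL_DEBUG_MSG_INFO_KHR:     return "info";
    default:                                return "unknown";
    }
}

/* Format the fields a debug callback receives into one readable line.
   Returns what snprintf returns. */
static int format_egl_debug_line(char *out, size_t size, unsigned int error,
                                 const char *command, int message_type,
                                 const char *message)
{
    return snprintf(out, size, "%s (0x%04X) in %s [%s]: %s",
                    egl_error_name(error), error, command,
                    egl_message_type_name(message_type), message);
}
```

Called with the fields of the first line above (12302, "eglSwapBuffers", 13241, "Context lost."), format_egl_debug_line writes "EGL_CONTEXT_LOST (0x300E) in eglSwapBuffers [critical]: Context lost." into the buffer.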

Then, when re-creating, eglCreateWindowSurface fails with the following error: Bad alloc

As for debugging it, my app runs in .NET (I made the ANGLE interop library myself) and I'm not sure how to debug native code in a .NET app. I'll look into whether that's possible if this info isn't enough to resolve it.

When I add a breakpoint in the callback to EGL_KHR_debug, I get an additional message - the error happens here instead in this case:
EGL: 12302, "eglChooseConfig", 13241, 0, 0, "display had a context loss"
In release mode, it happens in eglChooseConfig as well whether I add a breakpoint or not.

It's odd that it behaves differently in debug mode without breakpoints than in debug mode with breakpoints or in release mode. This is probably caused by something else, e.g. .NET may have inlined or conditionally run some things in this configuration only (or perhaps I caused it?). I don't think this is good, so I will look into it further at some point. I thought I had looked at eglGetError and found it was just a simple read; if it is not, that is probably the cause of the difference in behaviour, since I may be misusing SuppressGCTransitionAttribute. I can remove this attribute if eglGetError doesn't meet the requirements of SuppressGCTransition. Can you have a look and let me know if there's any reason it wouldn't? I also have glGetError and eglMakeCurrent using it; I have it on these three because I call each of them roughly once for every other GLES function call, to check for errors and ensure the context is current.

But looking at "display had a context loss", it seems to come from validationEGL.cpp's ValidateDisplay function, which checks display->isDeviceLost. Should this fail silently since I'm destroying something that's already lost? Not sure why eglChooseConfig/eglCreateWindowSurface is failing though.

Thanks for your time & help!

Hamish Arblaster

Apr 26, 2022, 4:39:06 PM
to angleproject
Hi, I've done some mixed-mode debugging and found that the Bad alloc error comes from validationEGL.cpp at line 2261:
if (Display::hasExistingWindowSurface(window)) //2261
{
    val->setError(EGL_BAD_ALLOC);
    return false;
}
It seems to be caused by line 187 in entry_points_egl_autogen.cpp:
EGLBoolean EGLAPIENTRY EGL_DestroySurface(EGLDisplay dpy, EGLSurface surface)
{
    ANGLE_SCOPED_GLOBAL_LOCK();
    EGL_EVENT(DestroySurface, "dpy = 0x%016" PRIxPTR ", surface = 0x%016" PRIxPTR "",
              (uintptr_t)dpy, (uintptr_t)surface);

    Thread *thread = egl::GetCurrentThread();

    egl::Display *dpyPacked = PackParam<egl::Display *>(dpy);
    Surface *surfacePacked  = PackParam<Surface *>(surface);

    ANGLE_EGL_VALIDATE(thread, DestroySurface, GetDisplayIfValid(dpyPacked), EGLBoolean, dpyPacked, //187
                       surfacePacked);

    return DestroySurface(thread, dpyPacked, surfacePacked);
}
Validation there fails because mDeviceLost == true, so the surface is never deleted.
Please advise on how I can fix this.
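
A toy model in C of the chain described in this post, to make the reasoning concrete. It is not ANGLE code: ToyDisplay, toy_create_window_surface and toy_destroy_surface are hypothetical names, and the model assumes the only checks that matter are the device-lost check on destroy and the existing-surface check on create. The error values are the ones defined in the Khronos EGL/egl.h header.

```c
#include <stdbool.h>
#include <stddef.h>

/* Values as defined in the Khronos EGL/egl.h header, repeated locally. */
#define SKETCH_EGL_SUCCESS      0x3000
#define SKETCH_EGL_BAD_ALLOC    0x3003
#define SKETCH_EGL_CONTEXT_LOST 0x300E

/* Toy display that tracks at most one window surface. Not ANGLE code. */
typedef struct {
    bool        device_lost;
    const void *window_with_surface; /* NULL when no window surface exists */
} ToyDisplay;

/* Mirrors only the "window already has a surface" check on creation. */
static unsigned int toy_create_window_surface(ToyDisplay *d, const void *window)
{
    if (d->window_with_surface != NULL && d->window_with_surface == window)
        return SKETCH_EGL_BAD_ALLOC;
    d->window_with_surface = window;
    return SKETCH_EGL_SUCCESS;
}

/* Mirrors only the device-lost check that runs before destruction. */
static unsigned int toy_destroy_surface(ToyDisplay *d)
{
    if (d->device_lost)
        return SKETCH_EGL_CONTEXT_LOST; /* rejected early, nothing is freed */
    d->window_with_surface = NULL;
    return SKETCH_EGL_SUCCESS;
}
```

With device_lost set, toy_destroy_surface returns EGL_CONTEXT_LOST and leaves the window registered, so the next toy_create_window_surface for the same window returns EGL_BAD_ALLOC.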

Geoff Lang

Apr 27, 2022, 11:39:50 AM
to hama...@gmail.com, angleproject
Hey,

Sorry I didn't get a chance to reply to your previous message. It looks like there is a bug in the validation of eglDestroySurface after the device is lost, and we should fix that. Could you file a bug on anglebug.com? The good news is that it is probably not very serious for your application: eglTerminate will destroy all contexts and surfaces for you.

Geoff

Hamish Arblaster

Apr 27, 2022, 6:54:18 PM
to angleproject
Hi, I can do that.
The problem is that eglTerminate doesn't destroy the context while it is still current, but I can't make it non-current because the display is invalidated and ANGLE doesn't support a NO_DISPLAY context. When I call eglGetDisplay, I simply seem to get my terminated display back. I don't think this is good either: if I call eglTerminate, shouldn't eglGetDisplay then give me a new display?
What should I put on anglebug.com?
Is it that eglDestroySurface and eglDestroyContext raise an error instead of destroying the surface and context, respectively, when mDeviceLost == true?
Should I also file one for eglGetDisplay returning a terminated display that has not been destroyed yet because it happens to be current? If this isn't a bug, what am I meant to do instead?
Thanks!

Hamish Arblaster

Jun 5, 2022, 11:20:52 AM
to angleproject
It seems to me that my issues would be fixed if EGL_EXT_display_alloc (https://github.com/KhronosGroup/EGL-Registry/pull/156, which allows forcing the allocation of a new display object) and EGL_KHR_display_reference (which allows forcing the destruction of displays) were implemented. How feasible would that be?