GPU watchdog


Nehal Shah

Mar 21, 2016, 1:33:00 PM
to Chromium OS dev, Mike Frysinger, Dario Dorando, Dominik Behr, Grant Grundler, Stéphane Marchesin, Antoine Labour
Dear All,

While porting AMDGPU to freon, we are hitting "gpu_watchdog_thread.cc(315)] The GPU process hung", and the system reboots after some time. Is there any way to stop this? I have already added the following settings to chrome_dev.conf:

--disable-hang-monitor
--no-sandbox # disables all sandboxes
--disable-gpu-sandbox # disables gpu sandbox only, creates only one GPU process
--disable-gpu-watchdog # disables 10 second watchdog thread for GL calls, needed to debug GL driver
--enable-gpu-debugging
--disable-glsl-translator # stop translating shaders
--disable-shader-name-hashing 
--disable-gpu-shader-disk-cache
--disable-gpu-program-cache
--disable-gpu-driver-bug-workarounds
--allow-sandbox-debugging
--disable-gpu-process-prelaunch
--enable-gpu-client-logging
--enable-gpu-client-tracing
--enable-gpu-command-logging
--enable-gpu-service-logging 
--enable-gpu-service-tracing
--wait-for-debugger-children
--ignore-gpu-blacklist
--ui-disable-partial-swap 

The system still reboots. Please suggest the correct settings.

Regards
Nehal Shah

Puneet Kumar

Mar 21, 2016, 2:42:36 PM
to Nehal Shah, Chromium OS dev, Mike Frysinger, Dario Dorando, Dominik Behr, Grant Grundler, Stéphane Marchesin, Antoine Labour
On Mon, Mar 21, 2016 at 10:33 AM Nehal Shah <21.n...@gmail.com> wrote:
Dear All,

While freon porting AMDGPU we are facing "gpu_watchdog_thread.cc(315)] The GPU process hung" and it reboots the system in sometime.

Are you able to get any information from the logs? Isn't that the right thing to be looking at first? 


Nehal Shah

Mar 21, 2016, 3:06:38 PM
to Puneet Kumar, Chromium OS dev, Mike Frysinger, Dario Dorando, Dominik Behr, Grant Grundler, Stéphane Marchesin, Antoine Labour
Hi Puneet,

Thanks for the quick reply. Yes, I have attached the UI logs. (Logging is enabled in the command buffer, so the lines marked "ERROR:" in the log are not actual errors, just normal log output.) It crashes in one of the GLES commands, and we get a SEGV_MAPERR 000000000000 error. So we decided to run gl_tests first instead of debugging the UI directly. To run the tests, the system must not reboot, but the GPU watchdog is rebooting it.

If you can point out anything useful in the log, please let me know and I will look into it.

Thanks and Regards
Nehal Shah

Attachment: ui.LATEST

Puneet Kumar

Mar 21, 2016, 3:43:20 PM
to Nehal Shah, Chromium OS dev, Mike Frysinger, Dario Dorando, Dominik Behr, Grant Grundler, Stéphane Marchesin, Antoine Labour
Does the GPU process stay up long enough (without invoking gl_tests) to attach to it via gdb?  I see this comment in the code to allow looking at stuff with a debugger: 
// For minimal developer annoyance, don't keep terminating. You need to skip
// the call to base::Process::Terminate below in a debugger for this to be
// useful.
static bool terminated = false;
if (terminated)
  return;

Have you tried attaching with gdb and skipping over some code to see what could be going wrong?

Dominik Behr

Mar 21, 2016, 6:54:57 PM
to Puneet Kumar, Nehal Shah, Chromium OS dev, Mike Frysinger, Dario Dorando, Grant Grundler, Stéphane Marchesin, Antoine Labour
Did you add these command-line options to /etc/chrome_dev.conf?
Make sure you don't paste the comments starting with #.
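
One way to follow this advice is to strip the annotations from the flag list before copying it into the file. The following is only a rough Python sketch: the helper name strip_flag_comments is made up for illustration, and it assumes that no flag value legitimately contains a "#" character.

```python
def strip_flag_comments(lines):
    """Return the flag lines with trailing '#' comments and blank lines removed.

    Assumption: no flag value contains '#', so everything from the first '#'
    to the end of the line is treated as a comment.
    """
    cleaned = []
    for line in lines:
        flag = line.split("#", 1)[0].strip()
        if flag:
            cleaned.append(flag)
    return cleaned


# A few lines copied from the annotated list earlier in this thread.
annotated = [
    "--no-sandbox # disables all sandboxes",
    "--disable-gpu-sandbox # disables gpu sandbox only, creates only one GPU process",
    "--enable-gpu-debugging",
    "",
]

for flag in strip_flag_comments(annotated):
    print(flag)
```

The printed lines contain only the flags and can then be pasted into /etc/chrome_dev.conf.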

This looks suspicious:
[1142:1142:0316/033644:ERROR:ozone_platform_gbm.cc(128)] Failed to find vgem device: No such file or directory
[1620:1620:0316/033648:ERROR:gl_surface_egl.cc(248)] No suitable EGL configs found.
[1620:1620:0316/033649:ERROR:gl_surface_egl.cc(248)] No suitable EGL configs found.
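
Since the attached log also contains many "ERROR:" lines that are only command buffer logging, it may help to count the entries per source location so that repeated messages like the ones above stand out. The following is a rough Python sketch, not an existing tool: the function name count_errors is made up, and the line layout [pid:tid:MMDD/HHMMSS:LEVEL:file(line)] is an assumption inferred from the three lines quoted above.

```python
import re
from collections import Counter

# Assumed layout, inferred from the quoted lines:
# [pid:tid:MMDD/HHMMSS:LEVEL:file(line)] message
LOG_LINE = re.compile(
    r"^\[(?P<pid>\d+):(?P<tid>\d+):(?P<date>\d{4})/(?P<time>\d{6}):"
    r"(?P<level>[A-Z]+):(?P<file>[^()\]]+)\((?P<line>\d+)\)\]\s*(?P<message>.*)$"
)


def count_errors(lines):
    """Count ERROR entries keyed by (source file, line number, message)."""
    counts = Counter()
    for raw in lines:
        match = LOG_LINE.match(raw.strip())
        # Lines that do not fit the assumed layout are skipped silently.
        if match and match["level"] == "ERROR":
            key = (match["file"], int(match["line"]), match["message"])
            counts[key] += 1
    return counts


# The three lines quoted from the attached log.
sample = [
    "[1142:1142:0316/033644:ERROR:ozone_platform_gbm.cc(128)] Failed to find vgem device: No such file or directory",
    "[1620:1620:0316/033648:ERROR:gl_surface_egl.cc(248)] No suitable EGL configs found.",
    "[1620:1620:0316/033649:ERROR:gl_surface_egl.cc(248)] No suitable EGL configs found.",
]

for (source, line_no, message), n in count_errors(sample).most_common():
    print(f"{n}x {source}({line_no}): {message}")
```

To run it over a whole log, pass the open file object instead of the sample list, since the function only iterates over lines.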

Can you add debugging output to Mesa to see whether the EGL context is created and whether the dmabufs allocated in GBM are imported using dma-buf import?

Is any command buffer submitted to the kernel? If so, does it hang? Maybe that is what causes the GPU watchdog to kill the GPU process with signal 11?



--
Dominik