Today I created this experimental branch implementing a software
watchdog: https://github.com/antirez/redis/compare/watchdog
This may be a small revolution in debugging latency issues
when the user has no clear idea of what is happening.
This is how it works. The watchdog is disabled by default; it can be
turned on with:
CONFIG SET watchdog_delay 1000
1000 is the watchdog period in milliseconds. If more than 1000
milliseconds elapse without the Redis event loop regaining control
of the execution, the watchdog logs the event and the stack trace,
so that we can understand *where it was* while not responding.
For instance, if I run "DEBUG POPULATE 1000000" I'll see in the logs:
[92805] 27 Mar 12:07:18 # --- WATCHDOG TIMER EXPIRED ---
[92805] 27 Mar 12:07:18 # --- STACK TRACE
[92805] 27 Mar 12:07:18 # 1 libsystem_c.dylib 0x00007fff8e7d6ed5 __sfvwrite + 502
[92805] 27 Mar 12:07:18 # 2 libsystem_c.dylib 0x00007fff8e801cfa _sigtramp + 26
[92805] 27 Mar 12:07:18 # 3 ??? 0x0000000000000000 0x0 + 0
[92805] 27 Mar 12:07:18 # 4 libsystem_c.dylib 0x00007fff8e7a5947 __vfprintf + 17124
[92805] 27 Mar 12:07:18 # 5 libsystem_c.dylib 0x00007fff8e7a0edb vsnprintf_l + 396
[92805] 27 Mar 12:07:18 # 6 libsystem_c.dylib 0x00007fff8e7a9f22 snprintf + 169
[92805] 27 Mar 12:07:18 # 7 redis-server 0x0000000102a20cbc debugCommand + 716
[92805] 27 Mar 12:07:18 # 8 redis-server 0x00000001029f852b call + 155
[92805] 27 Mar 12:07:18 # 9 redis-server 0x00000001029f8c37 processCommand + 1111
[92805] 27 Mar 12:07:18 # 10 redis-server 0x0000000102a03d00 processInputBuffer + 160
[92805] 27 Mar 12:07:18 # 11 redis-server 0x0000000102a02b7c readQueryFromClient + 396
[92805] 27 Mar 12:07:18 # 12 redis-server 0x00000001029f4753 aeProcessEvents + 643
[92805] 27 Mar 12:07:18 # 13 redis-server 0x00000001029f496b aeMain + 59
[92805] 27 Mar 12:07:18 # 14 redis-server 0x00000001029fb4e5 main + 1045
[92805] 27 Mar 12:07:18 # 15 redis-server 0x00000001029f3884 start + 52
[92805] 27 Mar 12:07:18 # 16 ??? 0x0000000000000001 0x0 + 1
[92805] 27 Mar 12:07:18 # ------
How is it implemented? The watchdog implementation uses Unix signals,
specifically SIGALRM, which can be scheduled using setitimer().
Because of this it may interrupt the execution of certain system calls.
There should not be places in Redis where this is critical, as partial
reads and writes should be handled correctly everywhere; however, we
should probably consider using SA_RESTART in the setup of the signal
handler.
Because this is dangerous in general, the watchdog is considered a
debugging feature that cannot be enabled via redis.conf, but only
using CONFIG SET / GET, and the plan is to document it mainly in the
troubleshooting section of the documentation.
What do you think? Comments welcome.
Cheers,
Salvatore
--
Salvatore 'antirez' Sanfilippo
open source developer - VMware
http://invece.org
"We are what we repeatedly do. Excellence, therefore, is not an act,
but a habit." -- Aristotele
--
You received this message because you are subscribed to the Google Groups "Redis DB" group.
To post to this group, send email to redi...@googlegroups.com.
To unsubscribe from this group, send email to redis-db+u...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/redis-db?hl=en.
Also, we are currently using the standard logging function, which may
call *printf() against the same file descriptor; this is not
guaranteed to work well from a signal handler, so it should probably
just use write() after generating the string in a local buffer. I'll
fix both issues, thank you.
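A rough sketch of what "generate the string in a local buffer, then a single write()" could look like. The helper names (buf_append, buf_append_ulong, signal_safe_log) are hypothetical, the timestamp of the real log format is left out for brevity, and the number is formatted by hand so that no *printf() call is needed at all.

```c
#include <stddef.h>
#include <unistd.h>

/* Append src to dst starting at pos, never going past cap; returns new pos. */
static size_t buf_append(char *dst, size_t pos, size_t cap, const char *src) {
    while (*src && pos < cap) dst[pos++] = *src++;
    return pos;
}

/* Append the decimal representation of v without using *printf(). */
static size_t buf_append_ulong(char *dst, size_t pos, size_t cap,
                               unsigned long v) {
    char tmp[24];
    size_t n = 0;
    do {
        tmp[n++] = (char)('0' + v % 10);
        v /= 10;
    } while (v && n < sizeof(tmp));
    while (n > 0 && pos < cap) dst[pos++] = tmp[--n];
    return pos;
}

/* Build "[pid] # msg\n" in a stack buffer and emit it with one write(). */
ssize_t signal_safe_log(int fd, unsigned long pid, const char *msg) {
    char line[256];
    size_t cap = sizeof(line) - 1; /* keep one byte for the newline */
    size_t pos = 0;
    pos = buf_append(line, pos, cap, "[");
    pos = buf_append_ulong(line, pos, cap, pid);
    pos = buf_append(line, pos, cap, "] # ");
    pos = buf_append(line, pos, cap, msg);
    line[pos++] = '\n';
    return write(fd, line, pos);
}
```

The point of the design is that the handler touches only its own stack buffer and the file descriptor, so it does not re-enter stdio state that the interrupted code may have been in the middle of updating.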
Cheers,
Salvatore
--
Thanks,
Salvatore
> maybe cache the current command in some globally available string?
Not possible, for performance reasons ;)
> cache only the first N bytes :)
To do so you need a copy operation at every command dispatch; this is
already too expensive in the context of Redis command execution, where
we care about every microsecond.
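For reference, the proposal being turned down would amount to something like the sketch below. This is only an illustration of the idea, not code from Redis: LAST_CMD_PREFIX_LEN, last_cmd_prefix and remember_command_prefix are invented names, and 32 is an arbitrary value for N.

```c
#include <stddef.h>
#include <string.h>

#define LAST_CMD_PREFIX_LEN 32 /* hypothetical N */

/* Global buffer a watchdog handler could read after the fact. */
static char last_cmd_prefix[LAST_CMD_PREFIX_LEN + 1];

/* Would have to run on every command dispatch: copies at most N bytes
 * of the raw command into the global buffer and NUL-terminates it. */
void remember_command_prefix(const char *cmd, size_t len) {
    size_t n = len < LAST_CMD_PREFIX_LEN ? len : LAST_CMD_PREFIX_LEN;
    memcpy(last_cmd_prefix, cmd, n);
    last_cmd_prefix[n] = '\0';
}
```

The copy itself is small, but it sits on the hot path of every single command, which is the cost objected to above.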
Cheers,