Redis.new.keys('*').count freezing up everything in a stress test


Alexey Verkhovsky

May 8, 2009, 1:34:04 PM
to redi...@googlegroups.com
I was thrashing Redis and the redis-rb client library (yesterday's master branches of both) and discovered the following problem:

1. Initialize a Redis database with a million keys (1 to 999,999)
2. Run 10 processes that read from those keys and another 10 that write to them. Code here:
    http://github.com/alexeyv/redis-rb/blob/acfc1da0384df6f89d9b1d5e9fe138ac25a568e4/benchmarking/worker.rb
    http://github.com/alexeyv/redis-rb/blob/acfc1da0384df6f89d9b1d5e9fe138ac25a568e4/benchmarking/suite.rb
3. Perform Redis.new.keys('*').count from an IRB session.

Expected: the evil bulk operation may starve the other sessions by locking some resources - I can live with that - but eventually it's over and everything else should recover and continue working.
Actual: CPU utilization of redis-server goes through the roof and then, after a few seconds, drops to zero. At this point, every client process (including the one running Redis.new.keys('*').count) is stuck forever on a socket read.

Is this a known issue of some sort?

-
Alexey Verkhovsky
http://alex-verkhovsky.blogspot.com/
CruiseControl.rb [http://cruisecontrolrb.thoughtworks.com]

Salvatore Sanfilippo

May 8, 2009, 1:39:21 PM
to redi...@googlegroups.com
On Fri, May 8, 2009 at 7:34 PM, Alexey Verkhovsky
<alexey.v...@gmail.com> wrote:

> Expected: the evil bulk operation may starve the other sessions by locking
> some resources - I can live with that - but eventually it's over and
> everything else should recover and continue working.

Hello Alexey: what you list under "Expected" is indeed the intended
behavior. You found a bug, and given that you are the 100th person to
discover a bug in Redis you just won $1,000,000!

Ok, the latter is not true.

> Actual: CPU utilization of redis-server goes through the roof and then,
> in a few seconds, becomes zero. At this point, every client process
> (including the one running Redis.new.keys('*').count) is stuck forever on
> socket read.

Not very cool. I guess that the other clients performing queries are
not part of the bug. It's just a matter of KEYS going bad when there
are so many keys. This may be another issue with integer overflows if
your keys are pretty big. Let me try, I'll report back here in a few minutes.

Cheers,
Salvatore




--
Salvatore 'antirez' Sanfilippo
http://invece.org

Salvatore Sanfilippo

May 8, 2009, 2:16:29 PM
to redi...@googlegroups.com
On Fri, May 8, 2009 at 7:39 PM, Salvatore Sanfilippo <ant...@gmail.com> wrote:

> keys are pretty big. Let me try, I'll report back here in a few minutes.

Hello,

I can't reproduce the problem: I loaded 1 million keys, and the server
is able to reply to 'KEYS *', even if it takes something like 30 seconds
on my MacBook. Once the server has replied to KEYS, the other clients
start to get replies too.

Of course the operation is *blocking*, that is, no other client will
get served until the KEYS command has finished executing (but everything
will work while the output is being sent to the client).
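
To illustrate the blocking behaviour described above, here is a toy model in Ruby of a server that runs one command at a time. It is only a sketch and not how Redis is implemented: the `serve` method, the client names, and the cost figures are all made up for illustration.

```ruby
# Toy model of a single-threaded command loop (NOT Redis's actual code).
# Each entry is [client, command, cost]; "cost" is a made-up number of
# time units the command needs to run.
def serve(commands)
  clock = 0
  commands.map do |client, command, cost|
    clock += cost # each command runs to completion before the next starts
    { client: client, command: command, finished_at: clock }
  end
end

queue = [
  ["irb",      "KEYS *",  30_000], # expensive bulk command (assumed cost)
  ["reader-1", "GET 42",  1],
  ["writer-1", "SET 7 x", 1]
]

serve(queue).each do |r|
  puts "#{r[:client]}: #{r[:command]} finished at t=#{r[:finished_at]}"
end
```

In this model the cheap GET and SET only complete after the KEYS command is done, which is the "starve, then recover" pattern the original report expected to see.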

Maybe it just takes more time than you think? On a slow box it may even
take a few minutes as far as I can tell.

Btw note that the time is spent creating the buffer to send to the
client. Even with 1 million keys, if you perform:

KEYS foo*

and there are only a few keys matching 'foo*', it will return in less than a second.
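
A rough way to see the difference in reply size is to filter an in-memory key list in Ruby. This is a sketch under assumptions: `matching_keys` is a hypothetical helper, `File.fnmatch` only approximates Redis glob matching, the three `foo` keys are invented, and the reply is assumed to be the matching keys joined by single spaces.

```ruby
# Hypothetical key set: the numeric keys from the test plus three 'foo' keys.
keys = (1..999_999).map(&:to_s) + %w[foo foo:1 foobar]

# Return the keys that match a glob pattern (File.fnmatch stands in for
# the server-side matcher; the two are similar but not identical).
def matching_keys(keys, pattern)
  keys.select { |k| File.fnmatch(pattern, k) }
end

all = matching_keys(keys, "*")
few = matching_keys(keys, "foo*")

# Approximate reply sizes, assuming keys are joined with single spaces.
puts "KEYS *    -> #{all.size} keys, #{all.join(' ').bytesize} bytes"
puts "KEYS foo* -> #{few.size} keys, #{few.join(' ').bytesize} bytes"
```

Both calls scan every key, but only the first one has to build a reply containing all of them.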

Cheers,
Salvatore

Alexey Verkhovsky

May 8, 2009, 3:09:40 PM
to redi...@googlegroups.com
On Fri, May 8, 2009 at 11:39 AM, Salvatore Sanfilippo <ant...@gmail.com> wrote:
> Not very cool. I guess that the other clients performing queries are
> not part of the bug.

Indeed. Just running Redis.new.keys('*') on that database locks it up.

The redis-server stack at that point looks as follows:

#0  0x95d146f2 in select$DARWIN_EXTSN ()
#1  0x000024b2 in aeProcessEvents ()
#2  0x00002980 in aeMain ()
#3  0x0000afe4 in main ()

Client stack is:

#0  0x95cc5bc9 in read$NOCANCEL$UNIX2003 ()
#1  0x95cfd2bf in _sread ()
#2  0x95cfd265 in __srefill ()
#3  0x95d317de in __srget ()
#4  0x95d3179f in getc ()
#5  0x00128a8c in io_fread ()
#6  0x001293bc in io_read ()
#7  0x00101955 in call_cfunc ()
#8  0x0010baf2 in rb_call0 ()
#9  0x0010c6fc in rb_call ()
#10 0x0010a037 in rb_eval ()

And after it goes into this state, any new client session gets stuck in the exact same position as soon as it tries to communicate with the server.
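
While debugging a hang like this, it can help if the client gives up instead of blocking forever in `read`. The following is a minimal sketch, assuming plain Ruby sockets rather than redis-rb: `read_with_timeout` is a hypothetical helper, and the local `TCPServer` that never replies is only a stand-in for the stuck server.

```ruby
require "socket"

# Read one line from +sock+, or raise if nothing arrives within +seconds+.
# Hypothetical helper for illustration; not part of redis-rb.
def read_with_timeout(sock, seconds)
  ready = IO.select([sock], nil, nil, seconds)
  raise "no reply within #{seconds}s" if ready.nil?
  sock.gets # note: may still block if only a partial line has arrived
end

# Stand-in for a wedged server: it listens but never sends a reply.
server = TCPServer.new("127.0.0.1", 0) # port 0 lets the OS pick a free port
port   = server.addr[1]

client = TCPSocket.new("127.0.0.1", port)
client.write("KEYS *\r\n")

outcome =
  begin
    read_with_timeout(client, 0.2)
  rescue RuntimeError => e
    "gave up: #{e.message}"
  ensure
    client.close
    server.close
  end

puts outcome
```

A timeout like this does not fix the server, but it lets a test harness report the stall and move on.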

--

Alexey Verkhovsky

May 8, 2009, 3:12:08 PM
to redi...@googlegroups.com
On Fri, May 8, 2009 at 12:16 PM, Salvatore Sanfilippo <ant...@gmail.com> wrote:
> Maybe it just takes more time than you think?

Nope. You saw the stack trace of the server in my earlier message - it certainly isn't doing anything. CPU utilization is also zero.
Do you want me to send you a database dump / try the same accessing thing with some other client / do anything else to debug this further?
 
--

Salvatore Sanfilippo

May 8, 2009, 3:27:49 PM
to redi...@googlegroups.com
On Fri, May 8, 2009 at 9:12 PM, Alexey Verkhovsky
<alexey.v...@gmail.com> wrote:
> On Fri, May 8, 2009 at 12:16 PM, Salvatore Sanfilippo <ant...@gmail.com>
> wrote:
>>
>> Maybe it just takes more time than you think?
>
> Nope. You saw the stack trace of the server in my earlier message - it
> certainly isn't doing anything. CPU utilization is also zero.
> Do you want me to send you a database dump / try the same accessing thing
> with some other client / do anything else to debug this further?

Yes Alexey, thank you very much, this will be *very* helpful, since
just loading the DB with N keys is not enough to trigger it for some
reason. I guess the bug is dataset-dependent or something like that.

As a first step, could you please gzip the DB and try to send it to me
in some way? If it's too big for email I can give you access to a
server where you can upload it, or something like that.

Thanks, your help is much appreciated.

Ciao,
Salvatore



