From: Salvatore Sanfilippo <anti...@gmail.com>
Date: Fri, 14 May 2010 10:55:59 +0200
Local: Fri, May 14 2010 4:55 am
Subject: Re: Redis DB Hyperbole

On Fri, May 14, 2010 at 9:56 AM, LFReD <terrybrown...@gmail.com> wrote:
> Hi,

> After some exhausting testing, here's some conclusions..

> - The advertised speed of Redis is not even close to real world
> expectations. The I/O between Redis, and the PHP libraries (Rediska,
> Predis) is a killer. The Redis benchmarks show 55,000 GETS/s, but in
> reality it's more like 7700/s.. and that's against a mere 100,000
> keys!

hello Terry,

the problem with benchmarks is that there is no way to reach good
conclusions unless the people running these benchmarks are *very*
skilled in testing, networking, high performance systems and so
forth.
Still, I have to admit that your errors and misconceptions here are
very naive, but in general even a better designed test is very hard
to conduct.

This is how things work in practice. We have two computers, REDIS and CLIENT:

REDIS <--------------- communication link -----------------> CLIENT

They talk using a communication link.

Let's assume that the communication link is slow, so 1 second is
required for a piece of information to travel from one computer to
the other. Let's also assume that both REDIS and CLIENT are infinitely
fast, and that even if the communication link has a big latency
(1 second) the bandwidth is infinite.

Then I try to measure the performance of Redis using code like this:

REPEAT 10000 times:
   SEND REQUEST
   READ RESPONSE

We need both to send the requests and read the responses as Redis uses
a request/reply protocol.

In the best conditions (link already established, just one packet
needed for request and reply) the time needed for a request/reply
protocol to complete a single query is 2*latency (that is, the round
trip time, RTT).

So if we measure the speed of Redis in these conditions it will show a
speed of 0.5 requests per second, or 30 requests per minute.

What happens if I use a faster link with 100 ms latency? Then the RTT
is 200 ms and the performance is 5 requests per second.
So just by changing the link the Redis performance improves? And why
on earth am I measuring 0% CPU usage on both the Redis and the client
side? Because I'm not measuring the Redis performance, but the round
trip time.

Or actually, a little bit of Redis performance interleaved with a
great deal of round trip time.
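
The arithmetic above can be written down as a minimal sketch. The
function name is made up for illustration, and the model keeps the same
assumptions as the text: infinitely fast endpoints, and exactly one
packet in each direction per request.

```python
def sequential_requests_per_second(one_way_latency_s: float) -> float:
    """Throughput of a blocking send-then-read loop on an idle link.

    Assumes infinitely fast client and server, so every request costs
    exactly one round trip (2 * one-way latency).
    """
    rtt = 2 * one_way_latency_s
    return 1 / rtt


# Same loop, same server, only the link changes.
for latency in (1.0, 0.1, 0.0001):
    rate = sequential_requests_per_second(latency)
    print(f"one-way latency {latency} s -> {rate:g} requests/second")
```

Nothing about Redis appears in the formula, which is the point: this
kind of loop measures the link.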

TESTING WITH MULTIPLE CLIENTS

Since I assumed infinite bandwidth, the obvious next step to avoid
the RTT problem is to use multiple clients, or multiple processes
(or threads, or non-blocking I/O, or whatever) in the same client, in
order to better saturate the link between the client and the server.

This is why Redis benchmarks use 50 parallel requests in order to
measure what is happening. This model is close to real-world practice
because usually there are N web server processes talking with a single
Redis server: if your web application has 50 clients connected at the
same time, the Redis<->Client link will actually get 50 requests in
parallel.
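
As a rough sketch of why parallel clients help, here is the same
idealized model extended to N blocking clients. The function name and
the server_capacity_rps parameter are hypothetical, introduced only to
show where the server itself finally becomes the limit.

```python
def parallel_requests_per_second(clients: int,
                                 one_way_latency_s: float,
                                 server_capacity_rps: float = float("inf")) -> float:
    """Aggregate throughput of N independent blocking clients.

    Each client completes one request per round trip, so N clients keep
    N requests in flight. The total cannot exceed what the server can
    process (server_capacity_rps, an assumed figure).
    """
    rtt = 2 * one_way_latency_s
    return min(clients / rtt, server_capacity_rps)


# One client versus fifty clients on the same 1-second-latency link.
print(parallel_requests_per_second(1, 1.0))
print(parallel_requests_per_second(50, 1.0))
# On a fast link the assumed server capacity becomes the bottleneck.
print(parallel_requests_per_second(50, 0.0001, server_capacity_rps=55000))
```

Only in the last case does the number say something about the server
rather than about the link.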

YES BUT I WANT TO GO FASTER WITH JUST 1 CLIENT!

Since Redis was designed for high performance from the start, it has
supported pipelining since the first releases.
Pipelining means that instead of:

CLIENT         -> request ->              SERVER
CLIENT         <-  reply   <-               SERVER

We send multiple requests without waiting to receive the replies, so
as not to pay the price of the RTT for every request:

CLIENT -> request -> request -> request -> SERVER
CLIENT <- reply   <- reply   <- reply   <- SERVER
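
To make the idea concrete, here is a small sketch that builds a
pipelined payload by hand. It assumes the multi-bulk request format of
the Redis protocol; the helper names and the key names are made up for
the example, and a real client library would normally do this for you.

```python
def encode_command(*args: str) -> bytes:
    """Encode one command in the Redis multi-bulk request format."""
    parts = [b"*%d\r\n" % len(args)]              # number of arguments
    for arg in args:
        data = arg.encode("utf-8")
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))  # length, then bytes
    return b"".join(parts)


def encode_pipeline(commands) -> bytes:
    """Concatenate several encoded commands into a single payload."""
    return b"".join(encode_command(*cmd) for cmd in commands)


# Three GETs packed into one buffer (hypothetical key names).
payload = encode_pipeline([("GET", "key:1"), ("GET", "key:2"), ("GET", "key:3")])
print(payload)
```

The whole payload can be written to the socket in one go, and the three
replies are then read back in the same order the requests were sent, so
the client pays for roughly one round trip instead of three.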

NON BLOCKING I/O

Another alternative is using non-blocking I/O. A single client may
open N connections to the server, and use a different connection with
non-blocking I/O for every request. This makes the programming harder,
as you need to switch to an event-driven programming model where
things don't happen sequentially; instead, callbacks are called when a
given reply reaches the client, and so forth.
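
The following is only a simulation of that model, not a Redis client:
fake_request stands in for one request/reply exchange on its own
connection, the latency value is an assumption, and Python's asyncio
coroutines are used here in place of explicit callbacks.

```python
import asyncio
import time

ONE_WAY_LATENCY_S = 0.05  # assumed link latency for the simulation


async def fake_request(key: str) -> str:
    """Stand-in for one request/reply exchange on its own connection."""
    await asyncio.sleep(2 * ONE_WAY_LATENCY_S)  # one simulated round trip
    return f"value-of-{key}"


async def fetch_all(keys):
    # All requests are in flight at once; the event loop resumes each
    # coroutine when its (simulated) reply arrives.
    return await asyncio.gather(*(fake_request(k) for k in keys))


def run_demo(n: int = 20):
    keys = [f"key:{i}" for i in range(n)]
    start = time.perf_counter()
    replies = asyncio.run(fetch_all(keys))
    return replies, time.perf_counter() - start


replies, elapsed = run_demo()
print(f"{len(replies)} replies in {elapsed:.2f} s")
```

A blocking loop over the same 20 keys would need 20 round trips; here
the waits overlap, so the total stays close to a single round trip.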

> The Apache -> Predis/Rediska -> Redis I/O roundtrip simply chokes, and
> i see no way of speeding it up.

There is no single networked DB without this limitation, as I explained above.

> - Pattern searching against keys on a large dataset is dreadful. Don't
> even bother.

If you are referring to "KEYS <pattern>", from the Keys man page:

Note that while the time complexity for this operation is O(n) the
constant times are pretty low. For example Redis running on an entry
level laptop can scan a 1 million keys database in 40 milliseconds.
Still it's better to consider this one of the slow commands that may
ruin the DB performance if not used with care.
In other words this command is intended only for debugging and special
operations like creating a script to change the DB schema. Don't use
it in your normal code. Use Redis Sets in order to group together a
subset of objects.

Please read the documentation in order to understand what commands are
supposed to do.
KEYS is a debugging command. If you understand the Redis data model
you can implement your indexes with very good performance.
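
A sketch of what such an index can look like. TinyStore is a toy
in-memory stand-in, not a Redis client; its sadd and smembers methods
only mimic the SADD and SMEMBERS commands, and the key naming scheme is
made up for the example.

```python
class TinyStore:
    """Toy in-memory stand-in for a store with string keys and sets."""

    def __init__(self):
        self.strings = {}
        self.sets = {}

    def set(self, key, value):
        self.strings[key] = value

    def get(self, key):
        return self.strings.get(key)

    def sadd(self, key, member):
        self.sets.setdefault(key, set()).add(member)

    def smembers(self, key):
        return set(self.sets.get(key, ()))


def add_user(store, user_id, name, country):
    store.set(f"user:{user_id}:name", name)
    # Maintain the index at write time instead of scanning keys later.
    store.sadd(f"users:by-country:{country}", user_id)


def names_in_country(store, country):
    ids = store.smembers(f"users:by-country:{country}")
    return sorted(store.get(f"user:{uid}:name") for uid in ids)


store = TinyStore()
add_user(store, 1, "marco", "it")
add_user(store, 2, "anna", "it")
add_user(store, 3, "terry", "ca")
print(names_in_country(store, "it"))
```

The lookup touches only the members of one set, so its cost depends on
the size of the result and not on the total number of keys.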

> As a comparison, I built some similar code using Rebol + Cheyenne (a
> Rebol web server).

> Some stats..
> - 1 million GETS against 100,000 keys in 0.906 seconds (142 times
> faster than my Rediska/Predis + Redis tests)
> - Pattern match against 10 million keys in 0.17 seconds

For sure here you are using threads, non-blocking I/O, or something
without networking in the middle.

Btw with a single client you should get more or less 10k
requests/second, so in any case, even if this comparison is completely
pointless, it is 10 (or 14) times faster, not 142.

> And blocks are native to Cheyenne.. there's no port I/O necessary..
> and it runs on Linux or Windows

So it is a library. This way you dropped the need for a client ->
db communication link.
But as you may have noticed, we are in the cloud era, and different
computers need to communicate, otherwise you are not going to scale.

> One thing Redis needs to do is tone down the hyper-bole on
> performance, and post some real world benchmarks.

Our benchmarks are fine, as discussed.

Regards,
Salvatore

--
Salvatore 'antirez' Sanfilippo
http://invece.org

"Once you have something that grows faster than education grows,
you’re always going to get a pop culture.", Alan Kay
