Redis DB Hyperbole


LFReD

May 14, 2010, 3:56:33 AM
to Redis DB
Hi,

After some exhausting testing, here's some conclusions..

- The advertised speed of Redis is not even close to real world
expectations. The I/O between Redis, and the PHP libraries (Rediska,
Predis) is a killer. The Redis benchmarks show 55,000 GETS/s, but in
reality it's more like 7700/s.. and that's against a mere 100,000
keys!

The Apache -> Predis/Rediska -> Redis I/O roundtrip simply chokes, and
i see no way of speeding it up.

- Pattern searching against keys on a large dataset is dreadful. Don't
even bother.

As a comparison, I built some similar code using Rebol + Cheyenne (a
Rebol web server).

Some stats..
- 1 million GETS against 100,000 keys in 0.906 seconds (142 times
faster than my Rediska/Predis + Redis tests)
- Pattern match against 10 million keys in 0.17 seconds

And blocks are native to Cheyenne.. there's no port I/O necessary..
and it runs on Linux or Windows

This doesn't have all the features of Redis, but for sheer
speed.. .REAL WORLD speed, it wins hands down.

One thing Redis needs to do is tone down the hyper-bole on
performance, and post some real world benchmarks.

"Another influence was Rebol. Rebol’s a more modern language, but with
some very similar ideas to Lisp, in that it’s all built upon a
representation of data which is then executable as programs. But it’s
a much richer thing syntactically. Rebol is a brilliant language, and
it’s a shame it’s not more popular, because it deserves to be." --
Douglas Crockford


Here's some links if anyone is interested. (I'm not affiliated with
either)

http://rebol.com
http://cheyenne-server.org/

--
You received this message because you are subscribed to the Google Groups "Redis DB" group.
To post to this group, send email to redi...@googlegroups.com.
To unsubscribe from this group, send email to redis-db+u...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/redis-db?hl=en.

Konstantin Merenkov

May 14, 2010, 4:45:14 AM
to redi...@googlegroups.com
Hello.
I won't say anything whether you are right or wrong, but!
Redis is fast enough for most real world tasks unless you are facebook
or google.

So I think nobody's heart will be broken :-)
--
Best Regards,
Konstantin Merenkov

Demis Bellot

May 14, 2010, 4:55:43 AM
to redi...@googlegroups.com
What version and build (x86/x64) of redis-server and what distro are you using?

What numbers did the redis-benchmark utility give you in your environment?
That will show you your theoretical upper limit, since it's highly optimized for maximum network I/O throughput.

Can you post the source code used for your tests? Then others can run it to see if they're getting the same numbers.

I find it interesting that you are getting better numbers with an alternate solution, since knowing how Redis works, it's hard to get any faster.
Are both servers being accessed over the network to the same server?

- Demis

Salvatore Sanfilippo

May 14, 2010, 4:55:59 AM
to redi...@googlegroups.com
On Fri, May 14, 2010 at 9:56 AM, LFReD <terryb...@gmail.com> wrote:
> Hi,
>
> After some exhausting testing, here's some conclusions..
>
> - The advertised speed of Redis is not even close to real world
> expectations. The I/O between Redis, and the PHP libraries (Rediska,
> Predis) is a killer. The Redis benchmarks show 55,000 GETS/s, but in
> reality it's more like 7700/s.. and that's against a mere 100,000
> keys!

hello Terry,

the problem with benchmarks is that there is no way to reach good
conclusions unless the people running these benchmarks are *very*
skilled in the fields of testing, networking, high performance
systems and so forth.
Still, I have to admit that your errors and misconceptions are very naive
here, but in general, even a better designed test is very hard to
conduct.

This is how things work in practice. We have two computers, REDIS and CLIENT:

REDIS <--------------- communication link -----------------> CLIENT

They talk using a communication link.

Let's assume that the communication link is slow, so 1 second is
required for a piece of information to travel from one computer to
the other. Let's also assume that both REDIS and CLIENT are infinitely
fast, and also that even if the communication link has a big latency
(1 second) the bandwidth is infinite.

Then I try to measure the performance of Redis using code like this:

REPEAT 10000 times:
    SEND REQUEST
    READ RESPONSE

We need both to send the requests and read the responses as Redis uses
a request/reply protocol.

In the best of the conditions (link already established, just one
packet needed for request and reply) the time needed for a
request/reply protocol to complete a single query is 2*latency (that
is, the round trip time, RTT).

So if we measure the speed of Redis in these conditions it will show a
speed of 0.5 requests per second, or 30 requests per minute.

What happens if I use a faster link with 100 ms latency? Then the round
trip takes 200 ms and the performance is 5 requests per second.
So just by changing the link, the Redis performance improves? And why on
earth am I measuring 0% CPU usage on both the Redis and the client side? Because
I'm not measuring the Redis performance, but the round trip time.

Or actually, a little bit of Redis performances interleaved with a
great deal of round trip time.
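
The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the idealized model (infinitely fast endpoints, infinite bandwidth, latency meaning the one-way delay); the function name `requests_per_second` is made up for this example and is not part of Redis or of any client library.

```python
def requests_per_second(one_way_latency_s):
    """Throughput of a strictly sequential request/reply loop.

    Assumes each request costs exactly one round trip, that is
    2 * one-way latency, and that nothing else takes any time.
    """
    rtt_s = 2 * one_way_latency_s
    return 1.0 / rtt_s


# 1 s one-way delay, then a 50 ms one-way delay (a 100 ms round trip).
for latency_s in (1.0, 0.05):
    rate = requests_per_second(latency_s)
    print(f"one-way latency {latency_s * 1000:6.0f} ms -> {rate:5.1f} requests/second")
```

In this model the server never appears in the formula at all, which is the point being made: a sequential loop measures the link.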

TESTING WITH MULTIPLE CLIENTS

Since I assumed infinite bandwidth, the obvious next step to avoid
the problem of the RTT is using multiple clients, or multiple processes
(or threads, or non blocking I/O, or whatever) in the same client, in
order to better saturate the link between the client and
server.

This is why Redis benchmarks use 50 parallel requests in order to
measure what is happening. This model is close to practice
because usually there are N web server processes talking with a single
Redis server: if your web application has 50 clients connected at the same
time, the Redis<->Client link will actually get 50 requests in
parallel.
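
Under the same idealized assumptions (infinite bandwidth, infinitely fast server), the effect of parallel clients is a simple multiplication. The sketch below is illustrative only; `aggregate_requests_per_second` is a hypothetical helper, and a real server will hit CPU or bandwidth limits long before this upper bound.

```python
def aggregate_requests_per_second(clients, one_way_latency_s):
    """Upper bound on total throughput when every client runs its own
    sequential request/reply loop over the same link.

    Assumes the clients do not slow each other down at all.
    """
    rtt_s = 2 * one_way_latency_s
    return clients / rtt_s


single = aggregate_requests_per_second(1, 1.0)
fifty = aggregate_requests_per_second(50, 1.0)
print(f"1 client: {single} requests/second, 50 clients: {fifty} requests/second")
```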

YES BUT I WANT TO GO FASTER WITH JUST 1 CLIENT!

Since Redis was designed for high performance from the start, it
supported pipelining since the first releases.
Pipelining means that instead of:

CLIENT -> request -> SERVER
CLIENT <- reply <- SERVER

We send multiple requests without waiting to receive the replies, in
order not to pay the price of the RTT for each one:

CLIENT -> request -> request -> request -> SERVER
CLIENT <- reply <- reply <- reply <- SERVER
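
To make the saving concrete, here is a small simulation that counts round trips instead of touching the network. Everything in it (`SimulatedLink`, `one_at_a_time`, `pipelined`, the batch size of 100) is invented for illustration; it is not how any particular Redis client implements pipelining.

```python
class SimulatedLink:
    """Stands in for a client/server connection and counts round trips."""

    def __init__(self, rtt_s):
        self.rtt_s = rtt_s
        self.round_trips = 0

    def exchange(self, requests):
        """Send a batch of requests, then read all the replies.

        However many requests are in the batch, this costs one round trip.
        """
        self.round_trips += 1
        return [f"reply to {request}" for request in requests]

    @property
    def elapsed_s(self):
        return self.round_trips * self.rtt_s


def one_at_a_time(link, requests):
    """Classic loop: send one request, wait for its reply, repeat."""
    replies = []
    for request in requests:
        replies.extend(link.exchange([request]))
    return replies


def pipelined(link, requests, batch_size):
    """Send requests in batches without waiting between them."""
    replies = []
    for start in range(0, len(requests), batch_size):
        replies.extend(link.exchange(requests[start:start + batch_size]))
    return replies


requests = [f"GET key:{i}" for i in range(1000)]
slow_link = SimulatedLink(rtt_s=0.002)
fast_link = SimulatedLink(rtt_s=0.002)
slow_replies = one_at_a_time(slow_link, requests)
fast_replies = pipelined(fast_link, requests, batch_size=100)
print("round trips without pipelining:", slow_link.round_trips)
print("round trips with pipelining:   ", fast_link.round_trips)
```

Both runs return the same replies in the same order; only the number of times the client waits on the link changes.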

NON BLOCKING I/O

Another alternative is using non blocking I/O. A single client may
open N connections to the server, and use a different connection
with non-blocking I/O for every request. This makes the programming
harder, as you need to switch to an event-driven programming
model where things don't happen sequentially, but callbacks are
called when a given reply reaches the client and so forth.
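
A rough sketch of the same idea with Python's asyncio follows. It uses coroutines rather than explicit callbacks, and `asyncio.sleep` merely stands in for the time a reply spends on the wire, so no real connections are opened. The names `fake_request`, `fetch_all` and `timed_fetch` are made up for this example.

```python
import asyncio
import time


async def fake_request(i, rtt_s):
    """Pretend to send request i and wait one round trip for the reply."""
    await asyncio.sleep(rtt_s)  # placeholder for waiting on the network
    return f"reply {i}"


async def fetch_all(n, rtt_s):
    """Start n requests at once; the event loop overlaps their waiting."""
    return await asyncio.gather(*(fake_request(i, rtt_s) for i in range(n)))


def timed_fetch(n, rtt_s):
    start = time.perf_counter()
    replies = asyncio.run(fetch_all(n, rtt_s))
    return replies, time.perf_counter() - start


replies, elapsed = timed_fetch(20, 0.05)
print(f"{len(replies)} replies in {elapsed:.3f} s "
      f"(a sequential loop would need about {20 * 0.05:.1f} s)")
```

The replies still come back in request order because `asyncio.gather` preserves the order of its arguments, even though the waits overlap.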

> The Apache -> Predis/Rediska -> Redis I/O roundtrip simply chokes, and
> i see no way of speeding it up.

There is no single networked DB without this limitation, as I explained above.

> - Pattern searching against keys on a large dataset is dreadful. Don't
> even bother.

If you are referring to "KEYS <pattern>", from the Keys man page:

Note that while the time complexity for this operation is O(n) the
constant times are pretty low. For example Redis running on an entry
level laptop can scan a 1 million keys database in 40 milliseconds.
Still it's better to consider this one of the slow commands that may
ruin the DB performance if not used with care.
In other words this command is intended only for debugging and special
operations like creating a script to change the DB schema. Don't use
it in your normal code. Use Redis Sets in order to group together a
subset of objects.

Please read the documentation in order to understand what commands are
supposed to do.
KEYS is a debugging command. If you understand the Redis data model
you can implement your indexes with very good performance.
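
As an illustration of the difference between scanning with a pattern and maintaining an index, here is a toy in-memory model. `TinyStore` is hypothetical and only mimics the shape of SET, SADD, SMEMBERS and KEYS; the key names are invented, and `fnmatch` is only an approximation of Redis glob patterns.

```python
import fnmatch


class TinyStore:
    """A toy stand-in for a key/value server with string and set values."""

    def __init__(self):
        self.values = {}
        self.sets = {}

    def set(self, key, value):
        self.values[key] = value

    def sadd(self, key, member):
        self.sets.setdefault(key, set()).add(member)

    def smembers(self, key):
        return set(self.sets.get(key, ()))

    def keys(self, pattern):
        # O(n): every key in the store is tested against the pattern.
        return [k for k in self.values if fnmatch.fnmatchcase(k, pattern)]


store = TinyStore()
for user_id, state in [(1, "texas"), (2, "ohio"), (3, "texas")]:
    store.set(f"user:{user_id}:state", state)
    store.sadd(f"state:{state}", user_id)  # index maintained at write time

# Scan approach: walk all matching keys, then inspect each value.
scanned = sorted(k for k in store.keys("user:*:state")
                 if store.values[k] == "texas")

# Index approach: one lookup of a set that already holds the answer.
indexed = sorted(store.smembers("state:texas"))

print(scanned)
print(indexed)
```

The index costs one extra write per user but turns the query into a single set read, whatever the size of the key space.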

> As a comparison, I built some similar code using Rebol + Cheyenne (a
> Rebol web server).
>
> Some stats..
> - 1 million GETS against 100,000 keys in 0.906 seconds (142 times
> faster than my Rediska/Predis + Redis tests)
> - Pattern match against 10 million keys in 0.17 seconds

For sure here you are using threads, non blocking I/O or something
without networking in the middle.

Btw with a single client you should get more or less 10k
requests/second, so anyway, even if this comparison is completely pointless, it is 10
(or 14) times faster, not 142.

> And blocks are native to Cheyenne.. there's no port I/O necessary..
> and it runs on Linux or Windows

So it is a library. This way you dropped the need for a client ->
db communication link.
But if you already noticed we are in the cloud era, and different
computers need to communicate otherwise you are not going to scale.

> One thing Redis needs to do is tone down the hyper-bole on
> performance, and post some real world benchmarks.

Our benchmarks are fine, as discussed.

Regards,
Salvatore

--
Salvatore 'antirez' Sanfilippo
http://invece.org

"Once you have something that grows faster than education grows,
you’re always going to get a pop culture.", Alan Kay

Ericson Smith

May 14, 2010, 7:58:47 AM
to redi...@googlegroups.com
It's 7:57am here, and Salvatore's response is going to be THE best email I'll read all day.

Regards
- Ericson Smith
CTO Funadvice.com

Daniele Alessandri

May 14, 2010, 7:59:10 AM
to redi...@googlegroups.com
On Fri, May 14, 2010 at 09:56, LFReD <terryb...@gmail.com> wrote:

Hi LFReD,

> - The advertised speed of Redis is not even close to real world
> expectations. The I/O between Redis, and the PHP libraries (Rediska,
> Predis) is a killer. The Redis benchmarks show 55,000 GETS/s, but in
> reality it's more like 7700/s.. and that's against a mere 100,000
> keys!

I think that there's nothing to add to Salvatore's reply, but given
that you have tried to use Predis for your tests I would like to point
you to a nice benchmark that Chris did a few months ago by simulating
various scenarios, and using the first stable release of the library:

http://www.chrisstreeter.com/archive/2010/01/434/benchmarking-redis-and-predis

Cheers,
Daniele

--
Daniele Alessandri
http://clorophilla.net/
http://twitter.com/JoL1hAHN

Demis Bellot

May 14, 2010, 8:01:35 AM
to redi...@googlegroups.com
Agreed, just wish he had posted it a little sooner - would've saved me an email :)

- Demis

LFReD

May 14, 2010, 1:52:45 PM
to Redis DB
"I won't say anything whether you are right or wrong, but!
Redis is fast enough for most real world tasks unless you are facebook
or google. " -- Konstantin

I agree. As long as the task is kept simple.

As I mentioned in an earlier post "Real world (PHP + Rediska) speed vs
Redis Benchmark ", I GET around 7200 values/sec against 121,733 keys.
It's not a benchmark, it's a result. Yet the Redis benchmark utility
says 55,000 values/sec.
That may very well be total throughput using 50 clients etc., but this
is where I feel the hyperbole kicking in.

If all one is doing is pulling values of keys, then Redis is fine. But
in my world, I need to process the data.

I'm looking to use a NoSQL DB for semantic networking against large
datasets. My queries look like;
"Get all the people in group A (a SET with 100,000 values) that live
in Texas, listen to "The Killers", are under 35 and drive SUVs"

I said "The Apache -> Predis/Rediska -> Redis I/O roundtrip simply
chokes, and i see no way of speeding it up.",
and you replied "There is no single networked DB without this
limitation, as I explained above."

And therein lies the rub. With the Rebol experiments, I don't have the
network limitation. Blocks (Rebol's name for SETs), are native to the
Cheyenne webserver.
Cheyenne is non-blocking, event driven as well.

>YES BUT I WANT TO GO FASTER WITH JUST 1 CLIENT!

>Since Redis was designed for high performance from the start, it
>supported pipelining since the first releases.

Pipelining isn't always an option. Often I need to pull data, process,
take the results, pull more data, process, ad infinitum (well, not
quite, unless I divide by 0 somewhere :)
Pipelining is useful, but only in certain circumstances. Just not in
my world.

In contrast, I can crawl a 1 million key SET in Rebol once, and
parallel process at the same time,
i.e. look at each key/value and push it into arrays with:
a) user(n):location = texas
b) user(n):age < 35
c) user(n):vehicle = "some SUV"
etc.
Then intersect the arrays. In my "real world" test against 1 million
keys, that takes 0.263 seconds for any number of properties.

It's possible to "pick" by index against a Rebol SET so

(Rebol code)
n: ["one" "two" "three" ... "1,000,000"]
result: pick n 897694
>> "897,694"

I can loop this 1 million times in 0.389 seconds. So retrieving values
by index has phenomenal performance. Saving values as an index
position seems beneficial.
Now, you can say what you want, but these RESULTS over Predis/Rediska
+ Redis make Rebol the clear winner.
Rebol using pick: 2,570,694/s
Predis/Rediska + Redis: 7200/s

Salvatore said, "For sure here you are using threads, non blocking I/O
or something without networking in the middle."

Yes, yes I am.

You said "But if you already noticed we are in the cloud era, and
different computers need to communicate otherwise you are not going to
scale."

Exactly, and unless you're simply pushing keys and values around, ms
per client is crucial. In other words, if all you're doing is sending
values back as result of a request, you're ok with Redis.
But if you want to actually DO something with the data, then frankly,
it's not cutting it in my world.

Rebol has some additional abilities, like searching against values, and
Rebol code runs on Linux, MacOS, BSD and Windows with no alteration,
and more.
Don't take my word for it, check it out.



Jeremy Zawodny

May 14, 2010, 2:06:56 PM
to redi...@googlegroups.com
On Fri, May 14, 2010 at 10:52 AM, LFReD <terryb...@gmail.com> wrote:
> "I won't say anything whether you are right or wrong, but!
> Redis is fast enough for most real world tasks unless you are facebook
> or google. " -- Konstantin
>
> I agree. As long as the task is kept simple.
>
> As I mentioned in an earlier post "Real world (PHP + Rediska) speed vs
> Redis Benchmark ", I GET around 7200 values/sec against 121,733 keys.
> It's not a benchmark, it's a result. Yet the Redis benchmark utility
> says 55,000 values/sec.
> That may very well be total throughput using 50 clients etc., but this
> is where I feel the hyperbole kicking in.

What hyperbole?
 
> If all one is doing is pulling values of keys, then Redis is fine. But
> in my world, I need to process the data.

How is that a Redis limitation?
 
> I'm looking to use a NoSQL DB for semantic networking against large
> datasets. My queries look like;
> "Get all the people in group A (a SET with 100,000 values) that live
> in Texas, listen to "The Killers", are under 35 and drive SUVs"
>
> I said "The Apache -> Predis/Rediska -> Redis I/O roundtrip simply
> chokes, and i see no way of speeding it up.",
> and you replied "There is no single networked DB without this
> limitation, as I explained above."
>
> And therein lies the rub. With the Rebol experiments, I don't have the
> network limitation. Blocks (Rebol's name for SETs), are native to the
> Cheyenne webserver.
> Cheyenne is non-blocking, event driven as well.

So you're comparing apples and oranges.  The first example is a blocking, non-event-driven client (Predis/Rediska) running under Apache talking to Redis.  The second example is a web server written in Rebol, using a non-blocking and event-driven architecture *and* keeping all data in-process (as far as I can tell).

What are you trying to prove by doing this?  OF COURSE that's going to be faster.

> I can loop this 1 million times in 0.389 seconds. So retrieving values
> by index has phenomenal performance. Saving values as an index
> position seems beneficial.
> Now, you can say what you want, but these RESULTS over Predis/Rediska
> + Redis makes Rebol the clear winner.
> Rebol using pick: 2,570,694/s
> Predis/Rediska + Redis: 7200/s

Bingo.  Your data proves the point nicely.

Again, it sounds like you're trying to say that Redis doesn't fit the way you want to work.  That's fine.  But I'm not sure why you're making that sound like a flaw in Redis.  It was designed for different circumstances.

Jeremy

Brad Webb

May 14, 2010, 2:49:51 PM
to Redis DB


On May 14, 11:06 am, Jeremy Zawodny <Jer...@Zawodny.com> wrote:
> On Fri, May 14, 2010 at 10:52 AM, LFReD <terrybrown...@gmail.com> wrote:
> > "I won't say anything whether you are right or wrong, but!
> > Redis is fast enough for most real world tasks unless you are facebook
> > or google. " -- Konstantin
>
> > I agree. As long as the task is kept simple.
>
> > As I mentioned in an earlier post "Real world (PHP + Rediska) speed vs
> > Redis Benchmark ", I GET around 7200 values/sec against 121,733 keys.
> > It's not a benchmark, it's a result. Yet the Redis benchmark utility
> > says 55,000 values/sec.
> > That may very well be total throughput using 50 clients etc., but this
> > is where I feel the hyperbole kicking in.
>
> What hyperbole?
>
> > If all one is doing is pulling values of keys, then Redis is fine. But
> > in my world, I need to process the data.
>
> How is that a Redis limitation?

I think the issue is that some folks are looking at redis as a full-
blown NoSQL db, rather than the tasks that redis is *GREAT* at
(replacing memcache, using in place of mqueue, etc) -- I haven't seen
anyone "officially" suggesting the replacement of a primary database
for something that primary databases are much better at (i.e. munging
data, running analysis, etc) -- Unfortunately, I think this thread
points out that PERHAPS the messaging needs to be adjusted to make it
more clear what the goals and best targets for redis use are.

To be clear, I think the presumed "hyperbole" in this case is a
misunderstanding of targeted implementation mixed with unrealistic
expectations of the implemented library tested. In what "real" world
is a client-facing application pulling back thousands of queried lists
or keys? If you explained that as a programming case, I'd personally
say you're doing it wrong.

** snip **

Marc Byrd

May 14, 2010, 3:27:47 PM
to redi...@googlegroups.com
I think the original poster's motivation came through.  I wouldn't expect such a person to do the thoughtful work it sometimes takes in redis to make it sing.  My application started out taking 30s, got it down to sub 1 ms.  

Agree that the flaws in the testing include sync instead of event-driven access, among others.

However, the original poster did mention sets, intersections, etc.  Certainly redis should shine there.   And given that redis is doing this intersection in its own memory space, it shouldn't matter that the competing app is also doing it in its memory space.  

Wonder if it might be interesting to have a bake-off, start with a well-specified problem definition, initial data, etc.  I'd pitch in (as long as I can use redis-rb).  Then we can run these on the same hardware, etc., get a good apples to apples comparison.

Cheers,


m

Demis Bellot

May 14, 2010, 4:56:03 PM
to redi...@googlegroups.com
>>My queries look like;
>>"Get all the people in group A (a SET with 100,000 values) that live
>>in Texas, listen to "The Killers", are under 35 and drive SUVs"

It sounds to me that you haven't properly indexed your data for fast access and are blaming Redis for your chatty solution. 
I hope you realize that this is not a Redis problem but a network I/O throughput and access problem, i.e. even if Redis were an echo server that took 0 time to process your request, the overall results wouldn't be much better.
Redis is not going to get any faster to suit your problem-set, if you want faster results you need to structure your data so the results can be achieved with the minimum number of network calls.
Providing results comparing using a library against a network server is not going to give your numbers much credibility.

As Salvatore explained the benchmarks themselves are fine because they are modelled around real-world use where your server processes multiple concurrent requests.

I'm not aware of how your data is structured or what steps you've made to index your data (as it sounds like you're doing all the processing on the client)

But if I needed to satisfy the above request I might look at maintaining something like:

group:A => SET[user-ids]
country:Texas => SET[user-ids]
tag:SUV => SET[user-ids]

You should then have a daily task that goes through your entire dataset and recalculates the following indexes:
age:34 => SET[user-ids]
age:35 => SET[user-ids]
age:36 => SET[user-ids], etc

Now because there might not be a 1:1 relation between a user and an artist, you have to make a call as to what granularity you need to maintain:
If 'listens to Artist' is a common query then you may want to maintain this list directly against a user, e.g. 

previews:artists:{the-killers-artist-id} => SET[user-ids]

Otherwise you can keep it more structured and use the below 2 sets to work out 'Artists a user has listened to' e.g.

artist:{the-killers-artist-id}:tracks => SET[track-ids]
previews:tracks:{track-id} => SET[user-ids]

If you lay out your data in something like the above then you can use Redis's server-side set operations to process your query in a much less chatty and more timely way.
The other option is of course to keep the entire dataset in memory and all the processing done on the same server where all your data is so you never have to cross a network boundary to load your data.
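
As a sketch of how such a layout answers the original query, the example below models the index sets as plain Python sets. The user ids, the `previews:artists:the-killers` key (a placeholder for the real artist id) and the helper names are all invented; on a real server the equivalent work would be done by the SUNION and SINTER commands, so only the final list of ids crosses the network.

```python
# Toy index in the layout suggested above; members are made-up user ids.
index = {
    "group:A": {1, 2, 3, 4, 5},
    "country:Texas": {1, 2, 3, 5, 8},
    "tag:SUV": {1, 3, 5, 9},
    "previews:artists:the-killers": {1, 3, 5},
    "age:29": {3},
    "age:34": {7},
    "age:36": {5},
}


def sunion(keys):
    """Union of the named sets; missing keys count as empty sets."""
    return set().union(*(index.get(key, set()) for key in keys))


def sinter(*member_sets):
    """Intersection of one or more sets."""
    first, *rest = member_sets
    return set(first).intersection(*rest)


# "Under 35" becomes the union of the per-age sets age:0 .. age:34.
under_35 = sunion(f"age:{age}" for age in range(35))

result = sinter(
    index["group:A"],
    index["country:Texas"],
    index["previews:artists:the-killers"],
    index["tag:SUV"],
    under_35,
)
print(sorted(result))
```

User 5 matches every tag but is 36, user 7 is 34 but is in none of the other sets, and user 1 has no age entry at all, so only user 3 survives the intersection.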

- Demis

LFReD

May 15, 2010, 3:37:10 AM
to Redis DB
You know, I'm sitting here pulling 1,000,000 values from a Rebol db
with 10,000,000 key/value pairs in 0.7 seconds on a mediocre machine
with 2gb, and what do i get?
"It's the I/O, don't blame Redis" and other fanboy comments.

I don't need to structure my data as Demis suggested. Why all of the
complexity? A simple triple-store works fine.
And I can do anything a standard MySQL db can do, and do it 138 times
faster than Redis + PHP libs.

Because I say Redis is slow (relatively speaking) you get all
defensive? My original posts were trying to figure out how to speed it
up.
I mentioned the problem was I/O right from the start. How do you make
that go away?
You can't. So time to move on.

Salvatore Sanfilippo

May 15, 2010, 4:21:16 AM
to redi...@googlegroups.com
On Sat, May 15, 2010 at 9:37 AM, LFReD <terryb...@gmail.com> wrote:
> You know, I'm sitting here pulling 1,000,000 values from a Rebol db
> with 10,000,000 key/value pairs in 0.7 seconds on a mediocre machine
> with 2gb, and what do i get?
> "It's the I/O, don't blame Redis" and other fanboy comments.

I'm incrementing a C counter much faster... if you are analyzing data
you should switch to C programs dealing with memory: this is the
future.

Cheers,
Salvatore

--
Salvatore 'antirez' Sanfilippo
http://invece.org

"Once you have something that grows faster than education grows,
you’re always going to get a pop culture.", Alan Kay

Daniele Alessandri

May 15, 2010, 4:22:28 AM
to redi...@googlegroups.com
On Sat, May 15, 2010 at 09:37, LFReD <terryb...@gmail.com> wrote:

LFReD,

I am sure that nobody here is trying to impose Redis as a solution for
every kind of scenario, but everyone was trying to highlight how your
comparison doesn't quite hold up from the start due to its
apples-and-oranges nature.

Having said that, I wish you good luck with your solution if it's best
suited for you and your particular application. I am sure that the
rest of the developers and companies here will continue to
successfully leverage Redis for what it was conceived for and what it has to
offer, since it probably represents the best-suited solution for their
own applications and works as expected... or I guess they wouldn't be
using it.

Time to move on, right?

Best Regards,

Aníbal Rojas

May 15, 2010, 7:35:04 AM
to redi...@googlegroups.com
Incrementing counter in C? Nah. This thread will help you to open your
mind Salvatore, _this_ is the future:
http://stackoverflow.com/questions/1542006/can-i-atomically-increment-a-16-bit-counter-on-x86-x86-64

--
Aníbal Rojas
Ruby on Rails Web Developer
http://www.google.com/profiles/anibalrojas

Alvaro Videla

May 15, 2010, 8:49:19 AM
to redi...@googlegroups.com
No Aníbal,

This is THE arithmetic question: http://www.doxdesk.com/img/updates/20091116-so-large.gif ;)

Regards,

Alvaro

Alvaro Videla

May 15, 2010, 8:51:11 AM
to redi...@googlegroups.com
Maybe you can write a rebol client for redis XD

Demis Bellot

May 15, 2010, 8:52:30 AM
to redi...@googlegroups.com
LOMFG that's just beautiful, best thing I've seen all day :)

Who would've thought jQuery harnessed all this power!

Sergi

May 15, 2010, 10:21:05 AM
to Redis DB
OMG! That's for real? XDD

Regarding the discussion, I think Jeremy's answer goes directly to
the point:

"So you're comparing apples and oranges. The first example is a
blocking, non-event-driven client (Predis/Rediska) running under Apache
talking to Redis. The second example is a web server written in Rebol,
using a non-blocking and event-driven architecture *and* keeping all
data in-process (as far as I can tell)."

S.


Robey Pointer

May 26, 2010, 6:50:02 PM
to redi...@googlegroups.com
Sorry to hijack this thread with real data, but I've been doing benchmarking of redis too.

On my 2 year old macbook pro laptop, using jredis in pipeline mode, and redis using the append-only log with no fsync, I get about:

Pushed 10000 items in 202 msec.

or a bit under 50k/sec. On a beefy test machine this morning, I actually got:

Pushed 100000 items in 704 msec.

or well over 100k/sec. Test code is in scala below. I didn't put any particular amount of effort into this code, so that's strong praise for the jredis library.

Using pipelining is KEY to getting results this good, by the way.

robey

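// `buffer` (presumably a java.nio.ByteBuffer), `pushes`, `pipelineSize` and
// `totalKeys` are defined elsewhere in the test harness and were not posted,
// as were the imports for `mutable`, `Future` and the JRedis types.
def writeABunchPipelined(redis: JRedisPipeline) {
  var currentKey = 0
  val statusId = 1L
  // Futures for requests already sent whose replies have not been read yet.
  val pipeline = new mutable.ListBuffer[Future[ResponseStatus]]

  buffer.clear()
  buffer.putLong(statusId)
  buffer.putLong(0)

  val startTime = System.currentTimeMillis
  var i = 0
  while (i < pushes) {
    val key = "data:" + currentKey
    pipeline += redis.rpush(key, buffer.array)
    // Bound the number of requests in flight: wait for the oldest reply.
    if (pipeline.size > pipelineSize) {
      pipeline.remove(0).get()
    }
    i += 1
    currentKey = (currentKey + 1) % totalKeys
  }
  // Drain the replies that are still outstanding.
  while (pipeline.size > 0) {
    pipeline.remove(0).get()
  }
  val endTime = System.currentTimeMillis
  println("Pushed %d items in %d msec.".format(pushes, endTime - startTime))
}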

Salvatore Sanfilippo

May 27, 2010, 4:45:20 AM
to redi...@googlegroups.com
On Thu, May 27, 2010 at 12:50 AM, Robey Pointer <robeyp...@gmail.com> wrote:
> Sorry to hijack this thread with real data, but I've been doing benchmarking of redis too.

Very nice to see a counter example ;)

Thanks for sharing,

LFReD

May 28, 2010, 3:00:06 PM
to Redis DB
I suppose it's only fair to back up my point. I'm developing a Rebol
based key/value store with features similar to Redis as a comparison.

At the moment it's pulling 1,000,000 GETS in 0.64 seconds, from a
store with 1,000,000 key/value pairs. Now that's without any I/O, but
I don't need that feature. I'll open a port and see how well it
performs with I/O.

Stay tuned.

Joubin Houshyar

May 28, 2010, 5:04:43 PM
to Redis DB
Hi Robey,

Thank you for sharing.

Note that the send side of (JRedis) pipelining blocks on the sends, so
you are actually paying for socket write latencies on the pushes.

I found the client wasn't quite stable with both send and receive
queued. Blocking on send obviates the need to size the send queue and
provides a natural 'governor' for the flow rate. However, when it is
stable, the numbers are an order of magnitude better on send side.
Also, at least on OS X, socket writes do not complain on broken
connections. (Naturally the actual RTT remains constant.)

However, if you don't mind spec'ing a bounded (send) queue and
watching for insert fails on that end, then you may want to hack the
pipeline code and queue the sends as well. That's the full potential.
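
The trade-off described here (blocking on send as a natural governor, versus a bounded send queue whose inserts can fail) can be sketched with Python's standard `queue` module. This is not JRedis code; `try_enqueue`, the queue bound of 2, and the command tuples are hypothetical and chosen only to make the failure visible.

```python
import queue


def try_enqueue(send_queue, command):
    """Queue a command for a (hypothetical) sender thread without blocking.

    Returns False when the bounded queue is full, so the caller can back
    off, drop, or retry instead of stalling.
    """
    try:
        send_queue.put_nowait(command)
        return True
    except queue.Full:
        return False


send_queue = queue.Queue(maxsize=2)  # the bound is an arbitrary example value
results = [
    try_enqueue(send_queue, ("RPUSH", "data:%d" % i, b"x")) for i in range(3)
]
print(results)  # the third insert fails because the queue is already full
```

Calling `send_queue.put(command)` instead would block the producer until there is room, which corresponds to the "blocking on send" behaviour that needs no queue sizing.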

/R

Josiah Carlson

unread,
May 28, 2010, 5:05:24 PM5/28/10
to redi...@googlegroups.com
Your point is lost on me.

Because you are doing in-process access, it is (as stated previously)
apples and oranges (redis talks across the network). You can use C++
and the STL to do tens of millions of gets/second from an in-process
hash on a modern machine. Heck, I wrote my own hash table 6 years ago
(using C), ran it on a 1.2 GHz mobile processor, and was able to do 5+
million gets/second. That sort of makes your Rebol-based solution
seem slow now, doesn't it?

[de]serialization, network latency (even over a local socket, pipe, or
unix domain socket), and data copies are going to reduce performance
significantly.
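
To make that overhead concrete, here is a rough Python sketch of what a networked client does for one GET beyond the hash lookup itself. It assumes Redis's multi-bulk request format and bulk reply format; `encode_command`, `parse_bulk_reply`, and the `cache` dict are names invented for this illustration, and the socket round trip itself is left out.

```python
def encode_command(*args):
    """Serialize a command in Redis's multi-bulk request format."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg if isinstance(arg, bytes) else str(arg).encode("utf-8")
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


def parse_bulk_reply(raw):
    """Parse one bulk reply; a length of -1 means the key does not exist."""
    header, _, rest = raw.partition(b"\r\n")
    if not header.startswith(b"$"):
        raise ValueError("not a bulk reply")
    length = int(header[1:])
    if length < 0:
        return None
    return rest[:length]


cache = {"foo": b"bar"}

in_process = cache["foo"]                    # one hash lookup, no copies
request = encode_command("GET", "foo")       # bytes a client must build and send
reply = parse_bulk_reply(b"$3\r\nbar\r\n")   # bytes a client must read and decode
```

Both paths end up with the same value; the networked one additionally pays for encoding, two data copies, and a round trip on every call.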

Good luck,
- Josiah

P.S. Incidentally, common use-cases of Google's AppEngine include
memcache copies of model objects. While this would seem like a good
idea generally (memcaching is obviously faster!), it turns out that
the de-serialization of the models could take longer than fetching the
models directly from the data store. I don't know if that is still
the case, but I remember learning that lesson the hard way about a
year ago (removing memcache gave me a 2-3x reduction in overall
request latency).

LFReD

unread,
May 29, 2010, 3:41:02 AM5/29/10
to Redis DB
As I've mentioned, if you're using Redis for simply pulling values of
keys, then fine. And everything depends on the usage.
I'm running an inference engine against millions of keys.. pulling
properties, filtering, grouping, then pulling more properties etc.
Some of these more complicated queries using MySQL can take 10 or 20
seconds. But at 7700 GETS/s using say Predis, it's worse than MySQL

Now some here have said that this kind of data crunching is not what
Redis is for. From the performance I've seen, that's a correct
statement.

If I/O is the problem, then Redis has a problem..
Client <- io -> (PHP, RUBY, NODE.JS) <- io -> Redis <- io (optional) -> DB

I'm putting together an example that connects clientside JS -> Rocket
(the combined http server and key / value store) via websocket.
Initial results look promising.

Roundtrip from JS -> Rocket (send message, pull 100,000 values from
1,000,000 key/values, display time on page): 0.07499980926513672
Time to pull the 100,000 values: 0.069

Same thing with Predis and Redis.. 12 seconds.

The other thing mentioned was how data should be structured in Redis
for max performance. Shouldn't this be automatic? NoSQL that requires
complicated schema of sorts is just SQL in a new dress.

If by "apples and oranges" you mean one way to generate the exact same
goal is faster than the other, then yeah.

- Terry

P.S. I found the same results with memcache.. it actually slowed my
queries a bit. I think you need the volume before seeing the benefit.

Josiah Carlson

unread,
May 29, 2010, 3:47:44 PM5/29/10
to redi...@googlegroups.com
Now we know you're comparing apples and oranges :P . There is no way
that JavaScript can pull 100k values across an xmlhttp connection
without doing a bulk request... which is exactly the equivalent of
MGET and/or pipelining in Redis. You should re-run your Redis
benchmark using those; that's how Robey was able to do his
100k+ requests/second, and it's how the benchmark command is able to
do 55k requests/second on your machine.
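
As a rough illustration of the bulk-request point, this Python sketch groups keys into MGET commands and counts round trips. `mget_batches`, the `data:` key names, and the batch size of 1000 are assumptions for the example; nothing is actually sent to a server.

```python
def mget_batches(keys, batch_size):
    """Yield one MGET command (as a list of arguments) per batch of keys."""
    for start in range(0, len(keys), batch_size):
        yield ["MGET"] + keys[start:start + batch_size]


keys = ["data:%d" % i for i in range(100_000)]

one_get_per_key = len(keys)                           # round trips without batching
batched = list(mget_batches(keys, batch_size=1000))   # batch size is an arbitrary choice
print(one_get_per_key, "round trips vs", len(batched))
```

Each batch costs one network round trip regardless of how many keys it carries, which is why a loop of single GETs and a bulk fetch are not comparable measurements.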

Are you sure you're counting the full round-trip time? And are you
sure you are measuring the actual display time?

Regardless, if you have a (javascript) client that needs to make an
http request directly to a cache/datastore, unless you added the
functionality to redis (not necessarily the easiest thing in the
world), that's just one more hop, and it is unlikely that redis
addresses your use-case directly.

- Josiah

paulino huerta

unread,
May 29, 2010, 4:18:55 PM5/29/10
to redi...@googlegroups.com


2010/5/29 LFReD <terryb...@gmail.com>

> The other thing mentioned was how data should be structured in Redis
> for max performance. Shouldn't this be automatic? NoSQL that requires
> complicated schema of sorts is just SQL in a new dress.

This is also a possibility.
Where is the sin?
NoSQL does not mean that there is no recommended way or style of doing a given task to get better performance. That has always been the case across programming. Surely someone more skilled than I am could improve a piece of C code that I wrote, don't you think?
Structuring data is no different.
I expect that before long we will have a book devoted to data modeling with Redis.
What is the problem with suggesting, say, that instead of 20 variables you use an array of 20 elements?

Demis Bellot

unread,
May 29, 2010, 4:28:43 PM5/29/10
to redi...@googlegroups.com
Seriously, is this thread still going on?

We are comparing an in-process solution to one that goes through a network layer, of course the in-process one is going to be faster, end of story.
Unless someone submits benchmarks showing the Rocket/Rebol solution extended to work over a network so we can actually make some valid comparisons, let this thread be over.

- Demis

Joubin Houshyar

unread,
May 29, 2010, 5:06:45 PM5/29/10
to Redis DB
I thought we were having fun and being ironic.

Anyone who has written the most trivial server knows that the
networking latencies are orders of magnitude higher than anything else
(outside of other types of device I/O) that goes on in the server. They
dwarf everything else. I have explicitly written a Java server implementing GET
and PING and the results are indistinguishable from Redis.

Stability, load management, rich semantics, a simple but effective
protocol, extremely versatile (count the sweet spot use cases), and
blazing performance. <<< That is why there is hype.

By far *the most painless* persistence solution I have ever
encountered.
Not to mention that it is a very fun piece of software.

/R


LFReD

unread,
May 30, 2010, 3:17:45 AM5/30/10
to Redis DB
Josiah, I'm not pulling 100k values from Redis to client. I'm sending
in a request, pulling 100k keys, processing, then sending the result
back. In this case, the 'processing' is simply timing how long it
takes to pull the values.

Tim Haines

unread,
May 30, 2010, 3:23:34 AM5/30/10
to redi...@googlegroups.com
Terry, what's your understanding of 'client'?

LFReD

unread,
May 30, 2010, 4:08:43 AM5/30/10
to Redis DB
In the example above, I'm referring to Chrome 5.0 beta connected via
websocket. From my early experiments, having the bi-directional, full-
duplex channel over a single socket is about as fast as it gets.

I can pull a JSON value from a large Rocket datastore using this
method in under 1ms (on the same network :)

LFReD

unread,
May 30, 2010, 4:29:35 AM5/30/10
to Redis DB
Let me just reiterate my hyperbole point.

I was originally referring to the SPEED (or lack thereof) when using
Redis and the available PHP libs (Predis etc).
The Redis induced benchmark was saying 55,000 GETS / s, but the
reality was 7700 GETS / s with these libraries *with a single
client!*

Now, to come back to me and say "oh, that's the *SUM* of GETs
based on multiple clients" (as was mentioned) then I call HYPERBOLE.

7700 gets a second, using the PHP libs, for a single client IS NOT
FAST.
It may be fast enough, but fast is a *relative* term, and
frankly, I'm getting better performance from MySQL.

I simply built the Rocket / Cheyenne / Rebol method to prove the point
using the same goal.. pulling values from large datastores... This
constant "comparing apples to oranges" argument is lame.

It's like saying "What vehicle will get me to New York the fastest?"
You say "Model T", and I say "Scramjet", and then argue I'm comparing
apples to oranges?


Demis Bellot

unread,
May 30, 2010, 10:54:44 AM5/30/10
to redi...@googlegroups.com
I really don't want to feed the troll but...

What you've said is that I can't get 55k per second using Redis because of the network I/O latency in my single process PHP script - Redis is Lame!
To show you all how lame this is I will run the same program but instead of using a network server I will use an in-memory datastore like a l337 hax0r!
Look at my numbers guys, access to my local hash table pwnz Redis hash tables - I Rule!

And what we're saying is No Shit Sherlock! You're comparing accessing a network datastore with one that isn't networked - of course it's going to be faster! If you didn't know that starting out, then there is another problem here, not related to Redis.
Use the best tool for the job and get on with your life.

Since you're saying that your MySQL is faster, now that is what we call an Apples to Apples comparison and I would like to see how you've got MySQL to be faster than Redis.

- Demis



Matt Todd

unread,
May 30, 2010, 3:50:06 PM5/30/10
to redi...@googlegroups.com
There really is a huge difference between "how fast is Redis?" versus
"how fast is a client connected to Redis?".

Redis can handle many more commands than a client can issue and
process results for. Correct me if I'm wrong, but I believe Redis
doesn't have to wait around for the client to finish reading before it
can continue to process the next request.

The important thing to understand is that, yes, one client may not be
able to hit a huge throughput, but that doesn't mean Redis can't
handle it. By increasing the number of clients connected, you're able
to utilize the streaming qualities and actually test Redis' capacity
instead of measuring latency.
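
The two figures quoted in this thread can be related with some back-of-the-envelope arithmetic. The Python sketch below assumes each client strictly waits for a reply before sending the next request, that the server is not the bottleneck, and that throughput adds up linearly across clients; the function names are made up for the example.

```python
import math


def round_trip_microseconds(requests_per_second):
    """Average time per request for one client that waits for each reply."""
    return 1_000_000 / requests_per_second


def clients_needed(target_rps, per_client_rps):
    """Synchronous clients required to reach target_rps, assuming linear scaling."""
    return math.ceil(target_rps / per_client_rps)


per_client = 7_700   # single-client PHP figure reported earlier in the thread
benchmark = 55_000   # aggregate figure reported by the Redis benchmark

print(round_trip_microseconds(per_client))    # roughly 130 microseconds per GET
print(clients_needed(benchmark, per_client))
```

Under those assumptions, 7700 GETs/second from one client says that each request/reply cycle takes about 130 microseconds end to end, which is a statement about latency, while the 55k figure is about how many such cycles the server can sustain across connections.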

Matt

--
Matt Todd
Highgroove Studios
www.highgroove.com
cell: 404-314-2612
blog: maraby.org

Scout - Web Monitoring and Reporting Software
www.scoutapp.com

LFReD

unread,
May 31, 2010, 1:51:53 AM5/31/10
to Redis DB
Ok, you're right. Redis is fast, and so is a Ferrari. But who cares
when a Ferrari can only go 30 mph due to speed limits (I/O)?

At least with a Ferrari you *can* break the speed limit, and run
it down the road at 200 mph. With Redis's dependency on the network
and its latency to communicate AT ALL, it's impossible. What good is
Redis WITHOUT I/O to some library? Why is the port access considered a
feature, when this is the very thing that kills the performance of the
intended goal, which is delivering data?

If a Ferrari *couldn't* break the speed limit, I'd call their
advertising hyperbole too.

You could say the Ferrari will "get you from zero to 30 mph faster
than any other auto".

Saying Redis is not the problem is like saying the Ferrari's not the
problem, it's the speed limits imposed on it.

Well, if that's the case, then you might as well buy a Pinto... or
MySQL.

OUT!

Matt Todd

unread,
May 31, 2010, 2:36:52 AM5/31/10
to redi...@googlegroups.com
I would like to echo Demis' inquiry about your MySQL benchmarks,
though... could you share some details?

Salvatore Sanfilippo

unread,
May 31, 2010, 4:15:09 AM5/31/10
to redi...@googlegroups.com
On Mon, May 31, 2010 at 7:51 AM, LFReD <terryb...@gmail.com> wrote:
> Ok, you're right. Redis is fast, and so is a Ferrari. But who cares
> when a Ferrari can only go 30 mph due to speed limits (I/O)?

Hello Terry,

you made your point; most of us (me for sure) don't agree.
I personally think that this is starting to sound like trolling /
spam-about-another-product, and since people subscribed to this
mailing list are interested in Redis, good arguments, and a good
signal/noise ratio, I suggest stopping this discussion or moving it to
some other, more appropriate place.

Kind Regards,
