Predis vs. phpRedis, pipelining, clustering and others


Xiangrong Fang

Nov 26, 2010, 6:07:41 PM
to redi...@googlegroups.com
Hi There,

I wonder how Predis compares to phpRedis in the following aspects:

1) Which one supports command pipelining better?

2) How about clustering support? Which one follows redis development faster?

3) Which one performs better?

4) And how about stability?

Finally, could anyone explain what exactly pipelining is? Is it the same as multi/exec? Could anyone give me code examples of how pipelining works in phpRedis and Predis?

Thanks!
Shannon

Daniele Alessandri

Nov 27, 2010, 8:33:20 AM
to redi...@googlegroups.com
On Sat, Nov 27, 2010 at 00:07, Xiangrong Fang <xrf...@gmail.com> wrote:

Hi,

quick preamble: I'm the developer of Predis.

> Hi There,
> I wonder how Predis compares to phpRedis in the following aspects:
> 1) Which one supports command pipelining better?

I don't think there is much difference in terms of support since
command pipelining is actually based on a quite simple trick that does
not require anything special. The only real difference is in how this
feature is exposed to developers by the public API of each library,
but again it is not something relevant unless you need one or two
specific features of Predis that are related to client-side
clustering.

> 2) How about clustering support? Which one follows redis development faster?

With Predis, clustering is achieved via client-side sharding against
multiple Redis servers.
phpredis on the other hand does not support clustering and I think
their decision is to wait for Redis cluster (which IMHO is a
reasonable choice at this point), but I'm sure that Nicolas, one of
the authors of phpredis, can give you all the details about the future
directions on that matter.

As for development speed, both libs are constantly updated to add
features, fix bugs, support new commands and reflect changes in
Redis.

> 3) Which one performs better?

While Predis is implemented only in pure PHP, phpredis is a C
extension for PHP which makes it faster for obvious reasons,
especially on localhost connections. The speed and overhead of Predis
is more than acceptable for many scenarios, but it basically depends
on your requirements so you might want to do some ad-hoc tests first
to determine which one suits your application best.

> 4) And how about stability?

I'd say they both work fine in terms of stability.

> Finally, could anyone explain what exactly pipelining is? Is it the same as
> multi/exec? Could anyone give me code examples of how pipelining works in
> phpRedis and Predis?

In Redis, pipelining commands means to send multiple requests to the
server without reading the replies to each one of them until the end,
thus saving useless network round-trips between the client and the
server (which speeds things up quite a bit in many cases). MULTI /
EXEC is a completely different beast as all the commands executed
between MULTI and EXEC are basically executed sequentially as a single
atomic operation (see the docs at
http://code.google.com/p/redis/wiki/MultiExecCommand for a more
detailed description). These two concepts are complementary and MULTI
/ EXEC "transactions" can be pipelined to a server instance.

As for the examples, they pretty much look alike in their basic forms:

/* predis */
$ret = $redis->pipeline()
    ->set('key1', 'val1')
    ->get('key1')
    ->set('key2', 'val2')
    ->get('key2')
    ->execute();

/* phpredis */
$ret = $redis->multi(Redis::PIPELINE)
    ->set('key1', 'val1')
    ->get('key1')
    ->set('key2', 'val2')
    ->get('key2')
    ->exec();

Anyway, I really suggest you take a look at the documentation and
the examples of both libraries to get a more in-depth view of
how they each support pipelining.
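
To make the "simple trick" concrete, here is a minimal sketch of what pipelining amounts to on the wire. It is not code from Predis or phpredis: encodeCommand() and buildPipeline() are hypothetical helpers, and the sketch assumes PHP 7+ and the Redis multi-bulk request format. The point is only that several commands are serialized into one buffer that can be written in a single call.

```php
<?php
// Encode one command in the Redis multi-bulk request format:
// "*<argc>\r\n" followed by "$<byte length>\r\n<arg>\r\n" for each argument.
function encodeCommand(array $args): string
{
    $out = '*' . count($args) . "\r\n";
    foreach ($args as $arg) {
        $arg = (string) $arg;
        $out .= '$' . strlen($arg) . "\r\n" . $arg . "\r\n";
    }
    return $out;
}

// Hypothetical helper: serialize a whole batch of commands into one buffer.
function buildPipeline(array $commands): string
{
    return implode('', array_map('encodeCommand', $commands));
}

$buffer = buildPipeline([
    ['SET', 'key1', 'val1'],
    ['GET', 'key1'],
    ['SET', 'key2', 'val2'],
    ['GET', 'key2'],
]);
// A client would now write $buffer to the socket once and then
// read exactly four replies back, in the same order.
```

The server still answers every command; the saving comes from not waiting for each reply before sending the next request.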

--
Daniele Alessandri
http://clorophilla.net/
http://twitter.com/JoL1hAHN

Xiangrong Fang

Nov 27, 2010, 9:05:17 AM
to redi...@googlegroups.com
Hi Daniele,

Thanks for the explanations. I hope to get some comments from the phpredis developer too :) I still have a few questions:

>> 4) And how about stability?
> I'd say they both work fine in terms of stability.

While testing Predis, at the stage when we were evaluating Redis vs. MongoDB for our project, I encountered serious problems with large values. I found some info in the Predis community suggesting that this is a problem with PHP's socket implementation. Is there any progress on this with PHP 5.3?

> In Redis, pipelining commands means to send multiple 
> requests to the server without reading the replies to each
> one of them until the end, thus saving useless network
> round-trips between the client and the server (which 
> speeds things up quite a bit in many cases).

So pipelining is NOT multi/exec.   But what if there are errors during the pipelining? You said "until the end", which I understand is:

1) pipelining is a client behavior which has nothing to do with redis server
2) the redis server still sends replies to the client, but the client just ignores them until the "batch" of commands has finished, i.e. it sends the second command before receiving the reply to the first, and so on. In other words, it boosts performance by sending commands faster, but it does not save any bandwidth (or is there a command to instruct the redis server NOT to send any reply in "pipeline" mode?)

Is that correct?

Best Regards,
Shannon

Daniele Alessandri

Nov 27, 2010, 10:30:15 AM
to redi...@googlegroups.com
On Sat, Nov 27, 2010 at 15:05, Xiangrong Fang <xrf...@gmail.com> wrote:

> Hi Daniele,
> Thanks for the explanations. I hope to get some comments from the phpredis
> developer too :) I still have a few questions:

He is on this mailing list, so I guess he will eventually find the
time to give his contribution.

> While testing Predis, at the stage when we were evaluating Redis vs. MongoDB
> for our project, I encountered serious problems with large values. I found
> some info in the Predis community suggesting that this is a problem with
> PHP's socket implementation. Is there any progress on this with PHP 5.3?

Yeah, Predis by default uses PHP's socket streams but with them
there's no way for developers to set in userland the TCP_NODELAY flag
on the underlying socket, which is very annoying and can lead to
extremely low throughput when transmitting large values that do not
fit in a single TCP packet.

I have implemented a workaround for this by creating a new connection
class that internally leverages socket resources provided by the PHP
socket extension. It can be used with the current development version
of Predis, see https://github.com/nrk/predis/blob/socket_extension/lib/addons/SocketBasedTcpConnection.php

BTW phpredis was affected by the same problem right until a couple of
months ago or so, but they were able to fix this issue in a simple and
clean way since PHP exposes to C extensions a way to get the
underlying raw socket out of socket streams, thus allowing the
TCP_NODELAY flag to be set.

> So pipelining is NOT multi/exec.   But what if there are errors during the
> pipelining? You said "until the end", which I understand is:

Pretty much all the implementations of pipelining return an array of
replies, so server errors (as in -ERR replies sent back to the client
by Redis) are usually simply stored in that array together with
regular replies [*]. In the end, the only difference when operating
with pipelines is that you can't build and send commands based on the
reply from previous ones. The rest of the workflow is basically the
same.
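
As a rough illustration of "errors are stored in the array together with regular replies", the following sketch parses a buffer holding the replies to a pipelined batch. It is not taken from either library: readReply() and readReplies() are hypothetical names, error replies are represented here as arrays with an 'error' key purely by assumption, and the buffer is assumed to already contain all the replies in full.

```php
<?php
// Parse one reply starting at $pos in $buf and advance $pos past it.
// Error replies ("-ERR ...") become ['error' => message] so they can sit
// in the result list next to regular replies instead of aborting the read.
function readReply(string $buf, int &$pos)
{
    $eol  = strpos($buf, "\r\n", $pos);
    $type = $buf[$pos];
    $line = substr($buf, $pos + 1, $eol - $pos - 1);
    $pos  = $eol + 2;

    switch ($type) {
        case '+': return $line;                 // status reply
        case '-': return ['error' => $line];    // error reply
        case ':': return (int) $line;           // integer reply
        case '$':                               // bulk reply
            $len = (int) $line;
            if ($len < 0) {
                return null;                    // nil
            }
            $data = substr($buf, $pos, $len);
            $pos += $len + 2;                   // skip payload and trailing CRLF
            return $data;
        case '*':                               // multi-bulk reply
            $count = (int) $line;
            if ($count < 0) {
                return null;
            }
            $items = [];
            for ($i = 0; $i < $count; $i++) {
                $items[] = readReply($buf, $pos);
            }
            return $items;
    }
    throw new RuntimeException("Unknown reply type: $type");
}

// Read $n replies in order, as a client would after pipelining $n commands.
function readReplies(string $buf, int $n): array
{
    $pos = 0;
    $replies = [];
    for ($i = 0; $i < $n; $i++) {
        $replies[] = readReply($buf, $pos);
    }
    return $replies;
}

$raw = "+OK\r\n\$4\r\nval1\r\n-ERR unknown command 'FOO'\r\n:3\r\n";
$replies = readReplies($raw, 4);
// $replies[2] holds the error; the replies before and after it are intact.
```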

> 1) pipelining is a client behavior which has nothing to do with redis server
> 2) the redis server still sends replies to the client, but the client just
> ignores them until the "batch" of commands has finished, i.e. it sends the
> second command before receiving the reply to the first, and so on. In other
> words, it boosts performance by sending commands faster, but it does not save
> any bandwidth (or is there a command to instruct the redis server NOT to send
> any reply in "pipeline" mode?)
> Is that correct?

1) and 2) are both correct. There's currently no way to tell the
server to discard one or multiple replies for issued commands, but
this could be easily emulated client-side just by writing all the
requests to the server and disconnecting the client instance instead
of performing any read operation (while Redis simply discards the
replies buffer for the associated connection). This might seem a bit
hackish and needs further support by the client library, but it would
definitely work.


[*] by default Predis throws an exception as soon as a -ERR reply is
intercepted while reading replies from the server, this behaviour can
be overridden to return error objects by initializing the client with
the 'throw_on_error' option set to false:

$redis = new Predis\Client($server, array('throw_on_error' => false));

Xiangrong Fang

Nov 27, 2010, 7:19:14 PM
to redi...@googlegroups.com
About pipelining, I wonder why it exists: why not just use multi/exec? For multi/exec, I don't know whether the commands are sent to redis directly (one by one) or buffered on the client side and sent to redis in a batch when exec is issued.

For both pipelining and multi/exec, I think client-side buffering is an effective way to eliminate a portion of the communication costs and thus improve performance.

What is the problem with client side buffering, if any?

Thanks!
Shannon


Demis Bellot

Nov 27, 2010, 7:37:25 PM
to redi...@googlegroups.com
Hi Shannon,

Pipelining and Multi/Exec are different features which can be used together or not.

With pipelining, the Redis client can send multiple commands without waiting to receive or process the replies. This leads to greater performance because there is less chatter on the network: the client doesn't have to wait for a round trip before sending the next command, and Redis doesn't wait for a reply to reach the client before processing the next command.

By contrast, MULTI/EXEC allows clients to compose their own atomic operations by wrapping commands in MULTI and EXEC. Only when the Redis server receives EXEC does it process all the commands at once and return a MultiBulk reply that contains the responses of all the commands.

After each command sent inside a MULTI/EXEC block, a 'QUEUED' reply is returned. It's up to the client whether to treat the block as a pipeline or to process each command's response individually.
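
The queuing behaviour described above can be sketched with a toy in-memory stand-in for the server. ToyServer is purely hypothetical (it is not Redis code and models only SET, GET, MULTI and EXEC), but it shows the reply sequence a client sees: 'QUEUED' for each command after MULTI, then one combined reply on EXEC.

```php
<?php
// Toy in-memory stand-in for a Redis connection. It only models the
// queuing behaviour of MULTI/EXEC, nothing else about a real server.
final class ToyServer
{
    private $data = [];
    private $queue = null; // null means "not inside MULTI"

    public function send(array $cmd)
    {
        $name = strtoupper($cmd[0]);
        if ($name === 'MULTI') {
            $this->queue = [];
            return 'OK';
        }
        if ($name === 'EXEC') {
            if ($this->queue === null) {
                return ['error' => 'ERR EXEC without MULTI'];
            }
            $queued = $this->queue;
            $this->queue = null;
            // Run every queued command back to back; nothing can interleave.
            $results = [];
            foreach ($queued as $queuedCmd) {
                $results[] = $this->run($queuedCmd);
            }
            return $results;
        }
        if ($this->queue !== null) {
            $this->queue[] = $cmd;   // inside MULTI: queue instead of running
            return 'QUEUED';
        }
        return $this->run($cmd);
    }

    private function run(array $cmd)
    {
        switch (strtoupper($cmd[0])) {
            case 'SET':
                $this->data[$cmd[1]] = $cmd[2];
                return 'OK';
            case 'GET':
                return $this->data[$cmd[1]] ?? null;
        }
        return ['error' => 'ERR unknown command'];
    }
}

$server = new ToyServer();
$replies = [];
foreach ([['MULTI'], ['SET', 'key1', 'val1'], ['GET', 'key1'], ['EXEC']] as $cmd) {
    $replies[] = $server->send($cmd);
}
// $replies is ['OK', 'QUEUED', 'QUEUED', ['OK', 'val1']]
```

Whether the four requests travel in one write (pipelined) or in four separate round trips does not change these replies, which is why the two features can be combined.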

Hope this explains the differences of each better.

- Demis

Derek Williams

Nov 27, 2010, 7:41:15 PM
to redi...@googlegroups.com
Another way of pipelining, which I use in my own client, is to return
a future to access the result at a later time. All requests to the
server are sent without waiting for any responses, those are read in
as they become available. I do this for all requests, even if the code
waits for the response immediately after the request. This is much
different than multi/exec, and allows me to easily use only 1 client
on a heavily multithreaded web app.
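
The future-based approach can be sketched roughly as follows. This is not Derek's client: ReplyFuture, FutureClient and EchoTransport are hypothetical names, the transport is an in-memory fake that simply echoes each command back as its reply, and the sketch is single-threaded (PHP 7.1+ assumed). It only illustrates that requests are written immediately and replies are matched to futures in FIFO order.

```php
<?php
// Hypothetical future: holds a reply that may not have been read yet.
final class ReplyFuture
{
    private $resolved = false;
    private $value;
    private $pump;

    public function __construct(callable $pump)
    {
        $this->pump = $pump;
    }

    public function resolve($value): void
    {
        $this->value = $value;
        $this->resolved = true;
    }

    // Read replies from the connection until this one is available.
    public function get()
    {
        while (!$this->resolved) {
            ($this->pump)();
        }
        return $this->value;
    }
}

// Hypothetical client: every request is written right away and a future is
// returned; replies are matched to pending futures in the order sent.
final class FutureClient
{
    private $pending = [];
    private $transport;

    // $transport is assumed to offer write(array $cmd) and read().
    public function __construct($transport)
    {
        $this->transport = $transport;
    }

    public function send(array $cmd): ReplyFuture
    {
        $this->transport->write($cmd);
        $future = new ReplyFuture(function () {
            $this->readOne();
        });
        $this->pending[] = $future;
        return $future;
    }

    private function readOne(): void
    {
        $future = array_shift($this->pending);
        $future->resolve($this->transport->read());
    }
}

// In-memory fake transport for the demonstration: echoes commands as replies.
final class EchoTransport
{
    private $replies = [];

    public function write(array $cmd): void
    {
        $this->replies[] = implode(' ', $cmd);
    }

    public function read()
    {
        return array_shift($this->replies);
    }
}

$client = new FutureClient(new EchoTransport());
$a = $client->send(['GET', 'key1']);
$b = $client->send(['GET', 'key2']);  // sent before $a's reply has been read
$second = $b->get();                  // reads both replies, resolving $a first
$first  = $a->get();                  // already resolved, returns immediately
```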

Most clients use the other method though, which is very similar to
just using multi/exec.

--
Derek

Nicolas Favre-Félix

Nov 28, 2010, 10:37:19 AM
to redi...@googlegroups.com
Hello,

I'm the author of phpredis that Daniele mentioned earlier in this thread. I think the participants of this thread have described multi/exec and pipelining pretty well, and Daniele has explained the differences between predis and phpredis well too.

APIs and speed: There have been some benchmarks between phpredis and predis in the past, some are even in the archives of this mailing-list (http://goo.gl/AOwym). The back-of-the-envelope figure is that phpredis comes out about twice as fast. I would mostly encourage you to look at the API differences by yourself, and to see which one you prefer: Benchmark your own app, with pipelining on.

Clustering: As mentioned, phpredis doesn't support consistent hashing. I haven't found a proper way to do it yet, the main issue being that the Redis server supports commands that can use several keys (e.g. do a union of these N sets and store the result in that key). A consistent hashing algorithm might map these N keys to different servers, and the command just wouldn't work. I have replied at length on this subject on our bug tracker: https://github.com/owlient/phpredis/issues/closed#issue/47/comment/450768
I'm not sure whether Redis-cluster will solve this issue and allow distributed transactions touching keys on several machines; doing so in an atomic way is very costly and I'm not sure how these commands will be implemented in a distributed system.
That said, I understand that for commands manipulating only one key this would be a very useful feature to have. I'll have a look next week at the way other client libraries handle clustering and try to implement it.
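
To illustrate the multi-key problem, here is a minimal consistent-hash ring. It is a sketch under stated assumptions, not how Predis or any other client actually implements sharding: the HashRing class, the node names, the use of crc32 and the 64 virtual points per node are all illustrative choices, and at least one node is assumed (PHP 7.1+).

```php
<?php
// Minimal consistent-hash ring; node names are placeholders.
final class HashRing
{
    private $ring = [];   // ring position => node name

    public function __construct(array $nodes, int $replicas = 64)
    {
        foreach ($nodes as $node) {
            // Several virtual points per node spread the keys more evenly.
            for ($i = 0; $i < $replicas; $i++) {
                $this->ring[crc32("$node#$i")] = $node;
            }
        }
        ksort($this->ring);
    }

    // First node clockwise from the key's position, wrapping around.
    public function nodeFor(string $key): string
    {
        $hash = crc32($key);
        foreach ($this->ring as $position => $node) {
            if ($position >= $hash) {
                return $node;
            }
        }
        return reset($this->ring);
    }

    // A multi-key command can only be routed if all keys map to one node.
    public function singleNodeFor(array $keys): ?string
    {
        $nodes = array_unique(array_map([$this, 'nodeFor'], $keys));
        return count($nodes) === 1 ? reset($nodes) : null;
    }
}

$ring = new HashRing(['redis-a', 'redis-b', 'redis-c']);
$target = $ring->singleNodeFor(['set:1', 'set:2', 'set:3']);
// $target is null when the keys land on different nodes; in that case a
// command such as SUNIONSTORE cannot simply be sent to a single server.
```

Single-key commands never hit this problem, since nodeFor() always yields exactly one server for a given key.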

General availability of features: I think both predis and phpredis are following Redis very closely when it comes to adding support for the new features, often implementing the new commands within hours of their availability on Redis master.

Multi/exec and pipelining: At the moment, phpredis is lacking the ability to _combine_ pipelining and multi/exec. You can only enter one of these modes, and each command is sent to redis for queuing when multi/exec is enabled. Predis combines them using a closure, which we don't use in phpredis. As Daniele explained we still gain from the support of TCP_NODELAY, but by combining both in predis you might see predis perform close to the level of phpredis.

Nicolas 

Xiangrong Fang

Nov 29, 2010, 4:22:36 AM
to redi...@googlegroups.com
Thanks Nicolas.

As for my concerns: most of my "side" questions, including performance and stability, came from the impression I got from the Redis Wiki page about client libraries. My impression was that although phpredis is a C-based extension (which I would prefer for obvious performance reasons), it is *less* active and does not support the latest Redis 2 features well. So I was wrong about that.

My key concerns, however, are about clustering and pipeline/multi-exec.

Clustering: I think the idea of waiting for redis cluster to mature might be a wise option for now. What I do worry about is this: it seems that you cannot do operations on two keys if they are spread over 2 machines, and that's why the "hash mark" (i.e. {...}) is designed into redis keys. I feel that could be a non-trivial problem in "porting" non-cluster-ready code to cluster-ready code.

Pipeline/Multi-Exec: from Daniele's explanation, I now have a fair understanding of the difference and purpose of the two. But one question remains unclear: how about client-side buffering? i.e. when doing:

//Predis version
#1 $ret = $redis->pipeline()
#2            ->set('key1', 'val1')
#3            ->get('key1')
#4            ->set('key2', 'val2')
#5            ->get('key2')
#6            ->execute();

How is the underlying communication happening? i.e. is ALL the data sent (as a \n-separated list of strings) to the server at the moment of execute(), or does it happen throughout line#2~#6? I feel that if the client can "buffer" commands (as long as a pipeline or multi is requested) and send them out in one go, performance would be best.

Thanks,
Shannon



Nicolas Favre-Félix

Nov 29, 2010, 10:25:05 AM
to redi...@googlegroups.com
Xiangrong,

On 29 November 2010 10:22, Xiangrong Fang <xrf...@gmail.com> wrote:

> Clustering: I think the idea of waiting for redis cluster to mature might be a wise option for now. What I do worry about is this: it seems that you cannot do operations on two keys if they are spread over 2 machines, and that's why the "hash mark" (i.e. {...}) is designed into redis keys. I feel that could be a non-trivial problem in "porting" non-cluster-ready code to cluster-ready code.

You're right, you can't do multi-key operations in Redis if the keys are spread between machines. I don't know how Redis will address this issue with the future cluster product. Right now you can organize your data in a way that will be compatible with your use-cases. If you're building a 4square clone, shard your users geographically; if you're building an ERP, put clients 1→N with all their dependencies on the first machine, then N→2N on the second machine, etc. As long as you access your data in ways that are compatible with the technique you used to partition them, you shouldn't be bothered by these issues.
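
The ERP example can be sketched as simple range partitioning. The function name, the zero-based machine index and the exact boundaries (clients 1..N on the first machine, N+1..2N on the second) are assumptions made for illustration; the point is that everything belonging to one client resolves to the same machine.

```php
<?php
// Range partitioning sketch: clients 1..N go to machine 0, N+1..2N to
// machine 1, and so on. Boundaries and $perMachine are illustrative only.
function machineForClient(int $clientId, int $perMachine): int
{
    if ($clientId < 1 || $perMachine < 1) {
        throw new InvalidArgumentException('client ids and sizes start at 1');
    }
    return intdiv($clientId - 1, $perMachine);
}

// With 1000 clients per machine, client 1000 and client 1001 end up on
// different machines, while all of one client's keys stay together.
$first  = machineForClient(1000, 1000);
$second = machineForClient(1001, 1000);
```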



> Pipeline/Multi-Exec: from Daniele's explanation, I now have a fair understanding of the difference and purpose of the two. But one question remains unclear: how about client-side buffering? i.e. when doing:
>
> //Predis version
> #1 $ret = $redis->pipeline()
> #2            ->set('key1', 'val1')
> #3            ->get('key1')
> #4            ->set('key2', 'val2')
> #5            ->get('key2')
> #6            ->execute();
>
> How is the underlying communication happening? i.e. is ALL the data sent (as a \n-separated list of strings) to the server at the moment of execute(), or does it happen throughout line#2~#6? I feel that if the client can "buffer" commands (as long as a pipeline or multi is requested) and send them out in one go, performance would be best.

Pipelining is only client-side buffering. Nothing else. Here, execute() will send one large buffer containing the 4 commands instead of 4 small separate writes. With multi-exec, you have a guarantee of atomicity from MULTI to EXEC. If we continue with this code as an example, it is entirely possible that you will do SET key1 val1, then another client will change that value, and then when you run GET key1, the data won't be "val1". Sure, you wrote to redis in a single chunk, but there's no guarantee that Redis will process all of the commands without doing anything else on the side. If you want to make sure you have Redis to yourself for the duration of a transaction, you need to wrap it in a call to multi() and a call to exec().
Pipelining only saves you some latency, but won't ensure that all your commands will be processed in one shot.

Currently, phpredis cannot send a multi/exec transaction in one shot, and all the commands are sent separately. Predis can do that, though.

Nicolas

Jak Sprats

Nov 29, 2010, 8:15:46 PM
to Redis DB
Hi Nicolas and Xiangrong,

The fact that multi key commands can not be done on keys spanning
nodes does not mean you can't do multi-key commands in redis' cluster.

Currently there is a mechanism in the Ruby client (which has client
side consistent hashing), which hashes on a subsection of the key (it
is a trivial regex like "*[hash-key]*"). This means that if you have two
sets and name them "set[box1]45" and "set[box1]99", they will both use
"box1" as the consistent hash key and both be placed wherever
"box1" hashes to. Additionally, "SINTER set[box1]45 set[box1]99" will
work because they are on the same box (and the client can check that
both [] hash keys map to the same machine).

This is not an automated distribution mechanism by any means, but the
problem of joins or intersections ACROSS nodes is best solved by
avoiding doing them, they can not be done efficiently. If you know
your data, then use the same [hash-key] for sets you will intersect or
union (partition your sets). If your data is so big that the set of
sets you will intersect/union can not fit on one machine, then the
only solution is to put in a proxy that will transmit ALL the SETs you
are intersecting/unioning over TCP and then do the intersect/union in
the proxy ... and in practice this is just too slow (think 5-set
intersection of 10K member sets that results in 100 rows ... requires
50K rows to be transmitted from separate machines via TCP - which is 4
to 6 orders of magnitude slower than RAM)

- Jak

P.S. Is there any good documentation on the Ruby client's [hash-key]
functionality?


Xiangrong Fang

Nov 30, 2010, 4:38:12 AM
to redi...@googlegroups.com
Hi Jak,

I personally think that it's wise to wait for Redis Cluster to mature: previously I was told to use {hash} as part of the key, and now you tell me [hash]. Apparently the standard here is set by the various client libraries that support consistent hashing. I hope there will be an "official" standard that all clients follow.

Also, I recently read through the phpRedis interface briefly; unfortunately its methods are not named the same as in Predis. I think Predis's method names are a bit better, because they use exactly the same form (including case, because php is case sensitive) as the native redis commands.

Thanks,
Shannon


Ezra Zygmuntowicz

Nov 30, 2010, 1:11:10 PM
to redi...@googlegroups.com

Using foo{hash-key}bar with curly braces is the preferred way of doing this, based on Salvatore's blog post about it from way back. So any client libs that implement this should use {} curly braces to denote the text to run the hash function on.
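
A key-tag scheme along these lines could look like the sketch below. It is not the redis-rb implementation: hashTag() and serverIndex() are hypothetical, the regex is one plausible reading of "the text inside the curly braces", and plain crc32 modulo the server count stands in for whatever hashing a real client uses.

```php
<?php
// Return the part of the key to hash: the text inside the first non-empty
// {...} pair if there is one, otherwise the whole key.
function hashTag(string $key): string
{
    if (preg_match('/\{([^}]+)\}/', $key, $m)) {
        return $m[1];
    }
    return $key;
}

// Pick a server index from the tag. crc32 modulo the server count is an
// assumption for the sketch; the mask keeps the value non-negative.
function serverIndex(string $key, int $serverCount): int
{
    return (crc32(hashTag($key)) & 0x7fffffff) % $serverCount;
}

// Both keys share the tag "box1", so they resolve to the same server and a
// command such as SINTER across them can be sent to that one server.
$a = serverIndex('set{box1}45', 4);
$b = serverIndex('set{box1}99', 4);
```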

I agree that it is a primitive way of making set intersections work in a distributed redis setup, and that if you can wait for redis cluster that will be better. But even then, redis cluster itself is going to need to do something similar in the background in order to support intersections on the same shard, or whatever they end up being called in redis-cluster speak.

Cheers-
-Ezra

Ezra Zygmuntowicz
ezmo...@gmail.com

Michel Martens

Nov 30, 2010, 2:33:49 PM
to redi...@googlegroups.com
Hey,

Right now, only the tests and Salvatore's blog post explain how it works.

Sorting into a distributed environment with key tags:
http://antirez.com/post/Sorting-in-key-value-data-model.html

How to tag a key in Redis::Distributed (redis-rb):
https://github.com/ezmobius/redis-rb/blob/master/test/distributed_key_tags_test.rb#L26

How to define a custom tag format with a regular expression:
https://github.com/ezmobius/redis-rb/blob/master/test/distributed_key_tags_test.rb#L43

Nicolas Favre-Félix

Dec 1, 2010, 8:07:51 AM
to redi...@googlegroups.com
Hi,

Thanks all for the details about key hashing. There seem to be conflicting ideas though; from what I understand some libraries use {curly braces} even though Salvatore's blog post used [square ones]. I don't know whether redis-rb's customizable tag format is part of an official recommendation for client libraries or if it was just implemented out of need. Unifying the client behaviours would be good for the Redis ecosystem.

To set one thing clear though, function and method names are completely case-insensitive in PHP. Whichever library you use, calling HGETALL or hGetAll won't change a thing. Redis commands aren't case-sensitive either, for that matter.

Cheers, 
Nicolas

Jak Sprats

Dec 1, 2010, 2:56:15 PM
to Redis DB
Hi

my mistake on the [] ... Ezra (author of the Ruby client) knows
best :) ... it is {} ... repeat {}

Nicolas is correct, the mechanism for placing keys on the same node
(or hash-slot in the cluster) needs to be consistent in all the
clients (especially for the cluster). This is an old topic, so I
imagine that the Ruby client was the only client (AFAIK) to go ahead
and implement this feature, and this topic has been mentioned many
times in relation to redis-cluster.

Just a reminder: If redis-cluster is going to use {}, and stick w/ it,
as the hash-key-delim, this logic needs to be buried deep down in all
the hash slot migration code in the server also.

- Jak


Xiangrong Fang

Dec 1, 2010, 11:35:16 PM
to redi...@googlegroups.com
> To set one thing clear though, function and method names are completely case-insensitive in PHP. Whichever library you use, calling HGETALL or hGetAll won't change a thing. Redis commands aren't case-sensitive either, for that matter.

Although php functions/methods are case insensitive, you must use lowercase when using Predis... I just tested it.  Am I doing anything wrong...

Daniele Alessandri

Dec 2, 2010, 5:27:33 AM
to redi...@googlegroups.com
On Thu, Dec 2, 2010 at 05:35, Xiangrong Fang <xrf...@gmail.com> wrote:

> Although php functions/methods are case insensitive, you must use lowercase
> when using Predis... I just tested it.  Am I doing anything wrong...

You are doing nothing wrong: Predis uses the __call metamethod to
catch method names that represent Redis commands, and names are
checked in a case-sensitive fashion.
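
The effect can be reproduced with a small stand-alone class. TinyClient is hypothetical and is not Predis's actual code; it only shows that __call() receives the method name exactly as the caller wrote it, so an exact-match lookup rejects mixed case unless the name is normalized first.

```php
<?php
// Hypothetical client, not Predis's real implementation: commands are
// looked up using the method name that __call() receives.
final class TinyClient
{
    private $commands = ['set' => 'SET', 'get' => 'GET', 'hgetall' => 'HGETALL'];
    private $caseSensitive;

    public function __construct(bool $caseSensitive = true)
    {
        $this->caseSensitive = $caseSensitive;
    }

    public function __call($method, $args)
    {
        // PHP passes $method with the caller's original casing.
        $name = $this->caseSensitive ? $method : strtolower($method);
        if (!isset($this->commands[$name])) {
            throw new BadMethodCallException("'$method' is not a registered command");
        }
        // Return the command that would be sent instead of contacting a server.
        return array_merge([$this->commands[$name]], $args);
    }
}

$strict  = new TinyClient();
$lenient = new TinyClient(false);

$ok = $lenient->hGetAll('myhash');   // ['HGETALL', 'myhash']
try {
    $strict->hGetAll('myhash');
    $threw = false;
} catch (BadMethodCallException $e) {
    $threw = true;                   // the exact-match lookup rejects mixed case
}
```

Methods that are actually declared on a class stay case-insensitive in PHP; the difference only appears for names that are dispatched through __call().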
