Messaging queue

Messaging queue Burak DEDE 12/15/11 3:01 PM
I have an application (Tornado) where users can send/receive private messages to/from each other and can also chat publicly. I am using Redis (brukva) pub/sub messaging for the realtime chat, but I can't decide which messaging queue system I should use for the private messaging part, since it isn't realistic to send and process messages within a single POST/GET request. I think a queue will resolve this problem by processing messages in the background. Can Redis be considered an option here, or should I go with another messaging queue? Any queue suggestions for that particular use case, and why?

--
Burak DEDE
www.burakdede.com
www.twitter.com/burakdede
www.friendfeed.com/burakdede

--
Re: Messaging queue Josiah Carlson 12/15/11 3:42 PM
You can implement direct messaging by giving each user a special
channel that only they are listening to, handling the display on the
client side.

As an alternative to pubsub, this code snippet implements a more or
less fully-featured chat server in Redis using zsets, hashes, and
polling: https://gist.github.com/1045789 .
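
If it helps, here is a rough sketch of the per-user channel idea using the
synchronous redis-py client (the "chat:user:<id>" channel naming is just
illustrative; inside Tornado an async client like brukva would play the same
role):

    import redis

    r = redis.Redis(decode_responses=True)

    def send_private(sender_id, recipient_id, text):
        # Publish to the recipient's personal channel; only connections
        # subscribed to that channel will see the message.
        r.publish("chat:user:%d" % recipient_id, "%d:%s" % (sender_id, text))

    def listen(user_id):
        pubsub = r.pubsub()
        pubsub.subscribe("chat:user:%d" % user_id)
        for event in pubsub.listen():
            if event["type"] == "message":
                yield event["data"]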

Regards,
 - Josiah

Re: Messaging queue Jason J. W. Williams 12/15/11 3:43 PM
On Thu, Dec 15, 2011 at 4:01 PM, Burak DEDE <burak...@gmail.com> wrote:
> I have an application (Tornado) where users can send/receive private messages
> to/from each other and can also chat publicly. I am using Redis (brukva)
> pub/sub messaging for the realtime chat, but I can't decide which messaging
> queue system I should use for the private messaging part, since it isn't
> realistic to send and process messages within a single POST/GET request. I
> think a queue will resolve this problem by processing messages in the
> background. Can Redis be considered an option here, or should I go with
> another messaging queue? Any queue suggestions for that particular use case, and why?


If you think you'll be wanting to have flexible and more sophisticated
routing rules for those messages in the future, I would heartily
recommend RabbitMQ.

-J

--
You received this message because you are subscribed to the Google Groups "Redis DB" group.
To post to this group, send email to redi...@googlegroups.com.
To unsubscribe from this group, send email to redis-db+u...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/redis-db?hl=en.

Re: Messaging queue Josiah Carlson 12/15/11 3:54 PM
On Thu, Dec 15, 2011 at 3:43 PM, Jason J. W. Williams
<jasonjw...@gmail.com> wrote:
> On Thu, Dec 15, 2011 at 4:01 PM, Burak DEDE <burak...@gmail.com> wrote:
>> I have an application (Tornado) where users can send/receive private messages
>> to/from each other and can also chat publicly. I am using Redis (brukva)
>> pub/sub messaging for the realtime chat, but I can't decide which messaging
>> queue system I should use for the private messaging part, since it isn't
>> realistic to send and process messages within a single POST/GET request. I
>> think a queue will resolve this problem by processing messages in the
>> background. Can Redis be considered an option here, or should I go with
>> another messaging queue? Any queue suggestions for that particular use case, and why?
>
>
> If you think you'll be wanting to have flexible and more sophisticated
> routing rules for those messages in the future, I would heartily
> recommend RabbitMQ.

I would heartily discourage RabbitMQ. Reddit had problems with it, I
have had problems with it, and I have friends who have problems with
it (RabbitMQ has died on all of us when sending messages to it "too
fast"). It's possible that the Erlang release in the last few days
fixed the problems, but I wouldn't rely on it in any sort of
production scenario, and definitely not with a fresh Erlang release.

If a non-Redis message queue is necessary, I would recommend ActiveMQ.
If a Redis-backed message queue is desired for use with Python, I've
only used https://github.com/josiahcarlson/rpqueue , but it doesn't
currently have an async binding for Tornado.

Regards,
 - Josiah

--
You received this message because you are subscribed to the Google Groups "Redis DB" group.
To post to this group, send email to redi...@googlegroups.com.
To unsubscribe from this group, send email to redis-db+u...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/redis-db?hl=en.

Re: Messaging queue Burak DEDE 12/15/11 3:59 PM
@Josiah 

I already implemented the realtime chat with Redis pub/sub, so with that knowledge I can easily use the same pattern for messaging, but private messaging needs persistent storage, which I did not use for chat.

@Jason

Yeah, RabbitMQ is one of the options I already looked at, especially Pika, which works with the Tornado IOLoop. But I haven't used a messaging queue before, so here is my general idea of the usage; please correct me if I am wrong, or show me the general pattern for it. User A wants to send a message to user B. User A posts his/her message with a normal POST request. I put the message into the queue. There should be some kind of queue processor that consumes the queue periodically: it checks the queue, finds the message, takes it, and persists it to user B's messages in the database. So when user B checks his/her messages, he/she finds the new message.
--
Burak DEDE
www.burakdede.com
www.twitter.com/burakdede
www.friendfeed.com/burakdede

Re: Messaging queue Josiah Carlson 12/15/11 5:06 PM
On Thu, Dec 15, 2011 at 3:59 PM, Burak DEDE <burak...@gmail.com> wrote:
> @Josiah
>
> I already implemented the realtime chat with Redis pub/sub, so with that
> knowledge I can easily use the same pattern for messaging, but private
> messaging needs persistent storage, which I did not use for chat.

Ahh, you want users to be able to message each other when they are offline.

You could use a hash+zset for each user. A hash to store the message
information, and a zset to store the order (based on timestamp). If
you time out old messages after a while, you can limit your maximum
memory use.
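
A rough sketch of that layout with redis-py (assuming a recent redis-py
client; the key names and the 30-day cutoff are just illustrative):

    import time
    import redis

    r = redis.Redis(decode_responses=True)

    def store_message(recipient_id, sender_id, text, max_age=30 * 86400):
        now = time.time()
        msg_id = r.incr("message:ids")            # unique message id
        # The hash holds the message body and metadata.
        r.hset("message:%d" % msg_id,
               mapping={"from": sender_id, "text": text, "sent": now})
        # The per-user zset orders message ids by timestamp.
        inbox = "inbox:%d" % recipient_id
        r.zadd(inbox, {str(msg_id): now})
        # Drop references to messages older than max_age to cap memory use
        # (the message hashes themselves could be EXPIREd to match).
        r.zremrangebyscore(inbox, 0, now - max_age)

    def read_inbox(recipient_id):
        for msg_id in r.zrange("inbox:%d" % recipient_id, 0, -1):
            yield r.hgetall("message:%s" % msg_id)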

If you are planning on persisting them anyway, a reasonable option
would be to use the available mail tools and utilities (postfix
to accept the mail and dump it to disk, and whatever pop/imap server
can consume what you have configured postfix to write). Why
reinvent mail delivery?

> @Jason
>
> Yeah, RabbitMQ is one of the options I already looked at, especially Pika,
> which works with the Tornado IOLoop. But I haven't used a messaging queue
> before, so here is my general idea of the usage; please correct me if I am
> wrong, or show me the general pattern for it. User A wants to send a message
> to user B. User A posts his/her message with a normal POST request. I put the
> message into the queue. There should be some kind of queue processor that
> consumes the queue periodically: it checks the queue, finds the message,
> takes it, and persists it to user B's messages in the database. So when user B
> checks his/her messages, he/she finds the new message.

That's generally the use, yes. But if Tornado already has access to
your db, and you are just writing a new row to the db, there isn't
really a lot of reason to toss a message into the queue to be written
to the db later; you may as well just add it to the db immediately.

Regards,
 - Josiah

Re: Messaging queue Jason J. W. Williams 12/15/11 6:17 PM
Hi Josiah,

We've had very good luck with it in thousands-of-messages/sec scenarios since 2009. It's been far more stable than ActiveMQ, which we can crash very reliably. Since Reddit wouldn't work with the Rabbit core devs who reached out to them when they had issues, I wouldn't quote that as a case study.

-J

Sent via iPhone


Re: Messaging queue Jason J. W. Williams 12/15/11 6:29 PM

>
> @Jason
>
> Yeah, RabbitMQ is one of the options I already looked at, especially Pika, which works with the Tornado IOLoop. But I haven't used a messaging queue before, so here is my general idea of the usage; please correct me if I am wrong, or show me the general pattern for it. User A wants to send a message to user B. User A posts his/her message with a normal POST request. I put the message into the queue. There should be some kind of queue processor that consumes the queue periodically: it checks the queue, finds the message, takes it, and persists it to user B's messages in the database. So when user B checks his/her messages, he/she finds the new message.

Yes, that's the general pattern. Messages get tagged and published into exchanges, then routing rules route them to one or more queues, where your processors consume them.
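
As a rough sketch (assuming a recent pika and RabbitMQ on localhost; the
exchange/queue names and the save_to_database call are placeholders):

    import json
    import pika

    connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
    channel = connection.channel()

    # A direct exchange routes each message to the queue(s) bound with a
    # matching routing key; here everything lands in one "messages" queue.
    channel.exchange_declare(exchange="private_messages", exchange_type="direct")
    channel.queue_declare(queue="messages", durable=True)
    channel.queue_bind(queue="messages", exchange="private_messages",
                       routing_key="deliver")

    def publish(sender_id, recipient_id, text):
        body = json.dumps({"from": sender_id, "to": recipient_id, "text": text})
        channel.basic_publish(exchange="private_messages",
                              routing_key="deliver", body=body)

    def handle(ch, method, properties, body):
        save_to_database(json.loads(body))      # placeholder persistence call
        ch.basic_ack(delivery_tag=method.delivery_tag)

    # A worker process would consume the queue and write to the database:
    #   channel.basic_consume(queue="messages", on_message_callback=handle)
    #   channel.start_consuming()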

-J

--
You received this message because you are subscribed to the Google Groups "Redis DB" group.
To post to this group, send email to redi...@googlegroups.com.
To unsubscribe from this group, send email to redis-db+u...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/redis-db?hl=en.

Re: Messaging queue Pixy Misa 12/15/11 8:59 PM
On Dec 16, 10:54 am, Josiah Carlson <josiah.carl...@gmail.com> wrote:
> On Thu, Dec 15, 2011 at 3:43 PM, Jason J. W. Williams
>
> I would heartily discourage RabbitMQ. Reddit had problems with it, I
> have had problems with it, and I have friends who have problems with
> it (RabbitMQ has died on all of us when sending messages to it "too
> fast"). It's possible that the Erlang release in the last few days
> fixed the problems, but I wouldn't rely on it in any sort of
> production scenario, and definitely not with a fresh Erlang release.
>
> If a non-Redis message queue is necessary, I would recommend ActiveMQ.
> If a Redis-backed message queue is desired for use with Python, I've
> only used https://github.com/josiahcarlson/rpqueue , but it doesn't
> currently have an async binding for Tornado.

We've had mostly the opposite experience: RabbitMQ 2.6 has worked
flawlessly, where ActiveMQ caused nothing but pain.  That's for
pushing 1TB of messages a day (inbound; we have many consumers so the
outbound traffic is about 10TB).  Our entire production environment
depends on it, and it's one of the things we have the least trouble
with.

That said, we had some problems with RabbitMQ 2.5 and 2.7 - mostly
related to message TTL.  On the other hand, last I checked, message
TTL in ActiveMQ didn't work at all.

Andrew

--
You received this message because you are subscribed to the Google Groups "Redis DB" group.
To post to this group, send email to redi...@googlegroups.com.
To unsubscribe from this group, send email to redis-db+u...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/redis-db?hl=en.

Re: Messaging queue Josiah Carlson 12/15/11 10:05 PM
We had problems with Rabbit moving 300-400 messages/second about 2
weeks after the Reddit incident, and my friend had problems at
150/second just 6 months ago. Maybe we just had buggy RabbitMQ versions.

That said, if Rabbit has known issues, and has known fixes, it should
include them in the README. In the year and a half since we had our
issues, the only thing I've seen that may have solved our problem was
a message from about a month ago (the Rabbit folks think it was a file
handle exhaustion issue).

How about I change my opinion: try RabbitMQ, ActiveMQ, and one of the
Redis queue modules/packages (resque for Ruby, rpqueue for Python,
etc.). Move messages at 10x the volume you plan on needing in the next
6 months. If it lacks features you need, isn't fast enough, or crashes,
it's not the one for you.

Of course YMMV, IANAL, etc.

Regards,
 - Josiah


Re: Messaging queue Geoffrey Hoffman 12/15/11 10:49 PM
I'm mainly a PHP dev and came to Redis because it's a billion times more useful than Memcache (okay, maybe only a million or 100,000 times). I don't have much experience with MQs, but a friend of mine turned me on to ZeroMQ and it looked pretty incredible. I don't see it mentioned nearly as often as RabbitMQ, ActiveMQ & beanstalkd. Is that just because it's newer and the others are older/more established? I'm curious to hear your experience, or where ZeroMQ fits in or stacks up against Redis.
Re: Messaging queue Demis Bellot 12/15/11 10:50 PM
> We had problems with Rabbit moving 300-400 messages/second about 2 weeks after

Crap, that's pretty poor; I always thought purpose-built Rabbit was better than that.

For anyone who's interested in the implementation details, I've got a naive Redis-backed MqHost running off a single background thread which should be fairly easy to follow at:

The tests show how to use it:

Requests are wrapped in a JSON Message, and it supports an InQ, OutQ, Dlq (Dead Letter Q) and a PriorityQ, which all use Redis lists except for the PriorityQ, which uses a sorted set (to bump priority on important messages).
It uses a central MQ topic to signal the BG thread that there are messages pending, which just triggers it to go and look at all its registered InQs and PriorityQs for sent messages.
Messages are processed using this generic MessageService impl (which is also shared with MQ solutions not using Redis):

It's a naive implementation that's still fairly un-optimized (i.e. it shares a central topic), but using a local redis instance on my dev pc at work I'm getting around 5k m/s, whilst on my 4-year-old iMac at home it's only 1.4-1.5k m/s, so YMMV.

As it only uses a single BG thread, it should only be used with non-blocking solutions, as blocking IO kills perf. If you have blocking IO you should use the ThreadPool implementation.

Hope it helps!

Cheers,
Re: Messaging queue Burak DEDE 12/16/11 2:38 AM

Yes, basically when they are offline (like on Facebook, Twitter, etc.). Actually, you got me with those last words. Tornado already has access to the db (MongoDB), and when one user sends a message to another it's basically inserting a new message into the receiver's data, but that has to happen within a short HTTP request. So I thought: why should the user have to wait for this kind of processing? Just toss it into the queue and return the response to the user, and the queue will handle the insertion in the background. As a matter of fact, I don't yet get the complete benefit of using messaging queue structures. Is it only beneficial on high-load websites where responses must be quick, or is it applicable to mid-level cases too? This is the key to understanding the purpose of message queues; if it doesn't provide any benefit, I'll just skip it and do the insertion directly on every new message-send operation.

And another consideration: I will deploy the application to my Linode VPS, and I am not sure whether MongoDB + Redis pub/sub + some kind of queue will take up all the resources, when I could just go with MongoDB + Redis pub/sub.
Re: Messaging queue Josiah Carlson 12/16/11 9:09 AM
They are relatively different beasts.

You can think of ActiveMQ, RabbitMQ, beanstalkd, etc., as more
"traditional" message queues that behave like the queues in any
standard language. Multiple readers and writers can put and remove
things from the queue, and it behaves more or less like the RPUSH/LPOP
simple queue that is trivially seen in Redis.
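
For example, the whole of that simple Redis queue is roughly the following
(redis-py; the "work-queue" key and the process() handler are placeholders):

    import json
    import redis

    r = redis.Redis(decode_responses=True)

    def enqueue(task):
        # Producer: append the task to the tail of the list.
        r.rpush("work-queue", json.dumps(task))

    def worker():
        while True:
            # Consumer: BLPOP blocks until an item is available, so the
            # worker does not have to poll in a tight loop.
            _key, raw = r.blpop("work-queue")
            process(json.loads(raw))        # placeholder application handler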

ZeroMQ's standard behavior is closer (in concept) to Redis pub/sub, as
ZeroMQ doesn't persist messages that are being routed, and there can
be multiple listeners to the same message. If someone wasn't connected
to hear about a message, they're not going to get it. You can set ZeroMQ
to do round-robin delivery to multiple recipients (each message is only
delivered to a single client, but messages are distributed in a round-robin
fashion), but I don't believe it can be configured to offer you
"traditional" queue semantics (someone please correct me if I am
wrong, I didn't take a huge dive into the docs).
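
For reference, the round-robin mode is ZeroMQ's PUSH/PULL socket pair; a
minimal pyzmq sketch (run the two roles in separate processes; the port is
arbitrary):

    import zmq

    def producer():
        ctx = zmq.Context()
        push = ctx.socket(zmq.PUSH)
        push.bind("tcp://*:5557")
        # Each send goes to exactly one connected PULL socket, rotating
        # round-robin; with no connected workers the send blocks rather
        # than queueing anything to disk.
        for i in range(10):
            push.send_string("task %d" % i)

    def worker():
        ctx = zmq.Context()
        pull = ctx.socket(zmq.PULL)
        pull.connect("tcp://localhost:5557")
        while True:
            print(pull.recv_string())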

Incidentally, because of ZeroMQ's behavior, if you use it, you are
basically building your application to respond to and issue ZeroMQ
messages, and it turns your system into a distributed reactor system.
That's not a problem, it's just different (for those on the Python side
of things, it's a bit like moving to Twisted after doing
forks/threads/asyncore socket servers).

With all that said, ZeroMQ is really fast. It can be really fast
because there is no real request/response cycle; it's all streamed to
the clients immediately. If you need something like that, ZeroMQ can
make it happen. But overall, Redis is more flexible as a message queue
platform (even if it doesn't come configured to offer as much
persistence and failover as ActiveMQ or RabbitMQ provide by
default).

Regards,
 - Josiah

Re: Messaging queue alexis 12/16/11 11:14 AM
Josiah,

On Dec 15, 11:54 pm, Josiah Carlson <josiah.carl...@gmail.com> wrote:
>
> I would heartily discourage RabbitMQ. Reddit had problems with it,

To the best of my knowledge Reddit are still using RabbitMQ, and
happily.


> I
> have had problems with it,

I'm sorry you had problems with it.

Did you report them to the RabbitMQ team?  Typically this makes
problems go away fast.  We help people who ask.

Almost all problems are caused by using one of the hundreds of clients
on the web.  Not all of these are implemented right.  I believe that
the Redis team get as frustrated as we do, when we get blamed for
someone's work that we have no knowledge of ...

There is a subset of problems that are solved by upgrading RabbitMQ.
We are now on 2.7.0 with 2.7.1 due soon.  The move to 2.0 was very
important since it introduced disk paging, allowing a disk-bounded
persistent address space instead of RAM-bounded.

See here: http://www.rabbitmq.com/blog/2011/01/20/rabbitmq-backing-stores-databases-and-disks/


> and I have friends who have problems with
> it (RabbitMQ has died on all of us when sending messages to it "too
> fast").

A messaging system should support multiple fast producers and slow
consumers at scale.  In practice this means:

- disk paging (see above) to flow data to disk when consumers
disappear
- producer back pressure and other 'self protection' modes

Also useful for this may be:

- a management capability
- a way to use multiple machines to scale with different topologies
- various ack & nack modes

Rabbit provides all these but people often do not use them.  This can
lead to problems.

I love Redis and would not want to discourage its use.  But as you
scale, you may find you need the above things.  We make a big effort
to support them well.  As with all this, please make sure you
understand your use case, test software for yourself, and make sure
you ask for help when you want it.

> It's possible that the Erlang release in the last few days
> fixed the problems, but I wouldn't rely on it in any sort of
> production scenario, and definitely not with a fresh Erlang release.

Very many folks are delighted with RabbitMQ in production.

Here are two examples, one talking about availability, one about
throughput (ingress).

http://lists.rabbitmq.com/pipermail/rabbitmq-discuss/2011-July/013981.html
http://lists.rabbitmq.com/pipermail/rabbitmq-discuss/2011-April/012321.html


> If a non-Redis message queue is necessary, I would recommend ActiveMQ.

http://news.ycombinator.com/item?id=2588072

Cheers,

alexis


> Regards,
>  - Josiah

--
You received this message because you are subscribed to the Google Groups "Redis DB" group.
To post to this group, send email to redi...@googlegroups.com.
To unsubscribe from this group, send email to redis-db+u...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/redis-db?hl=en.

Re: Messaging queue alexis 12/16/11 11:18 AM

On Dec 16, 6:05 am, Josiah Carlson <josiah.carl...@gmail.com> wrote:
> We had problems with Rabbit moving 300-400/second about 2 weeks after
> the Reddit incident, and my friend had problems at 150/second just 6
> months ago. Maybe we just had buggy RabbitMQ versions.

The Reddit 'incident' was caused by a set of things, and not as I
recall Rabbit specifically.  I've asked the Reddit guys to clarify
since this is a non-story at best, and old news at worst.

You should be able to get 20,000/second on a single CPU with the Java
or C client, no problem.


> That said, if Rabbit has known issues, and has known fixes, it should
> include them in the README.

We release detailed notes with each release.

Unfortunately we are unable to track every client in the wild...

> In the year and a half since we had
> issues, I just saw a message from about a month ago that may have
> solved our problem (the Rabbit folks think it was a file handle
> exhaustion issue).
>
> How about I change my opinion: try RabbitMQ, ActiveMQ, and one of the
> Redis Queue modules/packages (resque for Ruby, rpqueue for Python,
> etc.). Move messages 10x what you plan on needing in the next 6
> months. If it lacks features you need, isn't fast enough, or crashes,
> it's not the one for you.

This is what I would recommend too.

a


Re: Messaging queue Josiah Carlson 12/16/11 12:37 PM
On Fri, Dec 16, 2011 at 11:14 AM, alexis <alexis.r...@gmail.com> wrote:
> Josiah,
>
> On Dec 15, 11:54 pm, Josiah Carlson <josiah.carl...@gmail.com> wrote:
>>
>> I would heartily discourage RabbitMQ. Reddit had problems with it,
>
> To the best of my knowledge Reddit are still using RabbitMQ, and
> happily.

The last thing I heard about it was:
http://blog.reddit.com/2010/05/reddits-may-2010-state-of-servers.html
, which we read just before our problem. The lesson we took from it,
when we had our problems, was: "Things have improved thus far, but
replacing rabbitmq is at the top end of our extremely long list of
things to do."

We had an engineer that had had zero problems with ActiveMQ at his
previous employer, where it worked well for them for over a year. He
spent a couple hours updating our bindings, and we were good to go.

>> I
>> have had problems with it,
>
> I'm sorry you had problems with it.
>
> Did you report them to the RabbitMQ team?  Typically this makes
> problems go away fast.  We help people who ask.

We read documentation, searched the internet for our exact problems,
and read your responses to the Reddit folks. At the time, the RabbitMQ
response was (paraphrased): we will have a new persistence engine in
the next version which will solve the problem, we don't know when that
version will be released. We needed a solution that day, so we
switched backends.

> Almost all problems are caused by using one of the hundreds of clients
> on the web.  Not all of these are implemented right.  I believe that
> the Redis team get as frustrated as we do, when we get blamed for
> someone's work that we have no knowledge of ...

Indeed.

> There is a subset of problems that are solved by upgrading RabbitMQ.
> We are now on 2.7.0 with 2.7.1 due soon.  The move to 2.0 was very
> important since it introduced disk paging, allowing a disk-bounded
> persistent address space instead of RAM-bounded.
>
> See here: http://www.rabbitmq.com/blog/2011/01/20/rabbitmq-backing-stores-databases-and-disks/
>
>
>> and I have friends who have problems with
>> it (RabbitMQ has died on all of us when sending messages to it "too
>> fast").
>
> A messaging system should support multiple fast producers and slow
> consumers at scale.  In practice this means:
>
> - disk paging (see above) to flow data to disk when consumers
> disappear
> - producer back pressure and other 'self protection' modes
>
> Also useful for this may be:
>
> - a management capability
> - a way to use multiple machines to scale with different topologies
> - various ack & nack modes
>
> Rabbit provides all these but people often do not use them.  This can
> lead to problems.
>
> I love Redis and would not want to discourage its use.  But as you
> scale, you may find you need the above things.  We make a big effort
> to support them well.  As with all this, please make sure you
> understand your use case, test software for yourself, and make sure
> you ask for help when you want it.

It's great that it has all of those options now. But, having been bit
in the ass in the past and lost data, I'm not one to go back to that
same abusive mistress, even if "she is better now, no more hitting
you!".

Also, those benchmarks are in bulk streaming mode (rather than 1 at a
time) and are not applicable to what I would consider to be the vast
majority of message queue applications. In that mode, everything looks
great. In particular, you may as well use something like ZeroMQ and
move millions of messages/second.

>> It's possible that the Erlang release in the last few days
>> fixed the problems, but I wouldn't rely on it in any sort of
>> production scenario, and definitely not with a fresh Erlang release.
>
> Very many folks are delighted with RabbitMQ in production.
>
> Here are two examples, one talking about availability, one about
> throughput (ingress).
>
> http://lists.rabbitmq.com/pipermail/rabbitmq-discuss/2011-July/013981.html
> http://lists.rabbitmq.com/pipermail/rabbitmq-discuss/2011-April/012321.html
>
>
>> If a non-Redis message queue is necessary, I would recommend ActiveMQ.
>
> http://news.ycombinator.com/item?id=2588072

Note the quote from the article: "well, ActiveMQ is used by lots of
very happy users, so you must be doing something wrong."

Note your quote from above: "Very many folks are delighted with
RabbitMQ in production." and "Rabbit provides all these but people
often do not use them.  This can lead to problems."

I'm seeing no significant difference in your words vs. the words from
the ActiveMQ folks, and can only go by my experience.

My experience differs from others. It's like going to a restaurant,
maybe the waitress had a bad day, and that's why my strawberry
milkshake comes to me as strawberry milk. I'm not going to go back.
But you go in on a different day, get the best apple pie you've ever
eaten, and tell everyone about it. We will have to agree to disagree.

Regards,
 - Josiah


Re: Messaging queue Josiah Carlson 12/16/11 12:39 PM
We were using RabbitMQ with Celery. My friend was using the same.
Incidentally, that's the same software you were just offering as a
solution.

YMMV indeed.

Regards,
 - Josiah

On Fri, Dec 16, 2011 at 11:20 AM, alexis <alexis.r...@gmail.com> wrote:
> 300/second means you have enormous messages or are doing something
> wrong.  Most likely it's a rotted client.
>
> alexis

Re: Messaging queue Jak Sprats 12/16/11 12:48 PM
Hi Josiah,

> you could just as
> easily get similar performance from PostgreSQL or MySQL*.

This isn't entirely accurate. For short requests, the single-threaded
event-driven architecture (the c10K architecture) clearly outperforms
the multi-threaded architecture. So tweaking Postgres or MySQL will
only get you so far in terms of throughput, concurrency, and latency.

I think MongoDB has the c10K architecture ... but I don't know.

- jak

--
You received this message because you are subscribed to the Google Groups "Redis DB" group.
To post to this group, send email to redi...@googlegroups.com.
To unsubscribe from this group, send email to redis-db+u...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/redis-db?hl=en.

Re: Messaging queue Josiah Carlson 12/16/11 1:14 PM
When you are socket IO bound by many parallel read requests against an
in-memory dataset, the c10k architecture can win*. But if you are
syncing every write to disk, they all perform within a few percent of
each other.

Case in point on the other side of things: I met some folks who
disabled the syncing options in PostgreSQL and were pushing over 100k
writes/second - the same kinds of numbers (and even a bit higher) as
have been reported from LevelDB, direct InnoDB, SQLite without
syncing, Riak, MongoDB, etc.

* the write performance of Postgres with syncing disabled suggests
that with modern hardware, the c10k problem isn't really much of a
problem as long as you minimize shared resource contention. Heck, that
there are 3 different Python (considered by many to be slow)
webservers (one is a multi-process + worker threads model) that can
handle over 15k concurrent connections without serious issue, kind-of
tells me that for the majority of loads nowadays, c10k with async
isn't technically necessary:
http://nichol.as/benchmark-of-python-web-servers (check the Tsung
benchmark).

Regards,
 - Josiah

Re: Messaging queue Burak DEDE 12/16/11 3:03 PM
Thanks for all the answers. I want to make some points clear that were not clear at the beginning of the thread.

I am looking for an "offline" user messaging solution, not realtime: whether it's processed by a message queue or not, it will persist those messages into my db at the end so that users can look them up again. RabbitMQ (the Pika client, which has a Tornado connection, looks good) and Celery are definitely worth checking out.

My main concerns are just these two:

1. Should I use a message queue to enqueue messages and process them later, or just insert messages directly into my MongoDB, without processing them in a queue, within the single HTTP request when a user posts a new one? (I am looking for the more performant option here; my user messages cannot tolerate loss, by the way.)
2. If I should use a message queue, will the Tornado + MongoDB + brukva (async Redis pub/sub) + message queue combination take up too much resource on the server side?
--
Burak DEDE
www.burakdede.com
www.twitter.com/burakdede
www.friendfeed.com/burakdede

Re: Messaging queue Burak DEDE 12/16/11 3:57 PM


On Sat, Dec 17, 2011 at 1:13 AM, Josiah Carlson <josiah....@gmail.com> wrote:
On Fri, Dec 16, 2011 at 3:03 PM, Burak DEDE <burak...@gmail.com> wrote:
> Thanks for all the answers. I want to make some points clear that were not
> clear at the beginning of the thread.
>
> I am looking for an "offline" user messaging solution, not realtime: whether
> it's processed by a message queue or not, it will persist those messages into
> my db at the end so that users can look them up again. RabbitMQ (the Pika
> client, which has a Tornado connection, looks good) and Celery are definitely
> worth checking out.
>
> My main concerns are just these two:
>
> 1. Should I use a message queue to enqueue messages and process them later,
> or just insert messages directly into my MongoDB, without processing them in
> a queue, within the single HTTP request when a user posts a new one? (I am
> looking for the more performant option here; my user messages cannot tolerate loss, by the way.)

If there are situations where you might need to handle more messages
than you can write *now*, but which you may be able to catch up on
later, use a message queue.
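
A sketch of that split in Tornado terms (a plain Redis list stands in for the
queue here, and db.save_message is a placeholder; the synchronous RPUSH is
used for brevity where an async client such as brukva would avoid blocking
the IOLoop):

    import json
    import redis
    import tornado.web

    r = redis.Redis(decode_responses=True)

    class SendMessageHandler(tornado.web.RequestHandler):
        def post(self):
            # Accept the message right away; a worker persists it later.
            payload = {
                "from": self.get_argument("sender"),
                "to": self.get_argument("recipient"),
                "text": self.get_argument("text"),
            }
            r.rpush("private-messages", json.dumps(payload))
            self.write({"status": "queued"})

    def drain_worker(db):
        # Runs in a separate process: pop messages and write them to the DB.
        while True:
            _key, raw = r.blpop("private-messages")
            db.save_message(json.loads(raw))    # placeholder persistence call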

Thanks, I am looking into RabbitMQ. I am thinking of using one queue for private messages, say named "messages": when a new message arrives, push it into the queue, and multiple workers pop from the queue, check the user id, and insert into the database. Is this a valid use case?

 

> 2. If I should use a message queue, will the Tornado + MongoDB + brukva
> (async Redis pub/sub) + message queue combination take up too much resource
> on the server side?

It depends on your box and the number of clients. A message queue by
itself doesn't need to take a lot of resources, and most do pretty
well on modest loads (fewer than 1k messages/second). Assuming your
queue is well-behaved, MongoDB will be the first thing to run out of
resources.

One huge bit to note: 32-bit MongoDB has a maximum database size of 3
gigs. Use a 64 bit box and MongoDB if you want to store more data.

I know MongoDB can take a lot of resources; that's one thing I knew when I started with it.
Re: Messaging queue Jak Sprats 12/17/11 3:52 AM
Hi Josiah,

good points and a good link.

I have had the experience that the c10K architecture is the optimal
architecture on a per core basis. I built a novel webserver into redis
and it outperforms any of the webservers in that comparison chart on
many of the key metrics, because it is just really really simple (i.e.
very few instructions to do a [parse-reply,write-response]) and it is
based on the c10K architecture.

Benchmarks are hard to do perfectly, so I will keep my ear to the
ground, to hear if the c10K architecture is on its way down :)

- jak


Re: Messaging queue alexis 12/17/11 11:54 AM
Burak

On Dec 16, 11:57 pm, Burak DEDE <burakded...@gmail.com> wrote:
>
> > > My main concerns are just these two:
>
> > > 1. Should I use a message queue to enqueue messages and process them later

This will work for your case.

> Thanks, I am looking into RabbitMQ. I am thinking of using one queue for
> private messages, say named "messages": when a new message arrives, push it
> into the queue, and multiple workers pop from the queue, check the user id,
> and insert into the database. Is this a valid use case?

Yes, if you want to put the messages in a DB to make them queryable or
archived.  But for chat, you don't need this in all cases.  The
Celery project that I mentioned provides "worker" semantics.  It can
use RabbitMQ or Redis.
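
For illustration, a task along these lines (the broker URL and the
save_message body are placeholders; Celery's own docs cover the worker setup):

    from celery import Celery

    # Celery can use RabbitMQ (amqp://) or Redis (redis://) as its broker.
    app = Celery("messages", broker="amqp://guest@localhost//")

    def save_message(sender_id, recipient_id, text):
        # Placeholder: replace with the real MongoDB insert.
        pass

    @app.task
    def deliver_private_message(sender_id, recipient_id, text):
        # Runs in a Celery worker process, off the web request path.
        save_message(sender_id, recipient_id, text)

    # In the web handler, enqueue instead of writing to the DB inline:
    #   deliver_private_message.delay(sender_id, recipient_id, text)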

> > > 2. If I should use a message queue, will the Tornado + MongoDB + brukva
> > > (async Redis pub/sub) + message queue combination take up too much
> > > resource on the server side?
>
> > It depends on your box and the number of clients. A message queue by
> > itself doesn't need to take a lot of resources, and most do pretty
> > well on modest loads (fewer than 1k messages/second). Assuming your
> > queue is well-behaved, MongoDB will be the first thing to run out of
> > resources.
>
> > One huge bit to note: 32-bit MongoDB has a maximum database size of 3
> > gigs. Use a 64 bit box and MongoDB if you want to store more data.
>
> I know MongoDB can take a lot of resources; that's one thing I knew when
> I started with it.

RabbitMQ also uses more resources than Redis.  But, you can always use
more machines.  Rabbit is pretty lightweight in resource use compared
to lots of software ;-)

I don't think you need Mongo here.  Try Tornado+Rabbit first...

alexis

--
You received this message because you are subscribed to the Google Groups "Redis DB" group.
To post to this group, send email to redi...@googlegroups.com.
To unsubscribe from this group, send email to redis-db+u...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/redis-db?hl=en.

Re: Messaging queue alexis 12/17/11 12:30 PM
Josiah,


On Dec 17, 9:01 am, Josiah Carlson <josiah.carl...@gmail.com> wrote:
> On Fri, Dec 16, 2011 at 6:42 PM, alexis <alexis.richard...@gmail.com> wrote:
> > But it was not replaced, AFAIK, because it works..
>
> I'm glad that they made it work.

I think the main issue was at the system level, not individual products.
They were also using a very early version of Cassandra...

> > Ok, so you are talking about the persistence engine.  This was added
> > in 2.0 in late summer 2010.  Redis and 0mq do not have this feature.
> > It is very hard to implement well.
>
> Redis has had optional AOF persistence on every write (or every
> second) for as long as I've used it (somewhere around 20 months by
> now).

AOF persistence is a different thing.

You often want to deliver the message, before trying to persist.  You
persist only if you can't deliver.  This is the *number one*
difference between a cache (eg Redis) and a messaging system (eg
Rabbit).

Also, persist-per-write is disk-bound and thus slow.  You need to
batch writes.  You need to ack asynchronously (though after the flush
to disk in order to maintain data safety).  You may need to batch
acks.

Journalling is also not very scalable, due to the read-time of massive
logs on recovery from failure.  See, eg, Mongo.

> If you are talking about paging, Redis also has had the (now
> deprecated) VM for as long as I've used it.

Indeed.

I am talking about a very general purpose solution to the problem of
pumping as much data as you like into one or millions of queues, with
no consumers, indefinitely, without serious performance degradation.


> Incidentally, edis (an Erlang implementation of the Redis server) that
> was released sometime in the last couple days, has a LevelDB backend
> for on-disk storage, so also has persistence for a Redis-protocol
> queue system.

Yes!  This looks cool.   I wonder if Salvatore's protocol will achieve
wider use due to this.  I'd like to see Edis as a RabbitMQ plugin, so
that you can speak the Redis protocol with Rabbit, and get the extra
features and protocols Rabbit provides.

I looked at LevelDB.  Did you know Rabbit includes its own DB?  I
wonder how they compare.  Rabbit also works with (or did work with)
one of the 'Tokyo' stores.

> Incidentally, I'm hoping that Salvatore sees the error in his ways and
> decides to use InnoDB or LevelDB for "paging", "dumps", etc. The
> benchmarks say that InnoDB is more consistent over the long-term for
> writes (and reclaiming unused disk space) than LevelDB, but LevelDB is
> the "new hotness" thanks to Google.

But do you agree that he made the right call picking clustering over
paging?  I think he did.  Memory and networks are fast, and one day we
won't use disks at all.


> >> Also, those benchmarks are in bulk streaming mode (rather than 1 at a
> >> time) and are not applicable to what I would consider to be the vast
> >> majority of message queue applications.
>
> > What do you mean?  I have no idea what you are referring to, sorry.
> > Please can you describe what the "vast majority of message queue
> > applications" do.
>
> Put a message in a queue. Someone else consumes it. Not necessarily
> immediately. And the queue is able to recover after node failure and
> recovery. That's what I would consider to be a "traditional" message
> queue, and it is what I would consider to be the primary use-case for
> message queues and the consumers thereof (not caring about message
> persistence between unexpected failures is fairly uncommon in my
> experience, which is why I ask people about it when they bring it up
> here). Basically #2 with durability from http://www.rabbitmq.com/getstarted.html
> (#6 is just 2 queues configured in a certain way)

Understood, but Rabbit is much more than that...  Though, yes, it
should do this job well and I think it does.

> Systems that implement something similar to the publish/subscribe
> semantic, like ZeroMQ, and RabbitMQ in configuration #3, #4, and #5
> from the getting started page, etc., I wouldn't consider to be message
> queues (in that particular configuration). Why? Because the "queue"
> used most commonly in plain English and CS parlance both have
> effectively the same definition. That definition is substantively
> different from the semantics of a publish/subscribe system.

Both JMS and AMQP expect a broker to provide queues, pubsub, and
various combinations thereof.  Moreover most modern messaging systems
do more than that.  This is because many people want a general purpose
data delivery system or 'network'.  This has different requirements
from a typical store (see my comments about Redis vs Rabbit above).


> More to the point:
> * If you have persistence, and are claiming 30k messages/second
> between nodes, then it's lazy persistence.

Whatever technology you use, the disk speed is going to be your enemy
here.  RabbitMQ can safely persist 3,000-5,000 messages per second to
disk on a standard machine you can purchase in the High St.  If you
allow for lazy persistence, the same machine will do 10-12,000/sec.
If you turn off persistence, you should get around 20k/sec per CPU.
To get higher throughput, add more CPUs or cores...


> * If it's without persistence, then Rabbit loses many of the benefits
> that you are claiming.

Like what?


> * If it's without persistence, then Rabbit doesn't come close to
> coming out on top in terms of performance for a non-persistent message
> broker.

ZeroMQ is not a broker.   That's the whole point.

> >> In that mode, everything looks
> >> great. In particular, you may as well use something like ZeroMQ and
> >> move millions of messages/second.
>
> > Please explain.
>
> http://www.zeromq.org/results:10gbe-tests-v031
>
> ZeroMQ can move millions of non-persisted (pub/sub style)
> messages/second through a single remote network node.

So can TCP.  What's the value here?  I am not trying to knock ZeroMQ,
which is a fine piece of software.


> If you want them
> persisted: ZeroMQ + round-robin distribution + a few LevelDB or InnoDB
> writers -> millions of messages persisted to disk/second.


I don't think it's that simple...

1) Which disk writers can safely write millions of messages to disk
per second, across multiple use cases, without making latency
skyrocket?  None that I know of, at least in 2011.

2) How do you ack from disk?  Is it sync or async?  Is it batched?

3) How does this behave when you have 2Tb of data on disk?  Or when
you have producer spikes?  Etc etc.

Rabbit tries to be a good general purpose solution to all this,
without loss of safety, and with good perf in 'most' cases.  And it
hopes to be very quick to install and easy to use.  I know it is
imperfect, but most people say they are very happy with it.  Please do
tell us how to make it better.

Redis also works well for a lot of this.


>  Once you
> have them on disk, you just have another set of listeners for
> requests, that respond by pulling persisted messages from the DB
> stores and sending to the proper endpoint.

This sounds painful.  Why not just push?


> It gets its speed from the fast persistence layer, distributed
> topology, and arguably best-in-breed network IO subsystem.

Combining 2 or more 'fast' things does not create one fast thing.

Rabbit gets what speed it has from integrating persistence,
distribution, and IO.


> >> My experience differs from others. It's like going to a restaurant,
> >> maybe the waitress had a bad day, and that's why my strawberry
> >> milkshake comes to me as strawberry milk. I'm not going to go back.
> >> But you go in on a different day, get the best apple pie you've ever
> >> eaten, and tell everyone about it. We will have to agree to disagree.
>
> > You recommended Redis and ZeroMQ.  Neither has disk paging, flow
> > control, management, and other features that matter under load.  They
> > are both great technologies.
>
> At no point in this thread did I recommend ZeroMQ.

Apologies; you suggested rather than recommended.


> ..However, I did explain what it
> is able to do, and explained that it was closer to Redis pub/sub than
> a "queue". In other threads I may have suggested it, but only because
> I thought that it might fit the application better than the solution
> they were currently examining.
>
> You are right about me recommending Redis. For many use-cases,
> enabling Redis AOF and using a list is really all most people need -
> especially when they've already got Redis in their stack (fewer moving
> pieces, fewer potential failures, fewer potentially buggy client
> libraries, all leading to easier maintenance).

Journalling has several problems, eg:

1. You need to flush from memory too.  Copying to disk is not enough.

2. Replay from disk does not scale.

3. Different flows need different write strategies.

> There are many things
> that can be said about Redis, but I've never encountered a problem
> with Redis crashing that wasn't explicitly my fault (breaking 3 gigs
> on a 32 bit box, personal untested patches, etc.).

Prior to 2.0, Rabbit used AOF style journalling.  It was accused of
"failing" when people tried to put more data into it than the machine
could cope with.  This led us to have to cope with complaints similar to
yours.  I'm not sure what we could have done better ...


> You are also right that Redis doesn't have flow control, HA,
> management, etc. And those things very well may be necessary in some
> of the more extreme scenarios (for 7 months I used a 150 million item
> ZSET in Redis as a time-based priority queue for items that had
> varying refresh times; and could consume it at a rate of 50k+/second
> when necessary). However, the OP has a single Linode VPS, and I don't
> think he has any need for any of the "high-end" features of RabbitMQ.

If the OP has offline consumers, he will probably need flow control.

alexis

> Regards,
>  - Josiah

--
You received this message because you are subscribed to the Google Groups "Redis DB" group.
To post to this group, send email to redi...@googlegroups.com.
To unsubscribe from this group, send email to redis-db+u...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/redis-db?hl=en.

Re: Messaging queue Josiah Carlson 12/17/11 2:37 PM
On Sat, Dec 17, 2011 at 3:52 AM, Jak Sprats <jaks...@gmail.com> wrote:
> good points and a good link.
>
> I have had the experience that the c10K architecture is the optimal
> architecture on a per core basis. I built a novel webserver into redis
> and it outperforms any of the webservers in that comparison chart on
> many of the key metrics, because it is just really really simple (i.e.
> very few instructions to do a [parse-reply,write-response]) and it is
> based on the c10K architecture.

Starting out with something that can move a lot of bits fast is the
first step, minimizing wasteful overhead (data copies, context
switches, slow languages, etc.) is the next. Part of the reason that
Python servers aren't able to do much better than a few tens of
thousands of requests is the context switch and data copy
problem.

> Benchmarks are hard to do perfectly, so I will keep my ear to the
> ground, to hear if the c10K architecture is on its way down :)

I've seen good success with taking the c10k architecture and running
it with worker threads/processes that actually handle the processing.

 - Josiah

Re: Messaging queue Josiah Carlson 12/17/11 4:15 PM
On Sat, Dec 17, 2011 at 12:30 PM, alexis <alexis.r...@gmail.com> wrote:
> On Dec 17, 9:01 am, Josiah Carlson <josiah.carl...@gmail.com> wrote:
>> On Fri, Dec 16, 2011 at 6:42 PM, alexis <alexis.richard...@gmail.com> wrote:
>> > Ok, so you are talking about the persistence engine.  This was added
>> > in 2.0 in late summer 2010.  Redis and 0mq do not have this feature.
>> > It is very hard to implement well.
>>
>> Redis has had optional AOF persistence on every write (or every
>> second) for as long as I've used it (somewhere around 20 months by
>> now).
>
> AOF persistence is a different thing.

It is, but your implication was that Redis lacked any sort of
real-time persistence.

> You often want to deliver the message, before trying to persist.  You
> persist only if you can't deliver.  This is the *number one*
> difference between a cache (eg Redis) and a messaging system (eg
> Rabbit).

It smells like it gets a little nasty with the involvement of acks.
From a "keep it simple" engineering perspective, syncing everything
doesn't seem like such a bad idea.

> Also, persist-per-write is disk-bound and thus slow.  You need to
> batch writes.  You need to ack asynchronously (though after the flush
> to disk in order to maintain data safety).  You may need to batch
> acks.

Indeed, I also mentioned this.

> Journalling is also not very scalable, due to the read-time of massive
> logs on recovery from failure.  See, eg, Mongo.

MySQL and PostgreSQL both use journaling and manage to recover from
failure very quickly. Mongo is still working on ... coming up to the
features that established software offers. Redis right now does pretty
well with AOF rewriting, and indeed it could be better, but that
hasn't been its focus.

>> If you are talking about paging, Redis also has had the (now
>> deprecated) VM for as long as I've used it.
>
> Indeed.
>
> I am talking about a very general purpose solution to the problem of
> pumping as much data as you like into one or millions of queues, with
> no consumers, indefinitely, without serious performance degradation.

Redis with a reworked VM could do the same, which is what I was
getting at by bringing up edis and Redis diskstore.

>> Incidentally, edis (an Erlang implementation of the Redis server) that
>> was released sometime in the last couple days, has a LevelDB backend
>> for on-disk storage, so also has persistence for a Redis-protocol
>> queue system.
>
> Yes!  This looks cool.   I wonder if Salvatore's protocol will achieve
> wider use due to this.  I'd like to see Edis as a RabbitMQ plugin, so
> that you can speak the Redis protocol with Rabbit, and get the extra
> features and protocols Rabbit provides.
>
> I looked at LevelDB.  Did you know Rabbit includes its own DB?  I
> wonder how they compare.  Rabbit also works with (or did work with)
> one of the 'Tokyo' stores.

I haven't really paid a lot of attention to Rabbit in the year and a
half since we stopped using it.

There have been benchmarks of LevelDB against InnoDB that show
performance over time (I wish I could find that article I found a
little over a month ago...) If I remember correctly, LevelDB will peak
about 50% higher than InnoDB, but InnoDB will do consistent
high-throughput writes around 20% higher than LevelDB's consistent
throughput. I can't remember read rates.

Some of the Tokyo stores are pretty fast on writes too, and LevelDB's
own benchmarks show that a Kyoto store can be faster than LevelDB.

>> Incidentally, I'm hoping that Salvatore sees the error in his ways and
>> decides to use InnoDB or LevelDB for "paging", "dumps", etc. The
>> benchmarks say that InnoDB is more consistent over the long-term for
>> writes (and reclaiming unused disk space) than LevelDB, but LevelDB is
>> the "new hotness" thanks to Google.
>
> But do you agree that he made the right call picking clustering over
> paging?  I think he did.  Memory and networks are fast, and one day we
> won't use disks at all.

I think that taking the time to build a B-Tree database, without any
new or innovative research pointing him to a type of store that could
be more performant than anything seen before, was a 6-month delay to
clustering, scripting, etc. I agree that dropping that branch was a
good idea.

On the other hand, I do think that if Salvatore had just borrowed
someone else's B+Tree implementation (SQLite has a great one that is
public domain, InnoDB is solid, etc.), it would have been a 1-month
distraction that could have addressed all of the shortcomings with VM,
offered live dumps, etc., and increased Redis' long-term marketshare.

Even with fast networks, cheap RAM, etc., disk space doubles every 12
months, which far outpaces just about any other technology, including
RAM and SSDs.

>> Put a message in a queue. Someone else consumes it. Not necessarily
>> immediately. And the queue is able to recover after node failure and
>> recovery. That's what I would consider to be a "traditional" message
>> queue, and it is what I would consider to be the primary use-case for
>> message queues and the consumers thereof (not caring about message
>> persistence between unexpected failures is fairly uncommon in my
>> experience, which is why I ask people about it when they bring it up
>> here). Basically #2 with durability from http://www.rabbitmq.com/getstarted.html (#6 is just 2 queues
>> configured in a certain way)
>
> Understood, but Rabbit is much more than that...  Though, yes, it
> should do this job well and I think it does.

It didn't always do its job well. ;)

> Both JMS and AMQP desire a broker to provide queues, pubsub, and
> various combinations thereof.  Moreover most modern messaging systems
> do more than that.  This is because many people want a general purpose
> data delivery system or 'network'.  This has different requirements
> from a typical store (see my comments about Redis vs Rabbit above).

Indeed. Redis, as a store, incidentally offers many of the same
features and benefits (not all) that are claimed by RabbitMQ. Are
there cases where RabbitMQ would be better suited? Assuming it doesn't
crash :P, absolutely. Is this the case for the OP? I'd bet a pitcher
of beer that the OP's use-case would find no discernible difference
between the effectiveness of RabbitMQ and Redis with per-second AOF
writes and 15-minute AOF rewrites.

>> * If it's without persistence, then Rabbit loses many of the benefits
>> that you are claiming.
>
> Like what?

Recovery from unexpected crashes, high-availability, reliable message
delivery, etc.

>> * If it's without persistence, then Rabbit doesn't come close to
>> coming out on top in terms of performance for a non-persistent message
>> broker.
>
> ZeroMQ is not a broker.   That's the whole point.

Apollo is competitive (and seems to be potentially more consistent
throughput-wise than RabbitMQ according to
http://hiramchirino.com/stomp-benchmark/ec2-c1.xlarge/index.html#queue_load_unload
),  and OpenAMQ can do over 100k messages/second. Both are brokers.
The latter is written by the same folks as ZeroMQ.

>> http://www.zeromq.org/results:10gbe-tests-v031
>>
>> ZeroMQ can move millions of non-persisted (pub/sub style)
>> messages/second through a single remote network node.
>
> So can TCP.  What's the value here?  I am not trying to knock ZeroMQ,
> which is a fine piece of software.

I hope we aren't going in circles. If you are saying that super high
performance pub/sub style delivery isn't useful, then 3 of the 6
examples on the "getting started" page on the Rabbit site are
unnecessary. The point of high-speed delivery is the high-speed
delivery; it's the ability to move messages faster so that you are
waiting less and aren't wasting time waiting on delivery. More to the
point, if a person's use-case is distributed pubsub delivery with
reactors/etc, ZeroMQ is effectively ideal - far more so (arguably)
than the pubsub capabilities of Rabbit.

>> If you want them
>> persisted: ZeroMQ + round-robin distribution + a few LevelDB or InnoDB
>> writers -> millions of messages persisted to disk/second.
>
> I don't think it's that simple...
>
> 1) Which disk writers can safely write millions of messages to disk
> per second, across multiple use cases, without making latency
> skyrocket?  None that I know of, at least in 2011.

With one sync per second, InnoDB can do writes in the 100k+/second
range (as can Redis, incidentally). It takes 10 spinning disks to do 1
million/second. You can scale that up almost linearly, assuming you
have network bandwidth and spinning disks. It really is that easy.

> 2) How do you ack from disk?  Is it sync or async?  Is it batched?

An ack is the same as a write. You don't need to reclaim the old data,
you just need to make sure that your B+Tree knows that the key doesn't
exist anymore. Since you just read the message from disk, you have
already cached all of the relevant nodes, so you can do the COW B+Tree
update just like every viable database already does.

> 3) How does this behave when you have 2Tb of data on disk?  Or when
> you have producer spikes?  Etc etc.

Producer spikes aren't a problem, assuming you aren't running at peak
capacity at the beginning.

> Rabbit tries to be a good general purpose solution to all this,
> without loss of safety, and with good perf in 'most' cases.  And it
> hopes to be very quick to install and easy to use.  I know it is
> imperfect, but most people say they are very happy with it.  Please do
> tell us how to make it better.

Have recommended client libraries that are known to work well. Make
sure that the default configuration doesn't crash, and that any
configuration option that could lead to instability carries a
multi-line warning along the lines of "WARNING: CHANGING THIS CAN
CAUSE INSTABILITY AND CAUSE YOU TO LOSE DATA."

>>  Once you
>> have them on disk, you just have another set of listeners for
>> requests, that respond by pulling persisted messages from the DB
>> stores and sending to the proper endpoint.
>
> This sounds painful.  Why not just push?

I was talking about using ZeroMQ as a method of building a traditional
queue system with persistence, which includes the potential that
consumers aren't going to be faster than producers, necessitating the
persisting of messages.

>> It gets its speed from the fast persistence layer, distributed
>> topology, and arguably best-in-breed network IO subsystem.
>
> Combining 2 or more 'fast' things does not create one fast thing.

My assumption is that there is a competent engineer putting these
things together. LevelDB is trivial to use, as is ZeroMQ. Both are
quite performant. It doesn't seem to be terribly difficult to combine
the two with some straightforward semantics and get something that can
behave like a brokerless persisted queue. Could I be wrong? Sure. But
I've not read anything that would suggest that it is impossible, or
even difficult.
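
Just to sketch the shape of it (assuming pyzmq and a LevelDB binding
like plyvel; the path, port and batch size are invented, and
acks/redelivery are left out entirely):

    import struct
    import zmq
    import plyvel

    db = plyvel.DB('/var/data/queue-shard-0', create_if_missing=True)
    ctx = zmq.Context()
    pull = ctx.socket(zmq.PULL)      # producers PUSH to us, round-robin
    pull.bind('tcp://*:5555')

    seq = 0
    while True:
        # take whatever is available, up to a modest batch size
        batch = [pull.recv()]
        while len(batch) < 1000:
            try:
                batch.append(pull.recv(zmq.NOBLOCK))
            except zmq.ZMQError:
                break
        # one synced, batched write covers the whole group of messages
        with db.write_batch(sync=True) as wb:
            for msg in batch:
                wb.put(struct.pack('>Q', seq), msg)
                seq += 1

Consumers would just be the mirror image: read a key range, hand the
messages out, and delete (or mark) them once acked.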

>> > You recommended Redis and ZeroMQ.  Neither has disk paging, flow
>> > control, management, and other features that matter under load.  They
>> > are both great technologies.
>>
>> At no point in this thread did I recommend ZeroMQ.
>
> Apologies; you suggested rather than recommended.

I didn't suggest it either. The OP asked what ZeroMQ did, because it
has some pretty impressive claims (over 1M messages delivered/second).
I wrote a few long paragraphs explaining what it did. The only point
where I even offered ZeroMQ as an alternative was in the context of
this argument, which is unrelated to recommending or suggesting its
use to the OP.

>> You are right about me recommending Redis. For many use-cases,
>> enabling Redis AOF and using a list is really all most people need -
>> especially when they've already got Redis in their stack (fewer moving
>> pieces, fewer potential failures, fewer potentially buggy client
>> libraries, all leading to easier maintenance).
>
> Journalling has several problems, eg:
>
> 1. You need to flush from memory too.  Copying to disk is not enough.

Redis fsyncs.

> 2. Replay from disk does not scale.
>
> 3. Different flows need different write strategies.

All valid points, but Redis has AOF rewriting, which is sort of a
misnomer: Redis unstable actually re-dumps the database in a format
that is about as fast to reload as a standard dump (the old version
could be slower).

>> There are many things
>> that can be said about Redis, but I've never encountered a problem
>> with Redis crashing that wasn't explicitly my fault (breaking 3 gigs
>> on a 32 bit box, personal untested patches, etc.).
>
> Prior to 2.0, Rabbit used AOF-style journalling.  It was accused of
> "failing" when people tried to put more data into it than the machine
> could cope with.  This led us to have to cope with complaints similar
> to yours.  I'm not sure what we could have done better ...

Offered options: sync-every-write and sync-every-second modes, and
delayed acknowledgement until messages get to disk. All stuff that you
do now, but which should have been done before releasing the original
version.

>> You are also right that Redis doesn't have flow control, HA,
>> management, etc. And those things very well may be necessary in some
>> of the more extreme scenarios (for 7 months I used a 150 million item
>> ZSET in Redis as a time-based priority queue for items that had
>> varying refresh times; and could consume it at a rate of 50k+/second
>> when necessary). However, the OP has a single Linode VPS, and I don't
>> think he has any need for any of the "high-end" features of RabbitMQ.
>
> If the OP has offline consumers, he will probably need flow control.

I guess we are talking about different "flow control" concepts. This
is close to the meaning I have in my head:
http://en.wikipedia.org/wiki/Flow_control . In the context of a
"traditional" message queue, flow control is implied by "I'll only
get something from the queue when I ask for it". That is trivially
accomplished in Redis by not using pubsub and instead using lists.
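
To make that concrete, the whole pattern is only a few lines with
redis-py (a minimal sketch; the key name and the deliver() function
are made up for illustration):

    import json
    import redis

    r = redis.StrictRedis(host='localhost', port=6379)

    # producer: runs inside the web request; it only enqueues and returns
    def send_private_message(sender, recipient, body):
        r.lpush('queue:private_messages',
                json.dumps({'from': sender, 'to': recipient, 'body': body}))

    # consumer: a background worker; BRPOP blocks until something arrives,
    # so the worker only receives work when it asks for it
    def worker():
        while True:
            _key, raw = r.brpop('queue:private_messages')
            deliver(json.loads(raw))  # application-specific delivery/storage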

 - Josiah


Re: Messaging queue Josiah Carlson 12/17/11 4:17 PM
On Sat, Dec 17, 2011 at 11:54 AM, alexis <alexis.r...@gmail.com> wrote:
> Burak
> On Dec 16, 11:57 pm, Burak DEDE <burakded...@gmail.com> wrote:
>> I know Mongodb can take lots of resources thats one thing I knew when
>> started with it.
>
> RabbitMQ also uses more resources than Redis.  But, you can always use
> more machines.  Rabbit is pretty lightweight in resource use compared
> to lots of software ;-)
>
> I don't think you need Mongo here.  Try Tornado+Rabbit first...

I think he also wanted offline persistent direct messages. Presumably
he's also got user accounts, so he needs some kind of database.

Regards,
 - Josiah


Re: Messaging queue alexis 12/17/11 4:45 PM
Burak, Josiah,

On Dec 18, 12:17 am, Josiah Carlson <josiah.carl...@gmail.com> wrote:


> On Sat, Dec 17, 2011 at 11:54 AM, alexis <alexis.richard...@gmail.com> wrote:
> > Burak
> > On Dec 16, 11:57 pm, Burak DEDE <burakded...@gmail.com> wrote:
> >> I know Mongodb can take lots of resources thats one thing I knew when
> >> started with it.
>
> > RabbitMQ also uses more resources than Redis.  But, you can always use
> > more machines.  Rabbit is pretty lightweight in resource use compared
> > to lots of software ;-)
>
> > I don't think you need Mongo here.  Try Tornado+Rabbit first...
>
> I think he also wanted offline persistent direct messages. Presumably
> he's also got user accounts, so he needs some kind of database.

RabbitMQ includes a data store suitable for direct messages and their
routing information.


Re: Messaging queue alexis 12/17/11 5:25 PM
Josiah,

I'll try to be brief since this is now getting a bit OT for the group.

On Dec 18, 12:15 am, Josiah Carlson <josiah.carl...@gmail.com> wrote:
>
> It smells like it gets a little nasty with the involvement of acks.
> From a "keep it simple" engineering perspective, syncing everything
> doesn't seem like such a bad idea.

Sync'ing is too slow.

> MySQL and PostgreSQL both use journaling and manage to recover from
> failure very quickly.

Rabbit uses it too, but like MySQL and PostgreSQL, Rabbit has its own
'long term' store.  Redis will not perform well in the offline case at
scale because it lacks a long term store.  On reboot, spooling the
entire journalled log into memory is too slow when it gets "big".
That is why Rabbit, MySQL and PostgreSQL have a long term store.

In Rabbit's case this store is 'pluggable', btw.  To plug in a new
store you need to write a driver implementing an API, and figure out
how to store message heaps in your database.  This is non-trivial.


> Redis with a reworked VM could do the same, which is what I was
> getting at with bringing up edis and Redis diskstore.

Correct, with a reworked VM, Redis could offer more scalable
persistence.  But doing this is not completely simple.  And doing it
for the case where the dataset is large numbers of messages flowing to
different people is non-trivial.  In particular, compared to most
databases, you need a different set of "push or cache or persist"
strategies when your primary focus is routing and delivery to 1..N
consumers.

> I haven't really paid a lot of attention to Rabbit in the year and a
> half since we stopped using it.

Rabbit has changed a lot in that time.  As have most of the worth-
bothering-with storage and messaging products.  You were very adamant
in your recommendation to not use Rabbit.  That could have included a
disclaimer "I HAVE NOT USED THIS FOR A LONG TIME BUT.." ;-)

> Some of the Tokyo stores are pretty fast on writes too, and LevelDB's
> own benchmarks show that a Kyoto store can be faster than LevelDB.

OK, I think it was specifically Kyoto storage that the Rabbit team
thought could be used instead of the Rabbit store.  The Rabbit store
is pretty fast, and optimised for messaging.

> On the other hand, I do think that if Salvatore had just borrowed
> someone else's B+Tree implementation (SQLite has a great one that is
> public domain, InnoDB is solid, etc.), it would have been a 1-month
> distraction that could have addressed all of the shortcomings with VM,
> offered live dumps, etc., and increased Redis' long-term marketshare.

Maybe he found what we did, namely that most existing stores suck in
some specific area and so cannot be used without compromising
something.


> Even with fast networks, cheap RAM, etc., disk space doubles every 12
> months, which far outpaces just about any other technology, including
> RAM and SSDs.

Indeed.  Yet users also ask us to change our implementation to enable
multi-disk storage!

> > Understood, but Rabbit is much more than that...  Though, yes, it
> > should do this job well and I think it does.
>
> It didn't always do its job well. ;)

All products go through phases.  We went through a phase where Rabbit
would crash if you put more data in it than the machine could retain
in RAM.  Yes, we could have stated this limit more clearly.  But it is
not a limit any more.

> Indeed. Redis, as a store, incidentally offers many of the same
> features and benefits (not all) that are claimed by RabbitMQ. Are
> there cases where RabbitMQ would be better suited? Assuming it doesn't
> crash :P, absolutely. Is this the case for the OP? I'd bet a pitcher
> of beer that the OP's use-case would find no discernible difference
> between the effectiveness of RabbitMQ or Redis with per-second AOF
> writes and 15-minute AOF rewrites.

I'll be in LA in January and would happily accept that beer from
you ;-)  The reason is that AOF journals grow per message, which makes
recovery time unbounded, i.e. unscalable.


> >> * If it's without persistence, then Rabbit loses many of the benefits
> >> that you are claiming.
>
> > Like what?
>
> Recovery from unexpected crashes, high-availability, reliable message
> delivery, etc.

Rabbit can have replicas without persistence.

Rabbit has quite fast persistence (3,000-5,000/sec) and will not get
an order of magnitude faster if you use different stores (they are all
disk-limited).  If you turn off persistence it is faster: with a big
machine you can get 100,000/sec throughput (ingress+egress).

> > ZeroMQ is not a broker.   That's the whole point.
>
> Apollo is competitive (and seems to be potentially more consistent
> throughput-wise than RabbitMQ according to http://hiramchirino.com/stomp-benchmark/ec2-c1.xlarge/index.html#queu...
> )

I am not sure what use case needs more than a few hundred thousand
messages/sec in 2011.  Apollo gets speed by not doing anything much.
As it adds more features you need in a broker, it will slow down.


>  and OpenAMQ can do over 100k messages/second.

OpenAMQ is long abandoned.


> The latter is written by the same folks as ZeroMQ.

The guy who wrote ZeroMQ used to work for iMatix where he wrote
OpenAMQ.  ZeroMQ now has a great community around it, but I don't
think many of them touched OpenAMQ.

Apollo, OpenAMQ and ZeroMQ will all run into the same problem with
persistence: disks are slow.


> >> ZeroMQ can move millions of non-persisted (pub/sub style)
> >> messages/second through a single remote network node.
>
> > So can TCP.  What's the value here?  I am not trying to knock ZeroMQ,
> > which is a fine piece of software.
>
> I hope we aren't going in circles. If you are saying that super high
> performance pub/sub style delivery isn't useful

I'm saying that both TCP and ZeroMQ do not solve the problem solved by
brokers.  This limits the value of like-for-like comparison.


> The point of high-speed delivery is the high-speed
> delivery; it's the ability to move messages faster so that you are
> waiting less and aren't wasting time waiting on delivery. More to the
> point, if a person's use-case is distributed pubsub delivery with
> reactors/etc, ZeroMQ is effectively ideal - far more so (arguably)
> than the pubsub capabilities of Rabbit.

I don't see where reactors come into it with ZeroMQ and not any other
messaging system.  You were talking about adding persistence to a
ZeroMQ network as if this would be some kind of panacea.  It won't.

> >> If you want them
> >> persisted: ZeroMQ + round-robin distribution + a few LevelDB or InnoDB
> >> writers -> millions of messages persisted to disk/second.
>
> > I don't think it's that simple...
>
> > 1) Which disk writers can safely write millions of messages to disk
> > per second, across multiple use cases, without making latency
> > skyrocket?  None that I know of, at least in 2011.
>
> InnoDB can do one sync/second writes in the 100k+ range (as can Redis,
> incidentally).

Er... I think we mean different things.

When I talk about sync writes, I mean the following block:

- accept message from client
- process message
- sync to disk
- flush buffer
- notify processor of success
- notify client of success

Doing this serially is expensive.  InnoDB won't do it at 100k/sec
rates as far as I know.  Nor will adding ZeroMQ help.

In general: if InnoDB or LevelDB can do some needed level of speed,
then I think Kyoto can do it or get close.  In which case Rabbit can
do the same (approx.).

> > 3) How does this behave when you have 2Tb of data on disk?  Or when
> > you have producer spikes?  Etc etc.
>
> Producer spikes aren't a problem, assuming you aren't running at peak
> capacity at the beginning.

Oh really?  If you start at zero load, and have 3,000 messages per
second, each of 2,000 bytes, then without consumers you grow data load
on the broker by approx 6MB per second.  In just over 20 minutes this
will exceed an 8GB machine's capacity.  This is true whether you use
Rabbit, Redis, ZeroMQ or anything else.
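
A quick back-of-the-envelope in Python, for anyone who wants to check
the arithmetic:

    msgs_per_sec  = 3000
    bytes_per_msg = 2000
    machine_ram   = 8 * 1024 ** 3                 # 8GB, ignoring all overhead
    fill_rate     = msgs_per_sec * bytes_per_msg  # 6,000,000 bytes/sec
    print(machine_ram / fill_rate / 60.0)         # ~24 minutes to fill all of RAM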

(Correct me if I am wrong... it's late here).

> > Rabbit tries to be a good general purpose solution to all this,
> > without loss of safety, and with good perf in 'most' cases.  And it
> > hopes to be very quick to install and easy to use.  I know it is
> > imperfect, but most people say they are very happy with it.  Please do
> > tell us how to make it better.
>
> Have recommended client libraries that are known to work well.

We do this, it's a good idea.


> Make
> sure that the default configuration doesn't crash

I don't think it crashes, but we can't answer for other people's
actions, alas.


>, and that any
> configuration option that could lead to instability has multi-line
> WARNING: CHANGING THIS CAN CAUSE INSTABILITY AND CAUSE YOU TO LOSE
> DATA.

We'll do that if all other products do ;-)

alexis

Re: Messaging queue alexis 12/20/11 8:52 AM
Josiah

I'm attempting to wrap this up, unless you want to continue, or other
folks want us to..  A few more comments below:

On Dec 20, 2:03 am, Josiah Carlson <josiah.carl...@gmail.com> wrote:
>
> Redis offers every write (AOF), every second (AOF), every X writes
> over Y seconds (dumps), and never. It has *almost* all the settings
> that I think are nominally useful (missing just variable seconds and
> "as often as we can").

That's certainly a useful starting point.

> > Anyway, Rabbit lets you tweak these settings, as do many full-fledged
> > data stores.
>
> Looking at the documentation for Rabbit, it doesn't list a lot of
> information about different persistence options:
> http://www.rabbitmq.com/configure.html . You all may want to update
> the docs.

That's because Rabbit implements AMQP (and other protocols).  In AMQP
you configure behaviour from the client: http://www.rabbitmq.com/tutorials/amqp-concepts.html

We also offer options through a console: http://www.rabbitmq.com/management.html

But - point taken.  Expectations vary on docs, so the web site needs
to get better.

> >> That said, message routing in Redis already is an ancillary
> >> convenience, which could be augmented with something like diskstore to
> >> make it more viable for longer for those who prefer to minimize
> >> 3rd-party dependencies.
>
> > Indeed.  But combining it with storage, and "properly", is quite
> > tricky.
>
I've found Salvatore and Pieter both to be very competent engineers. I
have faith that either alone or together they would have been able to
do it well.

Agreed --- they are colleagues ;-)


> >> Note those last 4 words in my paragraph "15 minute AOF rewrites".
> >> Every 15 minutes, rewrite the AOF to be as compact and as fast as a
> >> dump. The OP hasn't stated his expected volume of messages, but if
> >> it's under a few megs/second, I think you owe me the beer :P
>
> > Possibly.  But, does the OP want to handcode all this?  What are the
> > failure scenarios?  How long can readers go offline for?  Etc.
>
> Hardcode? Redis' configuration is pretty flexible and can be changed
> at runtime. There is no "hardcode" when discussing Redis.

Handcode not hardcode.

> > Agreed, we do this too.  But, per the above, users need to know what
> > the settings mean.  They may want to make sure they do not
> > optimistically notify listeners of writes, OR, they may be ok with
> > that.
>
> I've seen configuration options of the form "lie to me" before. I've
> never found it to be a viable solution for software to lie, and it
> just delays the inevitable "oh crap, it failed, and the lie mattered
this time". My advice: toss the optimistic config option.

Some products do lie by claiming safe flush to disk, and then don't do
it right.  Rabbit does not lie to people.

To your advice -- alas, it's widely used in enough places to need
keeping.  Mostly things like market data - low(ish) latency publish/
subscribe.  Monitoring is another case.  Don't ask me why people want
to persist those flows, but they do...

> >> Pulling [messages] back from disk is a little more work if you need messages
> >> to be acked, but I can think of a few tricks for timing, re-delivery
> >> of un-acked messages, etc.
>
> > Yes even if you DIY and get it right: the more of this you wire up,
> > the more complex your handmade system gets, and the slower you become
> > relative to your fastest components...
>
> The structures involved are very simple. It's not nearly as difficult
> as one would think, and the runtime to keep those structures updated
> is in the sub-microsecond category (basically 6 round-trips from the
> processor to main memory).

Cool.

FYI -- Our store is documented here, along with the requirements that
led to it.  It would be possible to replace parts of this with
LevelDB.  That would be interesting.

http://hg.rabbitmq.com/rabbitmq-server/file/728c45e9a267/src/rabbit_msg_store.erl

Cheers,

alexis


>
> Regards,
>  - Josiah


Re: Messaging queue Salvatore Sanfilippo 12/20/11 9:08 AM
I want to add some info about what we do wrong with our messaging system, just as an example of how Rabbit is surely not just more full-featured but also a much more mature messaging platform
(in my opinion we should not try to cover all the cases RabbitMQ covers, since that is not our job, but rather to improve the current use cases).

1) Slow consumers are an issue with Redis. If a producer pushes new messages faster than the consumers can consume them, the Redis client output buffers will get bigger and bigger, and no one will notice before the explosion. (Actually you can check that with CLIENT LIST, as sketched below, but we need a higher-level solution.)
2) By design, we only cover persistence in the list case, that is, N producers and N consumers, where every single message is consumed by just one consumer. There is no persistence in Pub/Sub mode, which is completely fire-and-forget. I think this is good for Redis, but there are people with different needs.
3) We don't provide high-level reliable delivery of messages, even in the list case. We provide tools that let you build your reliability logic in a library. Again, I think this is the right abstraction level for what Redis should provide. Honestly, I even think it is better to design systems that are able to tolerate unreliable messaging, but when that is not possible Redis will force you to build your own abstractions. So for Redis to be considered a reliable solution you need to consider Redis+Library. Fully featured messaging systems like RabbitMQ support this and even much more by default.

While 2 and 3 show how a fully featured messaging system like RabbitMQ can cover many more use cases, I'm concerned with "1"; that is what we should get fixed ASAP.
The fix itself is trivial; the real question is how to do it, since we need a simple solution. The simplest option is to close the connection once a consumer has too long a queue of unprocessed messages; the most complex is to send some special control-channel message to consumers (we can do that easily since we already have a "type" field in messages) and ultimately implement some form of congestion control.
I think we should go for the simplest solution that works in the use cases where Redis messaging makes sense.
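
In the meantime, the CLIENT LIST check I mentioned in point 1 looks more or less like this (a rough sketch assuming a redis-py recent enough to expose client_list(); the thresholds are completely made up):

    import redis

    r = redis.StrictRedis()

    # oll = output list length, omem = output buffer memory, as printed
    # by CLIENT LIST; flag clients whose buffers look out of control
    for client in r.client_list():
        oll = int(client.get('oll', 0))
        omem = int(client.get('omem', 0))
        if oll > 10000 or omem > 32 * 1024 * 1024:
            print('possible slow consumer %s oll=%s omem=%s'
                  % (client['addr'], oll, omem))

Today you have to decide the kill policy yourself, which is exactly the higher-level part we are missing.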

IMHO if we turn this thread into "what Redis can copy from RabbitMQ and the other way around" we can all gain from it. If instead it is a "what is better" thread it will hardly be useful, since both systems are decently engineered, so it is a matter of different tradeoffs as usual.

Cheers,
Salvatore

--
Salvatore 'antirez' Sanfilippo
open source developer - VMware

http://invece.org
"We are what we repeatedly do. Excellence, therefore, is not an act, but a habit." -- Aristotele
Re: Messaging queue Josiah Carlson 12/20/11 1:36 PM
On Tue, Dec 20, 2011 at 8:52 AM, alexis <alexis.r...@gmail.com> wrote:
> Josiah
>
> I'm attempting to wrap this up, unless you want to continue, or other
> folks want us to..  A few more comments below:
> On Dec 20, 2:03 am, Josiah Carlson <josiah.carl...@gmail.com> wrote:
>> >> Note those last 4 words in my paragraph "15 minute AOF rewrites".
>> >> Every 15 minutes, rewrite the AOF to be as compact and as fast as a
>> >> dump. The OP hasn't stated his expected volume of messages, but if
>> >> it's under a few megs/second, I think you owe me the beer :P
>>
>> > Possibly.  But, does the OP want to handcode all this?  What are the
>> > failure scenarios?  How long can readers go offline for?  Etc.
>>
>> Hardcode? Redis' configuration is pretty flexible and can be changed
>> at runtime. There is no "hardcode" when discussing Redis.
>
> Handcode not hardcode.

That "n" totally looked like an "r". My mistake.

But handcoding isn't difficult, or an issue. The majority of Redis
users will have already dug into their configuration file. Actually,
I'd say that within the first 2-3 questions asked by someone, one of
those questions will be of the form "I want this kind of data
integrity, what settings should I use?"
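
For what it's worth, the answer usually fits in a few lines of
redis.conf (and most of these can also be flipped at runtime with
CONFIG SET):

    # append-only file, fsynced every second ("always" and "no" are the
    # other appendfsync choices)
    appendonly yes
    appendfsync everysec

    # point-in-time dumps: after 900s if >= 1 key changed, after 300s if >= 10
    save 900 1
    save 300 10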

>> > Agreed, we do this too.  But, per the above, users need to know what
>> > the settings mean.  They may want to make sure they do not
>> > optimistically notify listeners of writes, OR, they may be ok with
>> > that.
>>
>> I've seen configuration options of the form "lie to me" before. I've
>> never found it to be a viable solution for software to lie, and it
>> just delays the inevitable "oh crap, it failed, and the lie mattered
>> this time". My advice: toss the optimistic config option.
>
> Some products do lie by claiming safe flush to disk, and then don't do
> it right.  Rabbit does not lie to people.

Well... optimism WRT telling them that data got to disk is a lie, just
a lie that you tell them about. I'll concede that Rabbit at least
tells you, so it isn't as bad as others have been.

> To your advice -- alas, it's widely used in enough places to need
> keeping.  Mostly things like market data - low(ish) latency publish/
> subscribe.  Monitoring is another case.  Don't ask me why people want
> to persist those flows, but they do...

I'll bet you another beer it's business requirements. ;)

But yeah, I'm okay to let this part of the thread die. Ping me
off-list to tell me when you're going to be in LA, and we can set up a
time to buy each other a beer.

 - Josiah


Re: Messaging queue Josiah Carlson 12/20/11 2:57 PM
On Tue, Dec 20, 2011 at 9:08 AM, Salvatore Sanfilippo <ant...@gmail.com> wrote:
> I want to add some info about what we do wrong with our messaging system,
> just as an example of how Rabbit is surely not just more full featured but
> also a much more mature messaging platform
> (in my opinion we should not try to cover all the cases RabbitMQ covers,
> it's not our work, but to improve the current use cases).

That's a great point. One thing that got lost in what I was trying to
say, what with all of the side topics, is that while there are
queueing systems like RabbitMQ, ActiveMQ, ZeroMQ, etc., that are able
to serve some XX% of the use-cases out there, Redis can and does solve
an overlapping set of use-cases.

Maybe Rabbit does 90+%, maybe ActiveMQ does an overlapping 90+% too.
But based on people I've spoken with and consulted with, Redis
probably hits a solid 80+% of those use-cases. What is the right
answer to the "which queue system should I use" question that comes
up? It depends on the use-cases.


> 1) Slow consumers are an issue with Redis. If a producer will push new
> messages faster than all the consumers can consume those messages, the Redis
> client buffers will start to get bigger and bigger, and no one will notice
> before the explosion. (Actually with CLIENT LIST you can check that, but we
> need a higher-level solution).
> 2) We, for design, only cover persistence in the list case, that is, N
> consumers, N producers, but every single message is captured just by a
> single consumer. There is no persistence in Pub/Sub mode that is completely
> fire-and-forget. I think this is good for Redis but there are people with
> different needs.
> 3) We don't provide high-level reliable delivery of messages, even in the
> list case. We provide tools that make you able to build your reliability
> logic in a library. Again I think this is the right abstraction level for
> what Redis should provide. Honestly I even think that it is better to design
> systems that are able to tolerate unreliable messaging. But when it is not
> possible Redis will force you to build your abstractions. So for Redis to be
> considered a solution with reliability you need to consider Redis+Library.
> Fully featured message systems like RabbitMQ can support this and even much
> more by default.

I agree, but even RabbitMQ is a RabbitMQ + Library. There is always a library.


> While 2 and 3 show how a fully featured messaging system like RabbitMQ can
> cover much more use cases, I'm concerned with "1", that is what we should
> get fixed ASAP.
> The fix is trivial, the major problem is how to fix this, we need a simple
> solution... the simplest of the solutions is to close the connection once
> consumers have a too long queue of messages not processed, the most complex
> solution is to send some special control-channel message to consumers (we
> can do that easily since we already have a "type" field in messages) and
> ultimately implement some form of congestion control.
> I think that we should go for the simplest solution that works in the use
> cases where Redis messaging makes sense.

I'm a big fan of configuration options. A "no limit" option and a
fixed, user-settable limit per socket should both be standard. I can
think of other good ones, but some of them get intricate to
implement...

A combination of SO_SNDBUF, SO_SNDLOWAT, and the current outgoing
speed, along with a few other pieces, should offer the ability to
calculate the expected time for the buffer to be consumed by the
client, which may offer a path to an "automatic" method of
disconnecting recently slow clients, clients that will never catch up,
etc.
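
Sketching the arithmetic I have in mind (plain Python, not a proposal
for the actual server code; the 60-second limit is arbitrary):

    def expected_drain_seconds(pending_bytes, sent_before, sent_now, interval):
        # rate at which this client has actually been draining its buffer
        rate = (sent_now - sent_before) / float(interval)
        if rate <= 0:
            return float('inf')   # not consuming at all
        return pending_bytes / rate

    def should_disconnect(pending_bytes, sent_before, sent_now, interval):
        # kill clients that would need more than a minute to catch up
        return expected_drain_seconds(pending_bytes, sent_before,
                                      sent_now, interval) > 60.0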


> IMHO if we turn this thread into "what Redis can copy from RabbitMQ and the
> other way around" we can gain from this thread. Instead if it is a "what is
> better" thread it will hardly be useful, since both systems are decently
> engineered I guess, so it is a matter of different tradeoffs as usually.

Being that RabbitMQ and Redis primarily serve two different use-cases
(Redis is a general data structure server, RabbitMQ is a message
queue), I hope that no one was getting the impression that the
discussion was about which was better. They do overlap in the use-case
for which one was designed (both the pub/sub and request/response
message queue), but both were overall designed with different intents,
so will have different tradeoffs.

What can be copied? The concept of pluggable datastores...? Other than
that, I believe that the motivating use-cases are sufficiently
different to limit copy-able concepts (that, and the fact that
RabbitMQ is written in Erlang).

What got lost in the shuffle is that Redis may or may not be adequate
for the OP's use-case (the OP is already using it for pubsub, so using
a list for a queue isn't that much of a stretch), and whether sticking
with Redis or adding RabbitMQ to the OP's stack is the right call.

Regards,
 - Josiah


Re: Messaging queue Will P 12/21/11 8:21 AM
I submitted a patch for issue 525  http://code.google.com/p/redis/issues/detail?id=525#c3
which implements a simple maxclientqueue limit to prevent slow clients
from harming the server.  It adds a new config parameter
'maxclientqueue' (disabled by default) that the server uses to drop
client connections that are blocked.  It logs when it happens, and the
setting is runtime tunable w/ the CONFIG SET maxclientqueue XX
command.

Hope this helps!

-Will Pierce
wi...@nuclei.com

On Dec 20, 9:08 am, Salvatore Sanfilippo <anti...@gmail.com> wrote:
> ...


>
> 1) Slow consumers are an issue with Redis. If a producer will push new
> messages faster than all the consumers can consume those messages, the
> Redis client buffers will start to get bigger and bigger, and no one will
> notice before the explosion. (Actually with CLIENT LIST you can check that,
> but we need a higher-level solution).
...

> While 2 and 3 show how a fully featured messaging system like RabbitMQ can
> cover much more use cases, I'm concerned with "1", that is what we should
> get fixed ASAP.
> The fix is trivial, the major problem is how to fix this, we need a simple
> solution... the simplest of the solutions is to close the connection once
> consumers have a too long queue of messages not processed, the most complex
> solution is to send some special control-channel message to consumers (we
> can do that easily since we already have a "type" field in messages) and
> ultimately implement some form of congestion control.
> I think that we should go for the simplest solution that works in the use
> cases where Redis messaging makes sense.


Re: Messaging queue Salvatore Sanfilippo 12/21/11 8:31 AM
P.S.

Also note that, even if we go for the simple solution of closing the connection, there are many ways to do this.
For instance: if the client is not doing Pub/Sub I would set this limit very, very high.
Or this limit could apply only to clients doing Pub/Sub.
Instead of counting objects we could count bytes, or, if we limit this to Pub/Sub, even the number of pending messages (it is not trivial done this way, but I think there is a shortcut).

Another issue is that:

A client does KEYS * -> a zillion objects in the client structure that must be processed... but this does not mean we should close the connection at all.
So we may think more in terms of the number of objects monotonically increasing up to a given limit.

In short, these are some of the concerns why I did not implement the vanilla solution; in this case the devil really is in the details.

Salvatore

On Wed, Dec 21, 2011 at 5:26 PM, Salvatore Sanfilippo <ant...@gmail.com> wrote:
Hey! Thanks for your work. The new issue about this is: https://github.com/antirez/redis/issues/91

Your approach is perfect IF we want to follow the rule of dropping the connection.
Honestly I have not thought about it hard enough; I wonder what people using Redis Pub/Sub, or interested in this subject, think about it.

Salvatore
--
Salvatore 'antirez' Sanfilippo
open source developer - VMware

http://invece.org
"We are what we repeatedly do. Excellence, therefore, is not an act, but a habit." -- Aristotele



Re: Messaging queue Salvatore Sanfilippo 12/21/11 9:19 AM


On Wed, Dec 21, 2011 at 5:35 PM, Dave Peticolas <da...@krondo.com> wrote:
Only counting pubsub messages would be just fine with me,
that's where the main problem lies.

Also makes sense, but there are many problems.

Example:

MULTI
PUBLISH foo x 1 million times
EXEC

All the subscribers to "foo" will have a lot of pending messages. This does not mean they can't process those messages fast enough.

I think we need something more interesting, like sampling the client output list length every second with a window of N seconds (a simple circular buffer), then running some clever algorithm against it to understand whether it is more or less monotonically increasing, and then closing the connection.
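
Roughly what I mean, sketched in Python rather than in the server's C (window size and policy are invented just to show the idea):

    from collections import deque

    WINDOW = 10   # one sample per second, per client

    class OutputWatch(object):
        def __init__(self):
            self.samples = deque(maxlen=WINDOW)

        def sample(self, output_list_len):
            self.samples.append(output_list_len)

        def looks_stuck(self):
            # "more or less monotonically increasing": no sample ever drops,
            # and the last one is strictly above the first
            if len(self.samples) < WINDOW:
                return False
            s = list(self.samples)
            return all(b >= a for a, b in zip(s, s[1:])) and s[-1] > s[0]

The server would then close the connection of any client whose watcher stays in that state for a few consecutive windows.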

Salvatore
 


Re: Messaging queue Pixy Misa 12/21/11 8:20 PM
Would it be feasible to add a TTL to Pub/Sub?

Say, after a client subscribes, it can submit a TTL command
(SUBSCRIBETTL or something like that) that says, if I don't read the
messages waiting for me within N seconds, it's okay to drop them.

That's how RabbitMQ handles it, and it works very well.
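
For reference, on the Rabbit side this is a per-queue message TTL;
with pika it would look something like this (the x-message-ttl
argument is a RabbitMQ extension, value in milliseconds; queue name is
made up):

    import pika

    conn = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
    channel = conn.channel()
    # messages left unread in this queue for more than 30 seconds are dropped
    channel.queue_declare(queue='chat.updates',
                          arguments={'x-message-ttl': 30000})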


Re: Messaging queue Will P 12/22/11 3:56 AM
Operationally I think we should keep the configuration option simple.
A single value for a maximum permitted queue length is easy to
understand from a client perspective.  From the administrator's
perspective, this limit really is more useful to provide a guarantee
against sudden bloat.  For redis on virtual servers where memory sizes
are much smaller, there's less room to tolerate spikes in memory
usage.  (I'm sure you already know this, Salvatore!)

The concern I have with a timer-based implementation is that it
doesn't give as strong a guarantee against sudden spikes.  Between two
timer-based checks of a client's queue length, a lot can happen.
Permitting the queue length to grow unbounded in between some of the
checks seems risky.  If we go with a timer+window implementation, then
I think we need to provide a second config option to tune the window
size, so that it can be set to 1 (or equivalent behavior). Then we can
immediately respond to a slow client at the next timer invocation to
provide stronger memory usage guarantees for those who need it.

For the case of a client requesting more data via KEYS or LRANGE etc
than the <maxclientqueue>, which doesn't use PUB/SUB, I don't think we
should exempt those clients from the same memory limitations.  If it
means KEYS * causes your connection to drop, for a database that has 1
skrillion keys in it, that's better than bloating up memory for VM
users.

Another option that we could consider, if we need more flexibility, is
to allow a client to set their own session-level queue length limit
with a special CONFIG type of command, perhaps.  (Maybe:  CONFIG SET
__maxqueue 1000  or for window-based:  CONFIG SET __maxqueue 1000,10)
If the client doesn't set their own limit, then we fall back to the
server limit.  That would give more control to users to override this
limit, and let them shoot themselves in the foot too.  On the
downside, we're making a bigger commitment with a session-level queue
limit, because clients all over the place will start using it.

-Will

On Dec 21, 9:19 am, Salvatore Sanfilippo <anti...@gmail.com> wrote:


> On Wed, Dec 21, 2011 at 5:35 PM, Dave Peticolas <d...@krondo.com> wrote:
> > Only counting pubsub messages would be just fine with me,
> > that's where the main problem lies.
>
> Also makes sense, but there are many problems.
>
> Example:
>
> MULTI
> PUBLISH foo x 1 million times
> EXEC
>
> All the subscribers to "foo" will have a lot of pending messages. This does
> not mean they can't process those messages fast enough.
>
> I think we need something more interesting, like sampling the client output
> list length every second with a window of N seconds (a simple circular
> buffer) then running some clever algorithm against it to understand if it
> is more or less monotonically increasing, then close the connection.
>
> Salvatore
