Data distribution to multiple consumers in the presence of faults


David Welton

Feb 2, 2017, 4:14:42 PM
to rabbitm...@googlegroups.com
Hi,

I'm considering using RabbitMQ for some data distribution tasks. I've
been reading the docs and looking through the excellent tutorials, but
still have some questions, some of which are a bit more high level.

Let me describe what we have in place right now:

* A server and DB that keeps track of users, clients, and client locations.
* A series of microservice-ish applications with their own DBs
containing copies of the above data.
* A custom set of HTTP services that, on creation or update of these
records, distributes them to the 'consumer' applications so that they
have up to date data.

This works ok, but is kind of creaky. However, it is fairly resilient
to problems in the 'consumer' - they can go down and sends get
retried. I think they've done it with timestamps to coordinate change
times, which has problems of its own, but not at the scale we're
operating at (that or we've just gotten lucky so far).

https://www.rabbitmq.com/reliability.html is pretty good, but I'm
still struggling a bit with how things might work to replicate our
ad-hoc system.

* The number of consumers is arbitrary. These could even include a
system spun up for some development work.
* If a consumer system is down for a bit, it needs to resync in some way.
* When a system is spun up, it needs to do an initial fetch of everything.
* Something missing from the existing system is handling of 'deletion' events.
* Being super speedy is not important.
* We have very low volumes of these updates, so there are no
challenges from that point of view.

It's entirely possible that some of these tasks are not within
RabbitMQ's remit, or that additional components are necessary, such as
some kind of sequence number for each row (we have one master DB, so
things like version vectors are probably overkill).

Thanks for reading!

--
David N. Welton

http://www.welton.it/davidw/

http://www.dedasys.com/

Michael Klishin

Feb 2, 2017, 4:21:20 PM
to rabbitm...@googlegroups.com, David Welton
Hi David,

Message distribution in RabbitMQ (well, AMQP 0-9-1) happens at the routing stage,
so with N consumers you will have N queues (if each service needs to get its own copy).

If a consumer goes down or cannot process a delivery, it can re-queue it (note: if there's only
one consumer, it will immediately get a redelivery; message TTL can help with that to some extent).
Consumers can also lose network connections; RabbitMQ will notice and re-queue unacknowledged
deliveries automatically. This is covered in http://www.rabbitmq.com/confirms.html, as are publisher
confirms, a very important and relevant topic.
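
A minimal sketch of the acknowledge-or-re-queue decision described above, assuming the Bunny Ruby client. The method name handle_delivery and the process callable are hypothetical; only ack and nack are Bunny channel methods.

```ruby
# Sketch only: `channel` is expected to behave like a Bunny::Channel, and
# `process` is a hypothetical callable supplied by the consuming application.
def handle_delivery(channel, delivery_tag, body, process)
  process.call(body)
  # Success: tell RabbitMQ the message can be removed from the queue.
  channel.ack(delivery_tag)
  :acked
rescue StandardError
  # Failure: negatively acknowledge this one delivery and ask for a re-queue.
  # The arguments are (delivery_tag, multiple, requeue).
  channel.nack(delivery_tag, false, true)
  :requeued
end
```

With Bunny this would typically be called from inside q.subscribe(manual_ack: true), passing delivery_info.delivery_tag and the message body. If the consumer dies before calling ack, the broker re-queues the delivery on its own.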

Now, I'm not sure what you mean by "initial fetch of everything" but messages in RabbitMQ can be consumed
and acknowledged only once (and only one consumer can have an outstanding delivery at a time — see above).
There is no way to read the history of messages. If that feature is crucial to have, use Kafka, which is a hybrid
messaging/data store service.

I'm also not sure what you mean by "Something missing from the existing system is handling of 'deletion' events."
Can you elaborate?

HTH.
> --
> You received this message because you are subscribed to the Google Groups "rabbitmq-users"
> group.
> To unsubscribe from this group and stop receiving emails from it, send an email to rabbitmq-user...@googlegroups.com.
> To post to this group, send an email to rabbitm...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
>

--
MK

Staff Software Engineer, Pivotal/RabbitMQ


David Welton

Feb 2, 2017, 7:46:35 PM
to Michael Klishin, rabbitm...@googlegroups.com
Hi,

> Hi David,
>
> Message distribution in RabbitMQ (well, AMQP 0-9-1) happens at the routing stage,
> so with N consumers you will have N queues (if each service needs to get its own copy).

What does this look like in terms of the queues? I'm looking at the Ruby
examples like http://www.rabbitmq.com/tutorials/tutorial-four-ruby.html
which has this as the queue setup:

q = ch.queue("", :exclusive => true)

What do I need to change in that example to make things work like so:

1. Start two receivers
2. Send a message: they both get it.
3. Bring down one of the receivers
4. Send a message.
5. Bring it back up - it gets the message.

Do I need to give the queues names, or make them durable, or something
else? I'm guessing names, so that there is something permanent to deal
with that RabbitMQ knows ought to come back?

> If a consumer goes down or cannot process a delivery, it can re-queue it (note: if there's only
> one consumer it will immediately get a redelivery; message TTL can help with that to some extent).
> Consumers also can lose network connections, RabbitMQ will notice and re-queue unacknowledged
> deliveries automatically. This is covered in http://www.rabbitmq.com/confirms.html, as a publisher
> confirms, a very important and relevant topic.

Ok, I added ch.confirm_select to the above example, but it doesn't do
what I want.

> Now, I'm not sure what you mean by "initial fetch of everything" but messages in RabbitMQ can be consumed
> and acknowledged only once (and only one consumer can have an outstanding delivery at a time — see above).
> There is no way to read the history of messages. If that feature is crucial to have, use Kafka, which is a hybrid
> messaging/data store service.

Yeah, I suspected RabbitMQ might not be the right tool for that
particular part of the job. I was just throwing it out there to see if
anyone had ideas. I'm not sure Kafka is what I want either, to be
honest; I don't like the idea of storing the data in one place, and then
storing the whole queue of the same data, but that's what it'd take to
be able to do a "seed" of a new app's database. More thinking, I
guess. Thanks for the suggestion in any case!

> I'm also not sure what you mean by "Something missing from the existing system is handling of 'deletion' events."
> Can you elaborate?

Say a user is deleted in the main database. That event needs to be
propagated through the rest of the system, probably with some kind of
message including the user ID and a delete flag. It's not really
RabbitMQ specific - it's missing in our system. I was sort of hoping
someone might have seen this 'pattern' of usage and have some ideas.
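
One possible shape for such a message, as a rough sketch: the event layout ("op", "id", "attrs") and both method names are assumptions made for illustration, not a RabbitMQ convention. A Hash stands in for the consumer's local database.

```ruby
require "json"

# Sketch only: build the message body the publisher would send.
def user_event(op, id, attrs = {})
  JSON.generate("op" => op, "id" => id, "attrs" => attrs)
end

# Apply one event to a consumer's local copy (a Hash standing in for its DB).
def apply_user_event(store, payload)
  event = JSON.parse(payload)
  case event["op"]
  when "upsert" then store[event["id"]] = event["attrs"]
  when "delete" then store.delete(event["id"])
  else raise ArgumentError, "unknown op: #{event['op'].inspect}"
  end
  store
end
```

Deleting an id that is already gone is a no-op here, so a redelivered "delete" message does no harm.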

Thank you!

Michael Klishin

Feb 2, 2017, 7:59:11 PM
to David Welton, rabbitm...@googlegroups.com
On 3 February 2017 at 03:46:32, David Welton (davidn...@gmail.com) wrote:
> Hi,
>
> > Hi David,
> >
> > Message distribution in RabbitMQ (well, AMQP 0-9-1) happens at the routing stage,
> > so with N consumers you will have N queues (if each service needs to get its own copy).
>
> What's this look like in terms of the queue's? I'm looking at the Ruby
> examples like http://www.rabbitmq.com/tutorials/tutorial-four-ruby.html
> which has this as the queue setup:
>
> q = ch.queue("", :exclusive => true)
>
> What do I need to change in that example to make things work like so:
>
> 1. Start two receivers
> 2. Send a message: they both get it.
> 3. Bring down one of the receivers
> 4. Send a message.
> 5. Bring it back up - it gets the message.
>
> Do I need to give the queues names, or make them durable, or something
> else? I'm guessing names so that it was something permanent to deal
> with that it knows ought to come back?

It depends on your requirements. Consumers need to know the queue name, and the queue should
probably not be transient if consumers are expected to go away, come back and consume
what was routed to the queue in the recent past.
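
A minimal sketch of that setup with the Bunny client: each consuming application declares a fixed-name, durable queue bound to a shared fanout exchange. The exchange name "user_events", the "sync." prefix and the method name are made up for illustration.

```ruby
# Sketch only: `channel` is expected to behave like a Bunny::Channel.
# "user_events" and the "sync." prefix are hypothetical names.
def declare_consumer_queue(channel, app_name)
  # Fanout exchange: every bound queue receives a copy of each message.
  exchange = channel.fanout("user_events", durable: true)
  # A fixed, per-application name lets the consumer find the same queue again;
  # durable: true keeps the queue definition across broker restarts.
  queue = channel.queue("sync.#{app_name}", durable: true)
  queue.bind(exchange)
  queue
end
```

By contrast, ch.queue("", :exclusive => true) creates a server-named queue that is deleted when the declaring connection closes, so anything published while that receiver is down never reaches it. For messages themselves to survive a broker restart, the publisher would also publish with persistent: true.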

There are at least 2 tutorials and 2 pretty detailed concepts and "advanced" feature guides
available:

http://rubybunny.info/articles/getting_started.html#blabblr_onetomany_publishsubscribe_pubsub_example
http://www.rabbitmq.com/tutorials/tutorial-three-ruby.html
http://www.rabbitmq.com/tutorials/amqp-concepts.html
http://rubybunny.info/articles/queues.html

> > If a consumer goes down or cannot process a delivery, it can re-queue it (note: if there's
> only
> > one consumer it will immediately get a redelivery; message TTL can help with that to
> some extent).
> > Consumers also can lose network connections, RabbitMQ will notice and re-queue unacknowledged
> > deliveries automatically. This is covered in http://www.rabbitmq.com/confirms.html,
> as a publisher
> > confirms, a very important and relevant topic.
>
> Ok, I added ch.confirm_select to the above example, but doesn't do
> what I want.

I didn't suggest that it will do exactly what you want. You are asking questions around failures
and there are many things that can fail, including publisher connections to RabbitMQ. So I mentioned
publisher confirms.
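
For reference, a rough sketch of what publisher confirms cover, using the Bunny client: they tell the publisher whether the broker accepted a message, and say nothing about whether any consumer has received it. The method name publish_confirmed is hypothetical.

```ruby
# Sketch only: `channel` and `exchange` are expected to behave like a
# Bunny::Channel and a Bunny::Exchange. Assumes channel.confirm_select
# has already been called once on this channel.
def publish_confirmed(channel, exchange, payload)
  # persistent: true marks the message as persistent.
  exchange.publish(payload, persistent: true)
  # Blocks until the broker has confirmed everything published so far on
  # this channel; true means nothing was negatively acknowledged.
  channel.wait_for_confirms
end
```

A false return value is the publisher's cue to retry or log the failure.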

> > I'm also not sure what you mean by "Something missing from the existing system is handling
> of 'deletion' events."
> > Can you elaborate?
>
> Say a user is deleted in the main database. That event needs to be
> propagated through the rest of the system, probably with some kind of
> message including the user ID and a delete flag. It's not really
> RabbitMQ specific - it's missing in our system. I was sort of hoping
> someone might have seen this 'pattern' of usage and have some ideas.

This is how I started using RabbitMQ in 2009 and roughly what a sizeable percentage of the user base does:
propagate events and commands between services.

David Welton

Feb 5, 2017, 7:10:34 PM
to rabbitm...@googlegroups.com
Hi,

>> What do I need to change in that example to make things work like so:
>>
>> 1. Start two receivers
>> 2. Send a message: they both get it.
>> 3. Bring down one of the receivers
>> 4. Send a message.
>> 5. Bring it back up - it gets the message.
>>
>> Do I need to give the queues names, or make them durable, or something
>> else? I'm guessing names so that it was something permanent to deal
>> with that it knows ought to come back?
>
> It depends on your requirements. Consumers need to know queue name and the queue should
> probably not be transient if consumers are expected to go away, come back and consume
> what was routed to the queue in recent past.

So setting things up that way would require some kind of 'garbage
collection' of unused queue names, it seems, since new ones would be
added a bit at a time by applications spun up on developers' machines
for testing purposes. We already have a list of 'registered apps', so
that's not entirely out of the question.
Thanks.

>> > If a consumer goes down or cannot process a delivery, it can re-queue it (note: if there's
>> only
>> > one consumer it will immediately get a redelivery; message TTL can help with that to
>> some extent).
>> > Consumers also can lose network connections, RabbitMQ will notice and re-queue unacknowledged
>> > deliveries automatically. This is covered in http://www.rabbitmq.com/confirms.html,
>> as a publisher
>> > confirms, a very important and relevant topic.
>>
>> Ok, I added ch.confirm_select to the above example, but doesn't do
>> what I want.

> I didn't suggest that it will do exactly what you want. You are asking questions around failures
> and there are many things that can fail, including publisher connections to RabbitMQ. So I mentioned
> publisher confirms.

Sure... I guess I'm just trying to work out where the border
lies between "make the end-to-end path via the message queue super
reliable" and backing off on that by including some mechanisms in
the data itself and some architecture/algorithms on top of the message
queue.

For instance, rather than worry about messages lost in the queue, have
a system like this, to distribute user records:

* On startup, "Consumer" app sends a message to "Publisher": {hello,
max(LocalUserId), ConsumerId}
* Consumer also subscribes to messages it's interested in.
* Publisher iterates through everything from the max(LocalUserId) it
received up to its own max(UserId), sending it to ConsumerId. (We don't
have a bunch of data, so resending it all is not a problem)
* Consumer is now caught up, and as long as it's online it'll get new
user records. If it goes down or loses its connection, it'll attempt
a refetch.
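
The steps above can be sketched in plain Ruby, with direct method calls standing in for the messages that would travel through RabbitMQ. The class and method names are hypothetical, and the sketch assumes the publisher replays only records with an id greater than the one the consumer reports.

```ruby
# Sketch only: plain Ruby objects stand in for the two applications.
class Publisher
  def initialize(users)
    @users = users # { user_id => attributes }, the master copy
  end

  # Answer a {hello, max_local_id, consumer_id} message: return every record
  # the consumer has not seen yet, in ascending id order.
  def handle_hello(max_local_id)
    @users.select { |id, _attrs| id > max_local_id }.sort_by(&:first).to_h
  end
end

class Consumer
  attr_reader :users

  def initialize
    @users = {}
  end

  # Highest user id held locally, or 0 for an empty database.
  def max_local_id
    @users.keys.max || 0
  end

  # Run on startup and again after any lost connection.
  def resync(publisher)
    @users.merge!(publisher.handle_hello(max_local_id))
  end
end
```

Catching up by highest id only picks up newly created records. Updates and deletions of older rows would need something like the per-row sequence number mentioned at the start of the thread.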

Or am I doing things wrong/going against the grain by trying to "build
on top" like that?

Thank you

Michael Klishin

Feb 5, 2017, 7:20:44 PM
to rabbitm...@googlegroups.com
If you need to use a temporary queue, see the exclusive and auto-delete
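
As a rough illustration of how those two cases might be told apart, assuming the Bunny client: the method name, the "sync." prefix and the registered flag are all hypothetical.

```ruby
# Sketch only: pick queue settings depending on whether the application is
# on the list of registered apps or is a throwaway development instance.
def queue_settings(app_name, registered:)
  if registered
    # Long-lived application: fixed name, survives disconnects and restarts.
    ["sync.#{app_name}", { durable: true }]
  else
    # Throwaway instance: "" asks the server to generate a name, and
    # exclusive: true ties the queue to the declaring connection, so it is
    # deleted when that connection closes.
    ["", { exclusive: true }]
  end
end
```

The result would be passed on as name, opts = queue_settings("billing", registered: true) followed by ch.queue(name, opts). Declaring with auto_delete: true instead removes the queue once its last consumer is gone.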
