Theoretical foundation for 0MQ / distributed messaging

Martin Sustrik

Jul 5, 2011, 1:14:05 PM

to sp-discu...@googlegroups.com

Hi all,

Sorry for the delay... In the recent discussions (and especially in a
long IM discussion with Michael Bridgen last week) I've realised I am
using a lot of 0MQ terminology that was never clearly defined and
explained. Which of course made the discussion pretty hard.

So, I've spent a couple of days writing it all down. The result can be
found here:

http://www.250bpm.com/concepts

Hopefully, it'll make the discussion more clear and comprehensible.

For reference, here's a similar document for broker-based messaging
(written by Pieter back in 2004 IIRC):

http://www.openamq.org/doc:amqp-background

Martin

Pieter Hintjens

Jul 5, 2011, 1:56:11 PM

to sp-discu...@googlegroups.com

On Tue, Jul 5, 2011 at 7:14 PM, Martin Sustrik <sus...@250bpm.com> wrote:

> So, I've spent a couple of days writing it all down. The result can be found
> here:
> http://www.250bpm.com/concepts

Nice, this is very useful.

-Pieter

Kohei Honda

Jul 5, 2011, 6:27:02 PM

to sp-discu...@googlegroups.com, Nobuko Yoshida

This is nice, I appreciate it as a great step towards reaching deep
and formal articulation of what we are and we shall be doing in this
design space --- I believe this will accelerate this sp-project, since
from the first one of the great things about messaging is it makes
clear what is going on in concurrent and distributed computing.

If I paraphrase one quote, messaging makes (concurrent and
distributed) computation digestible.

kohei

> --
> Note Well: This discussion group is meant to become an IETF working group in
> the future. Thus, the posts to this discussion should comply with IETF
> contribution policy as explained here:
> http://www.ietf.org/about/note-well.html
>

Michael Bridgen

Jul 6, 2011, 7:07:56 AM

to sp-discu...@googlegroups.com

> Sorry for the delay... In the recent discussions (and especially in a
> long IM discussion with Michael Bridgen last week) I've realised I am
> using a lot of 0MQ terminology that was never clearly defined and
> explained. Which of course made the discussion pretty hard.
>
> So, I've spent a couple of days writing it all down. The result can be
> found here:
>
> http://www.250bpm.com/concepts

This is a great explanation, thank you for writing it Martin.

Do you have ideas how ZeroMQ might be adapted to fulfill the design
principles given in the appendix?

For instance, the uniformity principle: "Consider PUB/SUB pattern as
currently implemented in ØMQ. It allows for multiple publishers in the
topology which introduces non-uniformity".

So far as I can tell, the design given at http://www.250bpm.com/pubsub
also suffers from this problem. (Not sure if that comes under "currently
implemented in ØMQ").

Is the implication that a topology should have just one publisher? Or
that the topology name should resolve to a single address for
publishers? (sorry, mixing in terminology, I know ..)

--Michael

Gary Berger

Jul 6, 2011, 12:13:34 PM

to sp-discu...@googlegroups.com

This is great.. I think however the concept of a topology needs to be
dealt with primarily for proper service segmentation.

This is important in most enterprises but also in multi-tenant data
centers which utilize the concept of Central Limit Theorem to properly
size capacity (basically flatten the supply/demand curve).

I am curious what others think about how tightly bound the service is to
topology and the expectations for instance that information cannot be
leaked out of the segmented boundary.

The concept that a "service" is not associated with a specific endpoint
helps to promote the share-nothing[1] architecture people want for
stateless services. Rod Johnson recently has been promoting the need to
embrace data-grid technologies to promote this decoupling and enhance
scalability (he believes, as do I, that this is central to building Platform
as a Service).

Certainly data-grid approaches like Gigaspaces, Coherence, and Gemfire
have a component which includes messaging so maybe there are some
interesting ideas surrounding this space.

As for the name resolution discussion some interesting work has been done
by Van Jacobson, specifically Networking Named Content[2], which the group
might find interesting.

1. http://en.wikipedia.org/wiki/Shared_nothing_architecture
2. http://www.named-data.net/education.html

Fabien Niñoles

Jul 6, 2011, 9:40:19 PM

to sp-discu...@googlegroups.com

2011/7/6 Michael Bridgen <mi...@rabbitmq.com>:

> For instance, the uniformity principle: "Consider PUB/SUB pattern as
> currently implemented in ØMQ. It allows for multiple publishers in the
> topology which introduces non-uniformity".

Although the example of the problem is a little bit subject to
discussion -- you can see the graph as two topologies, one endpoint
(B) participating in both -- the pub/sub pattern is often inherently
broken with regard to at least one of the three principles. To scale
reliably in a uniform network and permit interjection of nodes would
require that all nodes see almost all messages at some
point. Scaling pub/sub would necessarily require some tradeoff at
some point.

Fabien

Martin Sustrik

Jul 8, 2011, 6:01:09 AM

to sp-discu...@googlegroups.com, Michael Bridgen

On 07/06/2011 01:07 PM, Michael Bridgen wrote:

> Do you have ideas how ZeroMQ might be adapted to fulfill the design
> principles given in the appendix?

I think it's important to distinguish 0MQ as a product and the
"scalability" issues that this group is meant to research.

While there's a lot of intersection, 0MQ contains patterns that are
inherently non-scalable (pair) or offer limited scalability (pipeline).

There are various reasons for that: Some people really want to use
non-scalable patterns and there's no legitimate reason not to allow them
to do so. There are backward compatibility reasons. Etc.

As for SP work these reasons don't apply as the goal is to address
scalability per se.

> For instance, the uniformity principle: "Consider PUB/SUB pattern as
> currently implemented in ØMQ. It allows for multiple publishers in the
> topology which introduces non-uniformity".
>
> So far as I can tell, the design given at http://www.250bpm.com/pubsub
> also suffers from this problem. (Not sure if that comes under "currently
> implemented in ØMQ").
>
> Is the implication that a topology should have just one publisher? Or
> that the topology name should resolve to a single address for
> publishers?

I would say there's a need to separate "pub/sub" and "aggregator"
pattern. Pub/sub would have a single publisher and a tree of
subscribers, while aggregator would allow for multiple publishers and
only a single consumer.

Note that both these patterns meet the outlined principles.

Moreover, any topology created using existing 0MQ pub/sub can be broken
to "new pub/sub" and "aggregator" topologies, providing exactly the same
functionality. The only difference is that instead of one big
inconsistent topology, user would be forced to define couple of smaller
consistent topologies.
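
A minimal in-process sketch of one way such a split could look, assuming the aggregator's single consumer doubles as the single publisher of the new pub/sub tree. The names `Aggregator`, `PubSub` and `bridge` are hypothetical and chosen only for illustration; this is plain Python modelling the message flow, not the 0MQ API:

```python
class Aggregator:
    """Fan-in topology: any number of publishers, exactly one consumer."""

    def __init__(self, consumer):
        self.consumer = consumer  # callable invoked once per message

    def send(self, publisher_id, payload):
        # Every publisher's message ends up at the same single consumer.
        self.consumer((publisher_id, payload))


class PubSub:
    """Fan-out topology: exactly one publisher, any number of subscribers."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def publish(self, message):
        # The single publisher distributes each message to all subscribers.
        for callback in self.subscribers:
            callback(message)


def bridge():
    """Chain the two topologies: the aggregator's consumer republishes."""
    fanout = PubSub()
    fanin = Aggregator(consumer=fanout.publish)
    return fanin, fanout


fanin, fanout = bridge()
inbox_a, inbox_b = [], []
fanout.subscribe(inbox_a.append)
fanout.subscribe(inbox_b.append)

# Two independent publishers feed the aggregator.
fanin.send("pub1", "x")
fanin.send("pub2", "y")
```

Both subscribers end up with the messages of both publishers, which is what a multi-publisher pub/sub topology would deliver, but each of the two topologies has a single, well-defined role per endpoint.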

As for resolving the topology names I have no idea. There was almost no
work done in that area, so everybody is free to propose suggestions.

Martin

Martin Sustrik

Jul 8, 2011, 6:42:23 AM

to sp-discu...@googlegroups.com, Gary Berger

Hi Gary,

> This is great.. I think however the concept of a topology needs to be
> dealt with primarily for proper service segmentation.
>
> This is important in most enterprises but also in multi-tenant data
> centers which utilize the concept of Central Limit Theorem to properly
> size capacity (basically flatten the supply/demand curve).

Yes. I believe that's the main point here.

TCP tried to codify the notion of "service" (TCP port), which was only
partly successful for many reasons, not the least being that a TCP
service accounts only for the classic star topology (server & clients, all
communicating on the same port).

We have a chance now to define what service is in a broader fashion.

By providing a strict definition and making services automatically
distinguishable from one another, we are basically providing the network
with information about formal properties of the business logic.

The network, having that information available, can provide all kinds of
smart behaviour that is currently either implemented in applications or
not implemented at all.

> I am curious what others think about how tightly bound the service is to
> topology and the expectations for instance that information cannot be
> leaked out of the segmented boundary.

I have no experience with security, but secure separation of the
topologies (say in multi-tenant environment) sounds like one possible
application of the model.

> The concept that a "service" is not associated with a specific endpoint
> helps to promote the share-nothing[1] architecture people want for
> stateless services. Rod Johnson recently has been promoting the need to
> embrace data-grid technologies to promote this decoupling and enhance
> scalability (he believes, as do I, that this is central to building Platform
> as a Service).
>
> Certainly data-grid approaches like Gigaspaces, Coherence, and Gemfire
> have a component which includes messaging so maybe there are some
> interesting ideas surrounding this space.

GemStone was acquired by VMware IIRC, so the RabbitMQ guys may have a
contact there. However, I've dealt with GemStone in the past and my
feeling was that they are interested in the DB side of things rather
than in networking. I may be wrong though.

> As for the name resolution discussion some interesting work has been done
> by Van Jacobson, specifically Networking Named Content[2], which the group
> might find interesting.

I'll give it a look. Thanks!

Martin

Martin Sustrik

Jul 8, 2011, 6:48:08 AM

to sp-discu...@googlegroups.com, Fabien Niñoles

On 07/07/2011 03:40 AM, Fabien Niñoles wrote:
> 2011/7/6 Michael Bridgen <mi...@rabbitmq.com>:
>> For instance, the uniformity principle: "Consider PUB/SUB pattern as
>> currently implemented in ØMQ. It allows for multiple publishers in the
>> topology which introduces non-uniformity".
>
> Although the example of the problem is a little bit subject to
> discussion -- you can see the graph as two topologies, one endpoint
> (B) participating in both -- the pub/sub pattern is often inherently
> broken with regard to at least one of the three principles. To scale
> reliably in a uniform network and permit interjection of nodes would
> require that all nodes see almost all messages at some
> point. Scaling pub/sub would necessarily require some tradeoff at
> some point.

If pub/sub is constrained to a distribution tree (ie. one publisher, N
consumers) all messages would have to be generated at a single point
which would place a sane cap on message flow. Note that adding more
nodes (intermediaries, consumers) won't add any more messages to the
topology, meaning that each node would have to process at most the
number of messages generated by the ultimate publisher.
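
The bound can be checked with a small simulation. This is only a sketch under stated assumptions: the topology is modelled as a dictionary mapping each node to its children, every node forwards every message (no subscription filtering), and the function name `messages_per_node` is hypothetical:

```python
def messages_per_node(tree, root, n_messages):
    """Count how many messages each node handles in a distribution tree.

    `tree` maps every node to the list of its children; `root` is the
    single publisher that generates all `n_messages` messages.
    """
    handled = {node: 0 for node in tree}
    for _ in range(n_messages):
        pending = [root]
        while pending:
            node = pending.pop()
            handled[node] += 1          # the node processes the message once
            pending.extend(tree[node])  # and forwards it down the tree
    return handled


# One publisher with two direct consumers.
small = {"pub": ["s1", "s2"], "s1": [], "s2": []}

# Same publisher, but with intermediaries and more consumers added.
large = {
    "pub": ["relay1", "relay2"],
    "relay1": ["s1", "s2"],
    "relay2": ["s3", "s4", "s5"],
    "s1": [], "s2": [], "s3": [], "s4": [], "s5": [],
}

small_load = messages_per_node(small, "pub", 100)
large_load = messages_per_node(large, "pub", 100)
```

In both trees no node handles more than the 100 messages the publisher generated; growing the tree increases the total work in the topology but not the load on any single node.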

Martin

Message deleted

Tan Yew Wei

Jul 9, 2011, 5:03:23 AM

to sp-discu...@googlegroups.com

Great article!

I have 2 questions regarding the statements:

> Actual endpoint(s) to receive the message are selected in transparent manner by ØMQ.
> If you want to send data to specific endpoint you should use TCP or similar protocol.
> If you want to send it to the topology and let topology to decide on the destination, you should use ØMQ.

with respect to the use-case scenario of mobile devices.

In this case, we have endpoints that are:

(a) rapidly changing - IP addresses would change say from going from the office Wi-fi to the Wi-fi down at Starbucks

(b) Likely behind NATs

So the questions are then:

(1) How does ØMQ actually determine the endpoints so efficiently?

(2) What are the underlying assumptions made in order to perform such routing?

Does this even work through NATs? (via some hole-punching mechanism or otherwise)

I've read the guide, as well as all the articles on the site and have not found much updated info (many deprecated articles, like the one on the chat and vidconferencing), but I suspect that the answer to my question is in the code. It would be great if you could point me to the areas where such functionality is defined.

Incidentally, P2P communication through NATs is a prime UDP use-case, given that NAT-traversal techniques like those described in the IETF's STUN, TURN, and ICE work with much higher probability over UDP than over TCP.

Thanks for the time!

Tan Yew Wei

Jul 10, 2011, 12:11:12 AM

to sp-discu...@googlegroups.com

Just finished reading the "What is 'Publish'" thread, and I think a better way to phrase my question (1) above is:

Given rapidly changing endpoints, what underlying mechanisms does ØMQ use to maintain topology?

Rapidly changing can mean as low as every 5 minutes.

(2) would then be:

Does this then work through NATs? Why or why not?

Understanding the routing mechanism (somewhere in the code) would definitely aid in answering this.

Hopefully that's a little clearer. =)

Martin Sustrik

Jul 10, 2011, 1:20:53 AM

to sp-discu...@googlegroups.com, Tan Yew Wei

On 07/10/2011 06:11 AM, Tan Yew Wei wrote:

> Given rapidly changing endpoints, what underlying mechanisms does ØMQ
> use to maintain topology?
> Rapidly changing can mean as low as every 5 minutes.

I am not an expert on mobile applications. However, given that TCP is
used underneath, I would assume that if the IP address changes abruptly
(because you walk from the office to Starbucks) the other end won't be
notified about the old address disappearing in less than 2 hours (see
the TCP keepalive spec). As for the mobile end, I have no idea what happens.
However, even if a disconnect notification is issued by the OS and the
topology is reestablished from the new IP address, the old lingering TCP
connection is not going to go away.
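
For illustration, the detection delay on the stationary end can usually be shortened per socket. The sketch below uses Python's standard `socket` module; the helper name `enable_fast_keepalive` and the timing values are assumptions, and the `TCP_KEEPIDLE`, `TCP_KEEPINTVL` and `TCP_KEEPCNT` options are platform-specific (available on Linux), hence the `hasattr` guards. It only makes a dead peer noticeable sooner; it does nothing about the IP address change itself:

```python
import socket


def enable_fast_keepalive(sock, idle=60, interval=10, probes=5):
    """Turn on TCP keepalives with shorter timers than the OS default.

    idle:     seconds of silence before the first probe is sent
    interval: seconds between unanswered probes
    probes:   unanswered probes before the connection is declared dead
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The fine-grained timers are not available on every platform.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    if hasattr(socket, "TCP_KEEPCNT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)
    return sock


sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_fast_keepalive(sock)
keepalive_on = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE)
sock.close()
```

With the values above, and where the platform honours them, a vanished peer would be detected after roughly `idle + interval * probes` seconds instead of hours.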

If you have any thoughts of how to make SP robust on mobile devices,
feel free to make suggestions.

Taking a risk of appearing stupid: Isn't this an L3 or L4 problem,
rather than a messaging problem?

Martin

Tan Yew Wei

Jul 12, 2011, 3:42:00 AM

to sp-discu...@googlegroups.com, Tan Yew Wei

Sorry for the late reply, was busy with some stuff.

Yes, you're right about the point on TCP (the fix would be out of scope).

You are also right that it is traditionally an L4 problem, but I'm playing around with some ideas from other IETF drafts like RELOAD that could potentially fix this in the App layer.

In any case, thanks for the reply. I think I'll poke around and experiment a bit, starting with getting ZMQ working with UDP (yup, I want the unreliability =P)
