[SP] what is 'publish'

Alexis Richardson

Jun 28, 2011, 5:59:29 AM

to sp-discu...@googlegroups.com

Question(s)
----------------

Do all the pubsub approaches agree on what it means to publish?

Do you publish to:

- a destination or 'final address'
- an address
- a topic
- a name?

Can you publish if these do not (yet) exist?

How does publishing relate to having an authenticated connection in place?

Can you publish without making a network hop?

alexis

Michael Bridgen

Jun 28, 2011, 7:23:14 AM

to sp-discu...@googlegroups.com

> Do all the pubsub approaches agree on what it means to publish?
>
> Do you publish to:
>
> - a destination or 'final address'
> - an address
> - a topic
> - a name?

Alternatively:

* Topics v Addresses v Locations

In XEP 0060 (XMPP pub/sub) these are all distinct: you connect to your
server (address), and publish to a topic at a location.

Some systems identify two or more of these. For instance, there's no
distinction between addresses and locations in 0MQ; there's (usually) no
notion of location in DHT-based pub/sub.

I'm not sure how this can be represented as a feature (and in particular
as a feature scale -- no support, some support, full support).

> Can you publish if these do not (yet) exist?

Some systems require topics to be declared, e.g., by "advertisement".
Again, not sure if this can be put on a scale.

> How does publishing relate to having an authenticated connection in place?

There ought to be some representation of confidentiality and
authorisation in the feature list -- perhaps under "trust model".

> Can you publish without making a network hop?

Do you mean "can you consign publications to a local agent?" (e.g., MSMQ
works like this)

Alexis Richardson

Jun 28, 2011, 7:26:10 AM

to sp-discu...@googlegroups.com

To your last question: yes, that is what I mean.

--
Note Well: This discussion group is meant to become an IETF working group in the future. Thus, the posts to this discussion should comply with IETF contribution policy as explained here: http://www.ietf.org/about/note-well.html

Martin Sustrik

Jun 28, 2011, 9:55:42 AM

to sp-discu...@googlegroups.com, Michael Bridgen

On 06/28/2011 01:23 PM, Michael Bridgen wrote:

> Do you mean "can you consign publications to a local agent?" (e.g., MSMQ
> works like this)

Additionally, 0MQ allows for messaging within a single process, which
eventually leads to an Erlang-style system of processes and mailboxes. This
is a useful feature and should be kept in mind IMO.

Martin

Tony Garnock-Jones

Jun 28, 2011, 11:12:51 AM

to sp-discu...@googlegroups.com

On 28 June 2011 05:59, Alexis Richardson <ale...@rabbitmq.com> wrote:

> Do you publish to:
> - a destination or 'final address'
> - an address
> - a topic
> - a name?

I think the different approaches all do the same thing while using different terminology here. They all have publications take an indication of target, somehow resolve that target into zero or more destinations, and forward the publication to those destinations.

The terminology Day[1] uses is that the label offered for publication along with a message by a publisher is called a "name"; the resolution process uses a "routing information base"; and the destinations to which the message is ultimately forwarded are "addresses".

Different systems have different functions and data types at each step; for instance,

- some systems will use DNS to resolve a domain name into a set of IP addresses
- IP uses local routing tables to resolve an IP address into a specific interface
- AMQP uses exchange definitions and binding tables to resolve an exchange name and routing key to a collection of queue names
- naive unswitched Ethernet "resolves" a MAC address (a "name" in Day's terminology) into the collection of physical interfaces physically present on the wire at the time of publication (where the filtering of received packets by the locally-configured MAC address can be seen as the operation of a distributed query algorithm)
- switched Ethernet resolves a MAC address to a collection of ports on the switch
- etc.

They all have the same structure.
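
That shared structure can be sketched in a few lines of Python. This is only an illustration: the `RoutingInformationBase` class, its `bind` and `resolve` methods, the `publish` function and the example names and addresses are all hypothetical, and are not taken from any of the systems listed above.

```python
from collections import defaultdict


class RoutingInformationBase:
    """Hypothetical table mapping a name to zero or more addresses."""

    def __init__(self):
        self._table = defaultdict(set)

    def bind(self, name, address):
        self._table[name].add(address)

    def resolve(self, name):
        # An unknown name resolves to no destinations instead of raising.
        return sorted(self._table.get(name, ()))


def publish(rib, name, message, deliver):
    """Resolve the name, then forward the message to every resulting address."""
    addresses = rib.resolve(name)
    for address in addresses:
        deliver(address, message)
    return addresses
```

Whether the table is DNS, a switch's port map or a set of AMQP bindings, only the contents of `resolve` would change in this sketch; a name that resolves to nothing is still a valid publication.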

> How does publishing relate to having an authenticated connection in place?

This is a great question. Or rather, it's similar in form to something I consider to be a great question :-) i.e. "How does publishing relate to having a connection in place?" But I think addressing that question might be off-topic here.

> Can you publish without making a network hop?

I think this is a confused question because it has a definition of "network" in mind. Is a network something with an ethernet involved? Does the loopback interface count? As Martin pointed out, 0MQ supports in-process messaging: is the routing table there a kind of network? What about communication between Erlang processes? What about shared memory on a single-CPU system? On a NUMA machine?

I think the question could be used in two ways: either to disqualify certain operations as publication because they don't fit a certain definition of network, or to define the term network as something that can be used to publish. The former I don't think is useful :-) but the latter might have some use. It's verging on sketchy philosophy though.

Regards,
Tony

[1] J. Day, Patterns in Network Architecture: A Return to Fundamentals, Prentice Hall, 2008.

Alexis Richardson

Jun 28, 2011, 11:39:32 AM

to sp-discu...@googlegroups.com

if I send an email to b...@messaging.org then is that a 'publish'?

Tony Garnock-Jones

Jun 28, 2011, 12:23:37 PM

to sp-discu...@googlegroups.com

On 28 June 2011 11:39, Alexis Richardson <ale...@rabbitmq.com> wrote:

> if I send an email to b...@messaging.org then is that a 'publish'?

I think so. What might disqualify it from being a "publish"?

Tony

Alexis Richardson

Jun 28, 2011, 12:31:00 PM

to sp-discu...@googlegroups.com

Well suppose b...@messaging.org is subscribed to dis...@lists.com. I
email the latter list, and an email is then sent to b...@messaging.org.
Is the latter operation a publish?

More generally is delivery to subscribers publish?

Do we desire symmetries of this kind?

> Tony

Ian Barber

Jun 28, 2011, 12:34:06 PM

to sp-discu...@googlegroups.com

On Tue, Jun 28, 2011 at 5:31 PM, Alexis Richardson <ale...@rabbitmq.com> wrote:

> Well suppose b...@messaging.org is subscribed to dis...@lists.com. I
> email the latter list, and an email is then sent to b...@messaging.org.
> Is the latter operation a publish?
>
> More generally is delivery to subscribers publish?
>
> Do we desire symmetries of this kind?

I think to distinguish publishing from just general messaging there has to be some notion of indirection - e.g. publishing denotes messaging that goes to some kind of abstract representation which only then results in messages being sent to various concrete locations (or similar with better wording). In that case you would publish when you emailed the list, but not if directly emailing bob.

Ian

Tony Garnock-Jones

Jun 28, 2011, 12:57:23 PM

to sp-discu...@googlegroups.com

On 28 June 2011 12:34, Ian Barber <ian.b...@gmail.com> wrote:

> I think to distinguish publishing from just general messaging there has to be some notion of indirection - e.g. publishing denotes messaging that goes to some kind of abstract representation which only then results in messages being sent to various concrete locations (or similar with better wording). In that case you would publish when you emailed the list, but not if directly emailing bob.

Perhaps this is something that disqualifies mailing-lists from being "pub sub": that the names li...@example.com and the addresses to...@example.com are held in the same directory at the same level. All of the other systems map from names in one space to addresses in the space below.

Or, a different way of looking at it: with respect to the relay (the software behind li...@example.com), there is a publish operation (the inbound leg) and a deliver operation (the outbound leg), and the same can be said with respect to the destination (to...@example.com) except that *there* the deliver operation ends up delivering the message to a wetware address in a different namespace: the end-user's eyeballs.

This kind of stuff is what I'm directly interested in researching, but afaik there are no good answers yet, so it might be best to place it in the "philosophy" basket for the purposes of this group.

--
Tony Garnock-Jones
tonygarn...@gmail.com
http://homepages.kcbbs.gen.nz/tonyg/

Alexis Richardson

Jun 28, 2011, 2:06:45 PM

to sp-discu...@googlegroups.com

On Tue, Jun 28, 2011 at 5:57 PM, Tony Garnock-Jones
<tonygarn...@gmail.com> wrote:
>
> Perhaps this is something that disqualifies mailing-lists from being "pub
> sub": that the names li...@example.com and the addresses to...@example.com
> are held in the same directory at the same level. All of the other systems
> map from names in one space to addresses in the space below.

IMO the right pubsub model should be implementable using email,
perhaps with some added bits. I'm not convinced that topics and
destinations being in one syntactic space rules this out.

Martin Sustrik

Jun 28, 2011, 2:34:40 PM

to sp-discu...@googlegroups.com, Tony Garnock-Jones

On 06/28/2011 05:12 PM, Tony Garnock-Jones wrote:

> "How does publishing relate to
> having a *connection* in place?" But I think addressing that question
> might be off-topic here.

I think it's very much on-topic, actually, it's the crucial question.

The answer is one of the couple of important lessons I've learned during
N years in messaging business. It's not very intuitive, so the
explanation may be a bit lengthy, so please bear with me.

I believe the crucial principle here is strict separation between
"topology establishment" or "wiring" if you wish and actual routing of
messages within the topology.

"Topology establishment" means creating a graph of nodes connected by links.

"Routing" means moving messages through this graph.

Let's illustrate the principle on the mailing list example. For
simplicity, let's assume the mailing list requires being subscribed
before you can post messages to it.

So, when you want to post a message you first subscribe to the list
("join the topology"), then you send the message ("routing").

Note that you need to do the first step once only and then you can send
an arbitrary number of messages, i.e. the steps are orthogonal.

I believe the mess we have with addressing in the messaging area is
caused by conflating the two concepts (eg. when mailing list doesn't
require a subscription to be posted to). Actually, the conflation is
almost inevitable in single-broker scenarios. Clear separation emerges
only as you move to more complex topologies. Think of how the AMQP
federation has to be set up using special config files.

How does that affect the concept of addressing?

A. You definitely need an "address" to join the topology. That can be
either address(es) of adjacent peers or an abstract name to be resolved by
some name resolution service.

B. When sending a message, you don't need an address[1]. You send the
message to a particular topology and the topology does the routing for
you. If it's a "broadcast" topology [2], it will deliver message to all
nodes in the topology, if it's a "load-balancing" topology, it will
deliver the message to one of the nodes etc.

In summary: You need an address to join a topology. You need no address
to send a message.

Comments:

[1] You definitely need some way to reference the topology you are
sending the message to, but that doesn't need to be an address. For
example, in 0MQ it's a simple file descriptor created when you join the
topology.

[2] Note that with PUB/SUB and subscriptions the message is routed
depending on its content. However, that doesn't mean that the message
contains an address. Rather, the message contains business data and the
topology is able to make smart routing decisions based on that business data.
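
The separation between joining and sending might look like this in code. It is a minimal sketch and not 0MQ's API: the `Topology` class, the `join` and `send` functions and the peer addresses are all hypothetical, and the handle here is just a Python object standing in for something like a file descriptor.

```python
import itertools


class Topology:
    """Hypothetical topology: joined once by address, then used via a handle."""

    def __init__(self, peers, pattern):
        if pattern not in ("broadcast", "load-balancing"):
            raise ValueError(f"unknown pattern: {pattern}")
        self.peers = list(peers)
        self.pattern = pattern
        self._next_peer = itertools.cycle(self.peers)

    def route(self, message):
        # The topology, not the sender, decides who receives the message.
        if self.pattern == "broadcast":
            return [(peer, message) for peer in self.peers]
        return [(next(self._next_peer), message)]


def join(peer_addresses, pattern):
    """Wiring step: this is the only place an address is needed."""
    return Topology(peer_addresses, pattern)


def send(handle, message):
    """Routing step: takes a local handle and a message, and no address."""
    return handle.route(message)
```

Note that `send` never sees an address; everything address-like is consumed by `join`.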

Martin

Tony Garnock-Jones

Jun 28, 2011, 3:32:09 PM

to Martin Sustrik, sp-discu...@googlegroups.com

On 28 June 2011 14:34, Martin Sustrik <sus...@250bpm.com> wrote:

> On 06/28/2011 05:12 PM, Tony Garnock-Jones wrote:
>
>> "How does publishing relate to
>> having a *connection* in place?" But I think addressing that question
>> might be off-topic here.
>
> I think it's very much on-topic, actually, it's the crucial question.

I have been thinking of it in terms of "where does the necessary state for flow control and error correction live"? Some kind of connection-like thing is needed to provide a place to hang that state. (See DELTA-T, TCP, AMQP 0-10 sessions (I'm so sorry), AMQP 1.0 links (?) etc.)

> So, when you want to post a message you first subscribe to the list ("join the topology"), then you send the message ("routing").

But only some mailing lists require you to be subscribed in order to post. Perhaps there are *three* things going on:

* subscription (topology establishment)
* permission to publish
* publication (incl. routing in the middle and delivery out the other side)

> I believe the mess we have with addressing in the messaging area is caused by conflating the two concepts (eg. when mailing list doesn't require a subscription to be posted to). Actually, the conflation is almost inevitable in single-broker scenarios. Clear separation emerges only as you move to more complex topologies. Think of how the AMQP federation has to be set up using special config files.

Urk. I'd rather not :-) But do you think that "permission to publish" and "becoming a subscriber" are usefully separable? Is something subscriptionesque lost if I rephrase what you wrote above to be:

So, when you want to post a message you first obtain permission, then you send the message ("routing").

? (Thinking further on this: maybe something is lost after all? See below.)

> A. You definitely need an "address" to join the topology. That can be either address(es) of adjacent peers or an abstract name to be resolved by some name resolution service.

(In general, it's the latter: I haven't moved far from Day's "addresses are names in the network one layer down" complete with possible routing information base and translation to an address another layer further away)

I agree with you that you need an "address" to subscribe to some source. I am not so sure about needing an address to inject messages into a topology. For that it seems you simply need permission. (Perhaps we need to be clear about which layers we mean; the two notions of address are almost certainly related to different networks/data-sources.)

> B. When sending a message, you don't need an address[1]. You send the message to a particular topology and the topology does the routing for you. If it's a "broadcast" topology [2], it will deliver message to all nodes in the topology, if it's a "load-balancing" topology, it will deliver the message to one of the nodes etc.

Totally agreed.

Back to connections and where to put the state. There is something fishy going on with the notion of "permission to publish". It's like a kind of subscription by the routing-topology to the things the publisher might have to say!

(Publisher): "Hey, I have things for you to route and deliver"
(Router): "I am fascinated and wish to subscribe to your newsletter"
(Publisher): [messages]

Complete with credit, acking etc. Curiously, here's where your footnote 1 from the message I'm replying to comes in: the subscription request sent by the router to the publisher has to include the address (not name—that's the query) to which to send deliveries intended for routing and further delivery. So when the publisher finally has something to send through the router, it builds a packet with an envelope-to pointing to the address the router subscribed with, and with a message-body containing a blob of data within which are routing instructions and a further encapsulated payload, both of which are intended for the router to examine and interpret in relaying further to the router's own subscribers. An outbound message then has two address-ish things in it, one for each "layer" the message traverses.
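
To make the two address-ish things concrete, here is a rough sketch of the nesting. The `Packet` and `RoutedBody` types, the `publisher_send` and `router_relay` functions and the binding table are hypothetical names invented for illustration; credit and acking are left out entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoutedBody:
    routing_name: str  # name for the router's own network to resolve
    payload: bytes     # encapsulated payload for the final subscribers


@dataclass(frozen=True)
class Packet:
    envelope_to: str   # address the router subscribed with
    body: RoutedBody


def publisher_send(router_address, routing_name, payload):
    """Publisher side: wrap routing instructions and payload in an envelope."""
    return Packet(router_address, RoutedBody(routing_name, payload))


def router_relay(packet, bindings):
    """Router side: interpret the inner name and fan out to its subscribers."""
    addresses = bindings.get(packet.body.routing_name, [])
    return [(address, packet.body.payload) for address in addresses]
```

The outer `envelope_to` only gets the packet as far as the router; the inner `routing_name` is meaningful only to the router's own lookup.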

Regards,
Tony

Martin Sustrik

Jun 29, 2011, 3:15:22 AM

to sp-discu...@googlegroups.com

Hi Tony,

>
> On 06/28/2011 05:12 PM, Tony Garnock-Jones wrote:
>
> "How does publishing relate to
> having a *connection* in place?" But I think addressing that
> question
> might be off-topic here.
>
>
> I think it's very much on-topic, actually, it's the crucial question.
>
>
> I have been thinking of it in terms of "where does the necessary state
> for flow control and error correction live"? Some kind of
> connection-like thing is needed to provide a place to hang that state.
> (See DELTA-T, TCP, AMQP 0-10 sessions (I'm so sorry), AMQP 1.0 links (?)
> etc.)

Right. That's something I wasn't speaking about but it's just another
facet of the same problem.

>
> So, when you want to post a message you first subscribe to the list
> ("join the topology"), then you send the message ("routing").
>
> But only *some* mailing lists require you to be subscribed in order to
> post. Perhaps there are *three* things going on:
>
> * subscription (topology establishment)
> * permission to publish
> * publication (incl. routing in the middle and delivery out the
>   other side)
>
> I believe the mess we have with addressing in the messaging area is
> caused by conflating the two concepts (eg. when mailing list doesn't
> require a subscription to be posted to). Actually, the conflation is
> almost inevitable in single-broker scenarios. Clear separation
> emerges only as you move to more complex topologies. Think of how
> the AMQP federation has to be set up using special config files.
>
>
> Urk. I'd rather not :-) But do you think that "permission to publish"
> and "becoming a subscriber" are usefully separable?

Definitely. But that was not my point.

The point was that whatever you are doing, whether you are a publisher or
a consumer, whether you are doing pub/sub, req/rep, ESB, simple load
balancing or whatever exotic kind of messaging pattern you can think of,
the first thing you have to do is to specify who your peers are, in
other words, which topology you belong to.

Think of, say, NASDAQ stock quotes. JPMC wants them, GS wants them, my
grandma doesn't want them. There has to be some way to specify that.

Think of, say, image transformation service, box A provides it, so does
box B, but not so box C. It's a standard req/rep, there are no
subscriptions, but still, we have to specify somehow that A & B belong
to the topology whereas C does not.

> Is something
> subscriptionesque lost if I rephrase what you wrote above to be:
>
> So, when you want to post a message you first *obtain permission*,
> then you send the message ("routing").

You can rephrase it that way, but then the essence of "specifying which
service/topology you want to be part of" is lost.

> ? (Thinking further on this: maybe something /is/ lost after all? See
> below.)
>
> A. You definitely need an "address" to join the topology. That can
> be either address(es) of adjacent peers or an abstract name to be
> resolved by some name resolution service.
>
>
> (In general, it's the latter: I haven't moved far from Day's "addresses
> are names in the network one layer down" complete with possible routing
> information base and translation to an address another layer further away)
>
> I agree with you that you need an "address" to subscribe to some source.
> I am not so sure about needing an address to inject messages into a
> topology.

By address I meant specifying the topology to publish to, i.e. whether
the message you publish is a NASDAQ stock quote or a request
for processing an image. You surely need to specify that.

As I already said, this distinction is not clear with a single broker
setup where all the services are located in a single place (the broker).
However, once you start thinking in Internet terms, it's pretty clear
that to publish to NASDAQ stock quote topology you would have to connect
to nasdaq.com while to process a picture you would connect to pixar.com.

> For that it seems you simply need permission. (Perhaps we need
> to be clear about which layers we mean; the two notions of address are
> almost certainly related to different networks/data-sources.)
>
> B. When sending a message, you don't need an address[1]. You send
> the message to a particular topology and the topology does the
> routing for you. If it's a "broadcast" topology [2], it will deliver
> message to all nodes in the topology, if it's a "load-balancing"
> topology, it will deliver the message to one of the nodes etc.
>
>
> Totally agreed.
>
> Back to connections and where to put the state. There is something fishy
> going on with the notion of "permission to publish". It's like a kind of
> subscription by the routing-topology to the things the publisher might
> have to say!
>
> (Publisher): "Hey, I have things for you to route and deliver"
> (Router): "I am fascinated and wish to subscribe to your newsletter"
> (Publisher): [messages]

Yes. I would say so.

> Complete with credit, acking etc. Curiously, here's where your footnote
> 1 from the message I'm replying to comes in: the subscription request
> sent by the router to the publisher has to include the address (not
> name -- that's the query) to which to send deliveries intended for routing
> and further delivery.

Once you've joined a specific topology, there's no address needed IMO.
Topology has precise business logic. It's either NASDAQ stock quote
distribution tree or image processing cluster. Never both. Thus, what
you send to image processing cluster is definitely an image processing
request, there's no need to make the fact double-clear by adding an
address field to the request as such.

As for comment [1], it was just about a single application taking part in
multiple topologies. Say, if your app visualises stock information you
need both NASDAQ stock quotes and image processing. Thus, when sending a
message, you need some kind of handle to say which topology you are
sending it to, whether to nasdaq.com or pixar.com. However, the handle
is a local construct (e.g. a file descriptor) and has no counterpart in
the wire protocol.

> So when the publisher finally has something to
> send through the router, it builds a packet with an envelope-to pointing
> to the address the router subscribed with, and with a message-body
> containing a blob of data within which are routing instructions and a
> further encapsulated payload, both of which are intended for the router
> to examine and interpret in relaying further to the router's own
> subscribers. An outbound message then has two address-ish things in it,
> one for each "layer" the message traverses.

Martin

Tony Garnock-Jones

Jun 29, 2011, 9:26:09 AM

to sp-discu...@googlegroups.com

Hi Martin,

I think there's something I've not stated: I have been viewing the system as made of up to *three*[1] possibly disjoint networks:

1. The network a publisher uses to address and transmit a message to the service topology for that topology to route to its subscribers.
2. The internal network formed by the routing topology itself.
3. The transport network used at the egress points of the routing topology to deliver messages to subscribers.

Each of these three has a concept of name, for messages submitted to the network; of query, for associating names with addresses; of address, for delivering routed messages. Furthermore, each may have its own distinct languages for expressing names, queries and addresses.

(Of course, any or all of these might actually be the same network. This touches on yesterday's discussion of mailing-lists and whether they are in or out of scope.)

Please forgive me if this message is getting too far off track or repeating points that were already clear. I think this model (heavily indebted to John Day's book) is valuable for placing structure on networks generally, so could be useful for pubsub, but if I'm off in the weeds, please do let me know!

On 29 June 2011 03:15, Martin Sustrik <sus...@250bpm.com> wrote:

> Think of, say, NASDAQ stock quotes. JPMC wants them, GS wants them, my grandma doesn't want them. There has to be some way to specify that.

This is a "downstream" arrangement: JPMC & GS want to receive messages from NASDAQ. This is what I've been thinking of as management of subscriptions. Networks 1 and 3 are the internet. Network 2 is the topic tree within NASDAQ. A net-3 name is a net-2 address. Subscribers bind their net-3-names/net-2-addresses to a net-2 wildcard/regex name. People publishing quotes do so by sending a packet to a net-1 name, which is routed to a net-1 address (which in this instance is not the same as a net-2 name). The packet contains the ticker symbol as the net-2 name, which traverses the topic-tree of net-2 queries, resulting in a collection of net-2 addresses, which in this instance are net-3 names. NASDAQ submits a message into net-3 for delivery to the subscribers.
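
As a rough sketch of the net-2 step in that walkthrough: subscribers bind their net-3 names to wildcard net-2 names, and a ticker symbol resolves to the matching set. The `TopicTree` class, its methods and the host names are hypothetical, and shell-style wildcards via `fnmatch` stand in for whatever pattern language a real topic tree would use.

```python
from fnmatch import fnmatchcase


class TopicTree:
    """Hypothetical net-2: maps a ticker (net-2 name) to net-3 names."""

    def __init__(self):
        self._bindings = []  # (wildcard pattern, subscriber's net-3 name)

    def subscribe(self, pattern, net3_name):
        self._bindings.append((pattern, net3_name))

    def resolve(self, ticker):
        # The net-2 addresses produced here are names in net-3.
        return sorted(
            {name for pattern, name in self._bindings
             if fnmatchcase(ticker, pattern)}
        )


tree = TopicTree()
tree.subscribe("*", "feeds.jpmc.example")     # wants every quote
tree.subscribe("GOO*", "feeds.gs.example")    # wants a subset
```

The publisher never touches `tree` in this sketch; it would only need the net-1 name of the service that owns it.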

Now here's a point I was confused about earlier: do you see a need for publishers-of-quotes (sending them into NASDAQ for distribution) to have any kind of relationship with net-2? I don't, but I *do* think it needs to know the net-1 name for the quote routing service.

> Think of, say, image transformation service, box A provides it, so does box B, but not so box C. It's a standard req/rep, there are no subscriptions, but still, we have to specify somehow that A & B belong to the topology whereas C does not.

This is still a "downstream" arrangement (but there's a wrinkle): A and B want to receive service requests for service X. Let's imagine there's a load-balancing box somewhere: box N. Network 1 is the public internet. Network 2 is the routing table inside box N. Network 3 is the protected firewalled rfc-1918 net where A and B live. A net-3 name is a net-2 address. A and B register with box N, and in doing so implicitly subscribe their net-3-names/net-2-addresses to the sole, trivial net-2 name available in our (assumed) dumb load balancer box N. Service requests are injected into the net-1, the internet, using a net-1 name (DNS name) which is resolved (by the network, not by the requestor) into a net-1 address (IP address). Since there's only the one net-2 name at the load-balancer, the net-2 name doesn't need to be explicitly mentioned in any packet. Box N chooses one of A or B according to its own routing logic, and injects a fresh message into net-3 with the net-3 name of A or B. Network 3 takes over from there.

The wrinkle is that there's a reply name, carried (formally) in the payload of the request message. Since names only make sense within a given network, we have to specify somehow which network the reply name applies to. In this case it's net-1, where the service request originated. This means that A and B, present at least within net-3, have to also have the ability to inject packets addressed to a net-1 name into net-1.
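
A small sketch of box N and the reply-name wrinkle, under stated assumptions: the `LoadBalancer` and `Request` types are hypothetical, round-robin is just one possible "own routing logic", and the RFC 1918 addresses and host names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    reply_to: str  # a net-1 name, carried in the payload
    body: str


class LoadBalancer:
    """Hypothetical box N with a single, implicit net-2 name."""

    def __init__(self):
        self._workers = []  # net-3 names, which serve as net-2 addresses
        self._cursor = 0

    def register(self, net3_name):
        # Registering implicitly subscribes to the one net-2 name there is.
        self._workers.append(net3_name)

    def route(self, request):
        if not self._workers:
            raise LookupError("no workers registered")
        worker = self._workers[self._cursor % len(self._workers)]
        self._cursor += 1
        # The reply name passes through untouched: it belongs to net-1.
        return worker, request
```

The worker that receives the request still has to be able to inject a packet into net-1 to use `reply_to`; nothing in box N helps with that.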

> You can rephrase it that way, but then the essence of "specifying which service/topology you want to be part of" is lost.

It's this idea that's still slippery for me. Could you try phrasing things in terms of net-1 and net-2 membership?

> By address I meant specifying the topology to publish to. Ie. whether the message you publish is a NASDAQ stock quote or whether it is request for processing an image. You surely need to specify that.

Yep, definitely: that's what I've been calling the net-1 name for the service, above.

> Topology has precise business logic. It's either NASDAQ stock quote distribution tree or image processing cluster. Never both. Thus, what you send to image processing cluster is definitely an image processing request, there's no need to make the fact double-clear by adding an address field to the request as such.

Certainly there's no need, for the image processing service, to explicitly say the name "image processing service" in the bits-on-the-wire intended to be routed through net-2, since there's just the one possibility. It *is* present *logically*, though, in the model I've sketched above: it's just that the net-2 concerned—box N, the load-balancer—has a trivially simple namespace/queryspace in which there's just the one possible name. (In the NASDAQ case, the net-2 name is the ticker symbol, and does have to be explicitly mentioned on the wire.)

Regards,

Tony

[1]: Of course three is not a reasonable number. So really there are more networks both up- and down-stream of the scenarios discussed here: one within the JVM sending service requests, another within the Python instance implementing the service, etc. (Interestingly, in both these networks, names and addresses are identified: they're both object pointers! Research topic: what does a programming language look like where object names are disjoint from object addresses?)

Michael Bridgen

Jun 29, 2011, 1:32:42 PM

to sp-discu...@googlegroups.com

> I think there's something I've not stated: I have been viewing the
> system as made of up to *three*[1] possibly disjoint networks:
>
> 1. The network a publisher uses to address and transmit a message to
>    the service topology for that topology to route to its subscribers.
> 2. The internal network formed by the routing topology itself.
> 3. The transport network used at the egress points of the routing
>    topology to deliver messages to subscribers.
>
> Each of these three has a concept of name, for messages submitted to the
> network; of query, for associating names with addresses; of address, for
> delivering routed messages. Furthermore, each may have its own distinct
> languages for expressing names, queries and addresses.
>
> (Of course, any or all of these might actually be the same network. This
> touches on yesterday's discussion of mailing-lists and whether they are
> in or out of scope.)
>
> Please forgive me if this message is getting too far off track or
> repeating points that were already clear. I think this model (heavily
> indebted to John Day's book) is valuable for placing structure on
> networks generally, so could be useful for pubsub, but if I'm off in the
> weeds, please do let me know!

I don't know if this is what you're getting at, Tony, but it seems
there are two things going on: resolution (hostname to IP address, key to
queues), and enrolment (connection, subscription, query).

Interestingly, AMQP has two internal networks: there's exchange
resolution (exchange to directory of bindings), and routing key
resolution (key to queues).
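
A sketch of those two resolution steps, assuming exact-match bindings only (roughly what a direct exchange does; fanout and topic exchanges would need different matching). The `resolve` function, the nested-dict layout and the exchange and queue names are hypothetical.

```python
def resolve(exchanges, exchange_name, routing_key):
    """Two-step resolution: exchange -> binding table, then key -> queues."""
    bindings = exchanges.get(exchange_name, {})   # step 1: exchange resolution
    return sorted(bindings.get(routing_key, ()))  # step 2: routing key resolution


# Hypothetical state: one exchange whose directory of bindings maps keys to queues.
exchanges = {
    "quotes": {"GOOG": {"q.jpmc", "q.gs"}, "MSFT": {"q.gs"}},
}
```

Enrolment in this picture would be whatever adds entries to `exchanges`; resolution only reads it.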

Hermes has three layers of network: the routing overlay, topic routing,
and content routing. All have analogues of enrolment and resolution.

> On 29 June 2011 03:15, Martin Sustrik <sus...@250bpm.com

The wrinkle is very interesting. If you break down a subscription into a
query *and* a destination, it makes more sense. In most cases the
destination is implicitly "back down this pipe"; but, in the case of an
overlay network say, there's a reverse path being built (either
explicitly in the subscription, or implicitly in local state at each hop).

Perhaps this is inappropriate generalisation, but it opens the way for
other arrangements: "please enroll with this pattern, and forward
messages over there"; "once you're done with this message, send it on to
..." etc.
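
Splitting a subscription into a query and a destination could look roughly like the sketch below. The `Subscription` and `Node` classes, the pipe names and the "archive" destination are all hypothetical; a missing destination is taken to mean "back down this pipe".

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Subscription:
    query: Callable[[str], bool]        # which messages are wanted
    destination: Optional[str] = None   # None means "back down this pipe"

class Node:
    def __init__(self):
        self.subscriptions = []   # (arrival pipe, Subscription) pairs
        self.sent = []            # (destination, message) pairs, for inspection

    def enrol(self, pipe, sub):
        self.subscriptions.append((pipe, sub))

    def publish(self, message):
        for pipe, sub in self.subscriptions:
            if sub.query(message):
                # Fall back to the pipe the subscription arrived on.
                self.sent.append((sub.destination or pipe, message))

node = Node()
is_nasdaq = lambda m: m.startswith("NASDAQ.")
node.enrol("pipe-1", Subscription(is_nasdaq))
node.enrol("pipe-2", Subscription(is_nasdaq, destination="archive"))
node.publish("NASDAQ.CSCO 15.20")
print(node.sent)
```

The second enrolment is the "forward messages over there" case: the subscriber on pipe-2 never receives the message itself.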

> You can rephrase it that way, but then the essence of "specifying
> which service/topology you want to be part of" is lost.
>
>
> It's this idea that's still slippery for me. Could you try phrasing
> things in terms of net-1 and net-2 membership?
>
> By address I meant specifying the topology to publish to. Ie.
> whether the message you publish is a NASDAQ stock quote or whether
> it is request for processing an image. You surely need to specify that.
>
>
> Yep, definitely: that's what I've been calling the net-1 name for the
> service, above.
>
> Topology has precise business logic. It's either NASDAQ stock quote
> distribution tree or image processing cluster. Never both. Thus,
> what you send to image processing cluster is definitely an image
> processing request, there's no need to make the fact double-clear by
> adding an address field to the request as such.
>
>
> Certainly there's no need, for the image processing service, to
> explicitly say the name "image processing service" in the
> bits-on-the-wire intended to be routed through net-2, since there's just
> the one possibility. It *is* present *logically*, though, in the model
> I've sketched above: it's just that the net-2 concerned -- box N, the
> load-balancer -- has a trivially simple namespace/queryspace in which
> there's just the one possible name. (In the NASDAQ case, the net-2 name
> is the ticker symbol, and does have to be explicitly mentioned on the wire.)

[...]

> Research topic: what does a programming language look like
> where object names are disjoint from object addresses?)

Corba?

mkb.

Tony Garnock-Jones

Jun 30, 2011, 8:22:30 AM6/30/11

to sp-discu...@googlegroups.com

On 29 June 2011 13:32, Michael Bridgen <mi...@rabbitmq.com> wrote:

> I don't know if this is what you're getting at Tony, but it seems there's two things going on: resolution (hostname to IP address, key to queues), and enrolment (connection, subscription, query).

Yep, that's it. Resolution is happening within net-2. Enrolment is happening between some net-3 entity and net-2. The third part is the publication, which happens between a net-1 entity and net-2, and ends up triggering resolution.

> Interestingly, AMQP has two internal networks: there's exchange resolution (exchange to directory of bindings), and routing key resolution (key to queues).

Exactly. Or rather, I see it just slightly differently: each exchange itself is like a mini network, embedded within the broker. The same model applies.

> The wrinkle is very interesting. If you break down a subscription into a query *and* a destination, it makes more sense. In most cases the destination is implicitly "back down this pipe"; but, in the case of an overlay network say, there's a reverse path being built (either explicitly in the subscription, or implicitly in local state at each hop).

Totally. It's also interesting to think about TCP's 3-way handshake here, especially in the context of "asking for permission" or the strange agency-inversion when a publisher actively has something to send to a receiver that is unaware of the publisher's existence.

> Perhaps this is inappropriate generalisation, but it opens the way for other arrangements: "please enroll with this pattern, and forward messages over there"; "once you're done with this message, send it on to ..." etc.

Hey that's neat. Echoes some of the rabbithub stuff, with third-party control of subscription. (I'm afraid this is the best link I have for that subtopic.) In terms of acking, I'd been thinking about third-party acks: where a published message has a field in it that means "if you wish to take responsibility for this message (i.e. acknowledge receipt of it), do so by issuing an acknowledgement message to this network name". That way stateless exchanges can forward messages willy-nilly and the buck only stops where an agent is actively willing to take responsibility for a message.
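
A rough sketch of the third-party ack idea, assuming a toy name-addressed network. The `Network` class, the `ack_to` field name and the network names are hypothetical, chosen only to make the flow visible.

```python
class Network:
    """Toy name-addressed network: names map to lists of handlers."""
    def __init__(self):
        self.handlers = {}

    def subscribe(self, name, handler):
        self.handlers.setdefault(name, []).append(handler)

    def publish(self, name, message):
        for handler in self.handlers.get(name, []):
            handler(message)

net = Network()
acks = []
net.subscribe("acks.publisher-1", acks.append)

def stateless_exchange(message):
    # Forwards without taking responsibility: no ack, no stored state.
    net.publish("workers", message)

def worker(message):
    # This agent accepts responsibility, so it acks to the named address.
    net.publish(message["ack_to"], {"ack": message["id"]})

net.subscribe("ingress", stateless_exchange)
net.subscribe("workers", worker)
net.publish("ingress", {"id": 42, "ack_to": "acks.publisher-1", "body": "resize image"})
print(acks)  # [{'ack': 42}]
```

The exchange in the middle never learns whether the message was acknowledged; only the publisher's ack name and the accepting agent are involved.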

[...]

:-)

>> Research topic: what does a programming language look like
>> where object names are disjoint from object addresses?)
>
> Corba?

Owwww. That was a scary idea for a moment. But no, I don't think so: because the mapping between an object reference (name) and its network location (address) still maps a name to just a single address, and the relevant equivalence is over addresses. I meant more like what if an object reference were something that objects subscribed to? A pub-sub language, if you like. It'd enable things like Erlang's trace facility, like Smalltalk's (and other model/view systems) observer-observable, like aspect-oriented-programming, ...
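
One way to picture an object reference that objects subscribe to is the sketch below. It is only an illustration in ordinary Python; `Ref`, `Account` and `Tracer` are invented names, and a real language built on this idea would look quite different.

```python
class Ref:
    """A name that objects subscribe to, instead of a pointer to one object."""
    def __init__(self):
        self.subscribers = []

    def subscribe(self, obj):
        self.subscribers.append(obj)

    def send(self, selector, *args):
        # "Calling a method" becomes publishing to everyone enrolled on the name.
        return [getattr(obj, selector)(*args) for obj in self.subscribers]

class Account:
    def __init__(self):
        self.balance = 0

    def deposit(self, amount):
        self.balance += amount
        return self.balance

class Tracer:
    def __init__(self):
        self.log = []

    def deposit(self, amount):
        self.log.append(("deposit", amount))

account_ref = Ref()
account, tracer = Account(), Tracer()
account_ref.subscribe(account)
account_ref.subscribe(tracer)   # tracing added without touching Account
account_ref.send("deposit", 10)
print(account.balance, tracer.log)  # 10 [('deposit', 10)]
```

Here one name resolves to two addresses, so tracing and observer behaviour fall out of enrolment rather than being special facilities.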

Tony

Alexis Richardson

Jul 1, 2011, 5:45:00 AM7/1/11

to sp-discu...@googlegroups.com

On Thu, Jun 30, 2011 at 1:22 PM, Tony Garnock-Jones
<tonygarn...@gmail.com> wrote:
> On 29 June 2011 13:32, Michael Bridgen <mi...@rabbitmq.com> wrote:
>
>
> Hey that's neat. Echoes some of the rabbithub stuff, with third-party
> control of subscription. (I'm afraid this is the best link I have for that
> subtopic.) In terms of acking, I'd been thinking about third-party acks:
> where a published message has a field in it that means "if you wish to take
> responsibility for this message (i.e. acknowledge receipt of it), do so by
> issuing an acknowledgement message to this network name". That way stateless
> exchanges can forward messages willy-nilly and the buck only stops where an
> agent is actively willing to take responsibility for a message.

+1 for this kind of thing. But note that this means we are not
describing just pubsub, but something a bit more than that - with just
enough information flowing back upstream to enable group
conversations.

>>> Research topic: what does a programming language look like
>>> where object names are disjoint from object addresses?)
>>
>> Corba?
>
> Owwww. That was a scary idea for a moment. But no, I don't think so: because
> the mapping between an object reference (name) and its network location
> (address) still maps a name to just a single address, and the relevant
> equivalence is over addresses. I meant more like what if an object reference
> were something that objects subscribed to? A pub-sub language, if you like.
> It'd enable things like Erlang's trace facility, like Smalltalk's (and other
> model/view systems) observer-observable, like aspect-oriented-programming,

Something like this exists with caching systems and to a lesser extent
tuplespaces.

In a distributed object cache that is accessed by navigating a
distributed shared heap (analogous to Ln caching on SMP), you can
think of 'clients' as holding local (weak) references to objects that
live on the client (as 'smart stubs') and on a remote server which
masters all shared state. When changes occur on the shared state on
the server they are propagated to anyone holding any reference. For
clients referring to shared state across the network, this reference
is an implicit subscription. Examples of this begin with the Isis
system and feed all the way through to work on eventual consistency
more recently. Putting the event back into eventual ;-)

Examples of such caching systems include Terracotta and Gemfire. Then
you have things like memcache and Coherence whereby access to objects
is organised around a shared map. In such systems the communication
to the client may be restricted to notifications that a shared object
has changed and must be refetched. This seems closer to a value
passing system.
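
The notify-and-refetch style can be sketched as follows. This is a toy model under stated assumptions, not the API of any of the products named above: `SharedMap` and `Client` are hypothetical, and holding a reference to a key is treated as an implicit subscription to it.

```python
class SharedMap:
    """Server-side map; holding a key reference acts as a subscription."""
    def __init__(self):
        self.data = {}
        self.holders = {}   # key -> clients holding a reference

    def get(self, client, key):
        self.holders.setdefault(key, set()).add(client)  # implicit subscribe
        return self.data.get(key)

    def put(self, key, value):
        self.data[key] = value
        for client in self.holders.get(key, ()):
            client.invalidate(key)   # notify only; the value is not pushed

class Client:
    def __init__(self, server):
        self.server = server
        self.cache = {}

    def read(self, key):
        if key not in self.cache:
            self.cache[key] = self.server.get(self, key)   # (re)fetch
        return self.cache[key]

    def invalidate(self, key):
        self.cache.pop(key, None)

server = SharedMap()
server.put("price", 100)
client = Client(server)
first = client.read("price")
server.put("price", 101)      # triggers invalidate on the client
second = client.read("price")
print(first, second)  # 100 101
```

The subscription here is to a name in the map (a value), not to a remote location, which is the distinction drawn in the next paragraph.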

If we want pubsub to not be complicated I suggest that we only think
in terms of subscriptions to values (eg names) rather than remote
locations.

alexis

Martin Sustrik

Jul 8, 2011, 8:20:49 AM7/8/11

to sp-discu...@googlegroups.com, Tony Garnock-Jones

Hi Tony,

It took a lot of thinking to answer this one :)

First, I believe we are using different terminology and referring to
different messaging models (broker-based vs. broker-less). I wrote
"0MQ: The theoretical foundation" partly as a reply to this email, to
define and explain the concepts in broker-less messaging.

Unfortunately, I haven't read John Day's book so some of my comments
below may be misguided...

> I think there's something I've not stated: I have been viewing the
> system as made of up to *three*[1] possibly disjoint networks:
>
> 1. The network a publisher uses to address and transmit a message to
>    the service topology for that topology to route to its subscribers.
> 2. The internal network formed by the routing topology itself.
> 3. The transport network used at the egress points of the routing
>    topology to deliver messages to subscribers.

What's the rationale for separating the three? Even with broker-based
messaging 1 & 3 are basically the same thing (AMQP for example).

As for 2, this is special as it's being implemented in-process in
broker-based solutions, however, when you stretch it across the network,
as 0MQ does, you can use the same protocol even there.

> A net-3 name is a net-2 address.

This, AFAIU, implies that these are layered one on top of the other, right?

However, that would imply that net-3 always needs net-1 as an underlying
layer, but not vice versa. In other words, a consumer must be able to
publish a message, but a publisher does not have to be able to consume a
message. That sounds a bit strange and asymmetric.

> Now here's a point I was confused about earlier: do you see a need for
> publishers-of-quotes (sending them into NASDAQ for distribution) to have
> any kind of relationship with net-2? I don't, but I *do* think it needs
> to know the net-1 name for the quote routing service.

This has to do with the concept of topology. I've tried to explain it in
the whitepaper mentioned above. The idea is that your net-1, net-2 and
net-3 form a single unit called "topology", basically a graph, and it's
the topology that's addressable, not individual pieces of it.

> You can rephrase it that way, but then the essence of "specifying
> which service/topology you want to be part of" is lost.
>
>
> It's this idea that's still slippery for me. Could you try phrasing
> things in terms of net-1 and net-2 membership?

The same as above. All three nets can be considered to form a single
object, a "topology". However, when sending a message you still have to
specify which topology (NASDAQ? image transformation?) it is meant to
be sent to.

Martin

Martin Sustrik

Jul 8, 2011, 8:48:35 AM7/8/11

to sp-discu...@googlegroups.com, Michael Bridgen

On 06/29/2011 07:32 PM, Michael Bridgen wrote:

> Interestingly, AMQP has two internal networks: there's exchange
> resolution (exchange to directory of bindings), and routing key
> resolution (key to queues).

You forgot about resolving the hostname when the client connects to the broker.

Martin

Martin Sustrik

Jul 8, 2011, 8:58:13 AM7/8/11

to sp-discu...@googlegroups.com, Alexis Richardson

On 07/01/2011 11:45 AM, Alexis Richardson wrote:

> If we want pubsub to not be complicated I suggest that we only think
> in terms of subscriptions to values (eg names) rather than remote
> locations.

+1

Martin

Martin Sustrik

Jul 8, 2011, 9:13:10 AM7/8/11

to sp-discu...@googlegroups.com, Alexis Richardson

Actually, let me propose an even simpler view of subscriptions:

Imagine that pub/sub is a dumb broadcast, ie. it delivers each message
to each consumer within the topology.

Now, some subscribers don't need some messages for their business logic,
so they just ignore them.

Later on it turns out that there's a bandwidth problem as the messages
are delivered even to the clients that will drop them straight away.

So an *optimisation* is implemented, where a client can ask its upstream
node to filter messages on its behalf.

This is what happened with IP multicast: IGMP was added later on to make
it more efficient. However, IGMP is not considered an intrinsic part of
IP multicast, rather an optional optimisation.
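
The point that upstream filtering changes bandwidth but not semantics can be shown with a small sketch. The feed contents and the `deliver` helper are invented for illustration.

```python
def deliver(messages, wanted, upstream_filter):
    """Return (messages the application sees, number of messages on the link)."""
    if upstream_filter:
        on_wire = [m for m in messages if wanted(m)]
    else:
        on_wire = list(messages)                 # dumb broadcast
    seen = [m for m in on_wire if wanted(m)]     # subscriber still filters locally
    return seen, len(on_wire)

feed = ["NASDAQ.CSCO", "NYSE.IBM", "NASDAQ.MSFT", "LSE.VOD"]
wanted = lambda m: m.startswith("NASDAQ.")

seen_plain, wire_plain = deliver(feed, wanted, upstream_filter=False)
seen_opt, wire_opt = deliver(feed, wanted, upstream_filter=True)
print(seen_plain == seen_opt, wire_plain, wire_opt)  # True 4 2
```

The application sees the same messages either way; only the count on the wire differs.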

Martin

Gary Berger

Jul 8, 2011, 9:17:36 AM7/8/11

to sp-discu...@googlegroups.com, Alexis Richardson

Agreed. The upstream should filter to avoid costly interrupt processing.
I am wondering if this is an interesting use case for linking OpenFlow with
messaging, i.e. you could send a Flow_Modification message to the upstream
host (or a centralized controller) to drop based on a tuple-set.

-g

Michael Bridgen

Jul 8, 2011, 9:28:58 AM7/8/11

to sp-discu...@googlegroups.com

Alexis's statement is consistent with both "subscriptions are to
topologies" and "subscriptions are to abstract names, distinct from
topologies".

Which did you mean, Alexis?

mkb

Alexis Richardson

Jul 8, 2011, 9:57:47 AM7/8/11

to sp-discu...@googlegroups.com

On Fri, Jul 8, 2011 at 2:28 PM, Michael Bridgen <mi...@rabbitmq.com> wrote:
>>
>>> If we want pubsub to not be complicated I suggest that we only think
>>> in terms of subscriptions to values (eg names) rather than remote
>>> locations.
>>
>> +1
>
> Alexis's statement is consistent with both "subscriptions are to topologies"
> and "subscriptions are to abstract names, distinct from topologies".
>
> Which did you mean, Alexis?

I had the latter in mind.

Martin Sustrik

Jul 8, 2011, 10:41:23 AM7/8/11

to sp-discu...@googlegroups.com, Alexis Richardson

And my +1 was to the latter as well.

M.

Martin Sustrik

Jul 9, 2011, 1:18:42 PM7/9/11

to sp-discu...@googlegroups.com, Gary Berger, Alexis Richardson

Hi Gary,

> Agreed, The upstream should filter to avoid costly interrupt processing..
> I am wondering if this is an interesting use-case to link OpenFlow with
> messaging. I.e. You could send Flow_Modification message to upstream host
> (or centralized controller) to drop based on a tuple-set.

Exactly. Treating subscriptions as end-to-end functionality, with
implementation in middle nodes seen only as a performance optimisation,
allows the filtering to be shifted not just to the SP hop-by-hop layer (in
other words, to the broker), but also further down the stack, to L4, L3 or
even L2.

On a related topic, I've never written down the rationale for 0MQ's
subscription model :( but one thing that springs to mind here is that
filtering based on prefixes (as opposed to regular expressions, SQL
statements, key-value pairs et al.) was chosen because of its
HW-friendliness, with the long-term perspective of performing it inside
the network infrastructure (routers, switches) at wire speed.

Specifically, prefix-based subscriptions have a nice property that even
if they are truncated, they can still (imperfectly) filter the message feed.

For example: if a subscription for NASDAQ.FUTURES.CSCO.MAY15 is received
by an intermediary that's able to match at most 64 bits, it can still
match on the first 8 bytes, NASDAQ.F, filtering out all non-NASDAQ quotes
as well as most financial instruments (those not beginning with 'F').
This alone should cut down the bandwidth usage by an
order of magnitude. The network stack at the consumer -- or at the next
intermediary with perfect matching -- can then filter out all the
remaining unneeded messages (such as NASDAQ.FUTURES.CSCO.MAY16).
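
The truncation property can be checked with a few lines. This is a sketch only: `make_filter` and the sample feed are invented, and a real device would compare bits in hardware rather than call `startswith`.

```python
def make_filter(subscription: bytes, max_bytes=None):
    """Prefix filter; a device may only compare the first max_bytes bytes."""
    prefix = subscription if max_bytes is None else subscription[:max_bytes]
    return lambda topic: topic.startswith(prefix)

sub = b"NASDAQ.FUTURES.CSCO.MAY15"
coarse = make_filter(sub, max_bytes=8)   # 64 bits: compares only b"NASDAQ.F"
exact = make_filter(sub)                 # full match at the consumer

feed = [b"NASDAQ.FUTURES.CSCO.MAY15", b"NASDAQ.FUTURES.CSCO.MAY16",
        b"NASDAQ.EQUITY.CSCO", b"NYSE.FUTURES.IBM.MAY15"]
after_switch = [t for t in feed if coarse(t)]
after_consumer = [t for t in after_switch if exact(t)]
print(len(feed), len(after_switch), len(after_consumer))  # 4 2 1
```

A truncated prefix can let extra messages through but never drops one the full subscription would accept, which is what makes it safe as an optimisation. The same is not generally true of a truncated regular expression or SQL predicate.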

Obviously, if topic elements are represented by IDs rather than by
human-readable strings, the filtering in the network devices will become
drastically more efficient.

As for OpenFlow, I have no experience with it. However, if there are
OpenFlow switches that allow for offset+value+mask packet matching we
can do the HW-based filtering even today. Wow!
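
For reference, offset+value+mask matching amounts to the check below. The 4-byte header and the message layout are assumptions made up for this sketch; no real packet format or OpenFlow API is implied.

```python
def matches(packet: bytes, offset: int, value: bytes, mask: bytes) -> bool:
    """True if the masked bytes at `offset` equal the masked value."""
    window = packet[offset:offset + len(value)]
    if len(window) < len(value):
        return False
    return all((p & m) == (v & m) for p, v, m in zip(window, value, mask))

HEADER = b"\x00" * 4                      # hypothetical 4-byte transport header
packet = HEADER + b"NASDAQ.FUTURES.CSCO.MAY15|15.20"
rule = dict(offset=4, value=b"NASDAQ.F", mask=b"\xff" * 8)
print(matches(packet, **rule), matches(HEADER + b"NYSE.IBM|120.1", **rule))  # True False
```

A prefix subscription maps directly onto such a rule: the offset is where the topic starts in the packet, and the mask is all ones over the bytes the device can compare.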

Obviously, we would have to define a packet-based transport for the
messages, which is not that big a problem.

If there's no such capability in existing switches, OpenFlow can still
be used to implement a proof-of-concept by redirecting all packets
to the controller and doing the matching there. Of course, the performance
penalty would probably be greater than the benefit gained.

Martin

Kohei Honda

Jul 19, 2011, 1:48:30 PM7/19/11

to sp-discu...@googlegroups.com, Tony Garnock-Jones

Each message in this thread is a fertile source of discussion. Here I record my response to a message by Martin, which Tony has already responded to, but I feel like adding some additional points.

First, adding one concept which can be useful from now on:

"Topology establishment" means creating a graph of nodes connected by links.

"Routing" means moving messages through this graph.

We may say the former can also be understood in terms of "capability" --- well, perhaps after joining you can get different capabilities, but never mind: it is about obtaining a capability to use the topology.

By the way, I am using "capability" in the standard sense: for example, Kerberos gives a process a ticket to use a printer for some period, and that ticket is a capability. Simply put, it is a capability to do something that matters.

In some cases, you only know the address, and that is capability. In some cases, not.

But interesting discussions by Martin come after this.

How does that affect the concept of addressing?

A. You definitely need an "address" to join the topology. That can be either address(es) of adjacent peers or an abstract name to resolved by some name resolution service.

If we take the "name" option, we use a name, to reach a topology, and you get a capability, or capabilities.

What is this initial name?

And I believe the capability you got this way is a handle, an operation, which you can operate on that topology. Perhaps this is named differently from the initial name.

Never mind, it is also a name, and you obtain a capability to operate on that name, or on the entity, a topology.

Martin, do you find that this interpretation captures A?

B. When sending a message, you don't need an address[1]. You send the message to a particular topology and the topology does the routing for you. If it's a "broadcast" topology [2], it will deliver message to all nodes in the topology, if it's a "load-balancing" topology, it will deliver the message to one of the nodes etc.

In this B, you have an operation on that name. Let it be "abc". You operate on nw-handle, and your operation goes like this:

I wish to send this message through "abc"

I.e. you are sending your message through a channel --- and you can but you do not have to specify a destination (a specific outlet) in that channel.

In summary: You need an address to join a topology. You need no address to send a message.

Comments:

[1] You definitely need some way to reference the topology you are sending the message to, but that doesn't need to be an address. For example, in 0MQ it's a simple file descriptor created when you join the topology.

[2] Note that with PUB/SUB and subscriptions the message is routed depending on its content. However, that doesn't mean that message contains an address. Rather, message contains business data and topology is able to do smart routing decisions based on the those business data.

Martin

Kohei Honda

Jul 19, 2011, 1:53:15 PM7/19/11

to sp-discu...@googlegroups.com, Tony Garnock-Jones

While I was writing about B the message was sent off by some keyboard shortcut, my apologies. I only add to part B. The following is by Martin.

B. When sending a message, you don't need an address[1]. You send the message to a particular topology and the topology does the routing for you. If it's a "broadcast" topology [2], it will deliver message to all nodes in the topology, if it's a "load-balancing" topology, it will deliver the message to one of the nodes etc.

I wrote:

In this B, you have an operation on that name. Let it be "abc". You operate on nw-handle, and your operation goes like this:

I wish to send this message through "abc"

I.e. you are sending your message through a channel --- and you can but you do not have to specify a destination (a specific outlet) in that channel.

because: you have already put your message into a channel, that channel will do what is necessary. Of course it is not so bad to have a prefix-based destination, as Martin later noted, but in principle, we can do many clever things. The following comment explains this vividly:

[2] Note that with PUB/SUB and subscriptions the message is routed depending on its content. However, that doesn't mean that message contains an address. Rather, message contains business data and topology is able to do smart routing decisions based on the those business data.

So, I gather, a channel is getting smarter and smarter these days.

kohei

Martin Sustrik

Jul 22, 2011, 8:08:49 AM7/22/11

to sp-discu...@googlegroups.com, Kohei Honda, Tony Garnock-Jones

Hi Kohei,

> First, adding one concept which can be useful from now on:
>
> "Topology establishment" means creating a graph of nodes connected
> by links.

...

> We may say the former can also understood in terms of "capability" ---
> well, perhaps after joining, you can get different capabilities, never
> mind it is about obtaining a capability to use the topology.

Yes. You can call it a capability. Or a service, in the same way as TCP
port 80 translates to HTTP service.

One thing I like about the "topology" term is that it kind of implies that
it's a graph, while terms such as "capability" or "service" can be
easily used for p2p connections, simple star topologies, or even for
single-node configurations.

Another way to put it would be to say that a "topology" provides a
"capability/service", thus having a separate term for the messaging
infrastructure (topology) and the business logic layered on top of it (service).

> A. You definitely need an "address" to join the topology. That can
> be either address(es) of adjacent peers or an abstract name to
> resolved by some name resolution service.
>
> If we take the "name" option, we use a name, to reach a topology, and
> you get a capability, or capabilities.
>
> What is this initial name?

No idea :) Presumably some kind of URI.

> And I believe the capability you got this way is a handle, an operation,
> which you can operate on that topology. Perhaps this is be named
> differently from the initial name.

What we have in the experimental Linux kernel implementation uses file
descriptors as handles:

/* Open a handle to the topology */
int fd = socket (AF_SP, SP_PUB, 0);
connect (fd, "tcp://192.168.0.111:5555");

/* Use the topology via the handle */
send (fd, "ABC", 3, 0);
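
The same shape can be mimicked in plain Python to show the two routing patterns side by side. This is a toy model, not the kernel API above: the `Topology` class, the registry and the `tcp://example:...` addresses are all hypothetical.

```python
import itertools

class Topology:
    def __init__(self, pattern):
        self.pattern = pattern            # "broadcast" or "load-balance"
        self.inboxes = []
        self._next = itertools.count()

    def join(self):
        inbox = []
        self.inboxes.append(inbox)
        return inbox

    def send(self, message):
        # No destination argument: the topology decides where it goes.
        if self.pattern == "broadcast":
            for inbox in self.inboxes:
                inbox.append(message)
        else:
            self.inboxes[next(self._next) % len(self.inboxes)].append(message)

registry = {"tcp://example:5555": Topology("broadcast"),
            "tcp://example:5556": Topology("load-balance")}

def connect(address):
    """The address is needed once, to obtain a handle to the topology."""
    return registry[address]

quotes, workers = connect("tcp://example:5555"), connect("tcp://example:5556")
q1, q2 = quotes.join(), quotes.join()
w1, w2 = workers.join(), workers.join()
for msg in ("a", "b"):
    quotes.send(msg)
    workers.send(msg)
print(q1, q2, w1, w2)  # ['a', 'b'] ['a', 'b'] ['a'] ['b']
```

An address appears only in `connect`; every `send` names a handle and a payload, nothing else.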

> Never mind, it is also a name, and you obtain a capability to operate on
> that name, or on the entity, a topology.
>
> Martin do you find this interpretation capturing A?

Yes. It's exactly what I had in mind.

> B. When sending a message, you don't need an address[1]. You send
> the message to a particular topology and the topology does the
> routing for you. If it's a "broadcast" topology [2], it will deliver
> message to all nodes in the topology, if it's a "load-balancing"
> topology, it will deliver the message to one of the nodes etc.
>
>
> In this B, you have an operation on that name. Let's it be "abc". You
> operate on nw-handle, and your operation goes like this:
>
> I wish to send this message through "abc"
>
> I.e. you are sending your message through a channel --- and you can but
> you do not have to specify a destination (a specific outlet) in that
> channel.

Yes.

Still, I prefer the term "topology", as the term "channel" is often
used for communication channels between exactly two endpoints (e.g. AMQP)
and may cause confusion.

Martin

Gary Berger

Jul 22, 2011, 9:23:19 AM7/22/11

to sp-discu...@googlegroups.com, Kohei Honda, Tony Garnock-Jones

FWIW I also envision many topologies (abstract graphs).

Today topologies can be defined by physical interconnection (internet),
broadcast domains (VLANs), networks (prefix), trees (multicast), or an
overlay (e.g. BitTorrent, Skype).

There is certainly a need for better segmentation, both below the network
layer (i.e. segmentation to avoid leaking packets across autonomous
boundaries) and above the network layer, to provide capabilities such as
isolating data into certain geographies or sovereign boundaries, but also
into different verticals or customer segments.

The network could possibly be arranged so that a higher level topology
definition is all you need to provision proper segmentation and isolation
if there was such a hint.

One example of an abstraction I like is the Tuple-Space concept: "Object
Space can be thought of as a virtual repository, shared amongst providers and
accessors of network services, which are themselves abstracted as objects.
Processes communicate among each other using these shared objects -- by
updating the state of the objects as and when needed."[1]

EX.
connect (topo,"topo://marketdataservice/equity/options");

send (topo, "GLD", 3, 0);

The question is how do you "join" the topology, who manages the graph,
where does the graph state reside?

Certainly there are examples of this in other P2P based systems but they
are highly complex and proprietary.

-g

[1] http://en.wikipedia.org/wiki/Tuple_space

Martin Sustrik

Jul 22, 2011, 10:17:29 AM7/22/11

to sp-discu...@googlegroups.com, Gary Berger, Kohei Honda, Tony Garnock-Jones

Hi Gary,

> FWIW I also envision many topologies (abstract graphs).
>
> Today topologies can be defined by physical interconnection (internet),
> broadcast domains (VLANS), Networks (Prefix), trees (multicast), or an
> overlay (I.e. BitTorrent, Skype).
>
> There is certainly a better need for segmentation both below the network
> level (I.e segmentation to avoid leaking packets across autonomous
> boundaries) and above the network layer to provide capabilities such as
> isolating data into certain geographies, sovereign boundaries but also
> into different verticals or customer segments.
>
> The network could possibly be arranged so that a higher level topology
> definition is all you need to provision proper segmentation and isolation
> if there was such a hint.

That's an interesting line of thought. How it would work, I guess, is
that SP layer could somehow communicate the topology boundaries to the
underlying layers.

To some extent it's possible using say different TCP port numbers for
different topologies. That way the TCP/IP layer knows about the
boundaries and options like ToS bits can be set based on the topology in
question (= port number).
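
As a rough sketch of that idea, a per-topology table can drive both the port and the ToS marking of a socket. The topology names, ports and DSCP classes are invented for illustration; the only real API used is the standard `IP_TOS` socket option, which is available on Linux and most Unix-like systems but not everywhere.

```python
import socket

# Hypothetical assignment of topologies to TCP ports and DSCP classes.
TOPOLOGIES = {
    "market-data": {"port": 5555, "dscp": 46},        # expedited forwarding
    "image-processing": {"port": 5556, "dscp": 10},   # AF11
}

def tos_byte(dscp: int) -> int:
    """DSCP occupies the upper six bits of the former ToS byte."""
    return dscp << 2

def open_socket(topology: str) -> socket.socket:
    """Create an unconnected TCP socket marked for the given topology."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS,
                    tos_byte(TOPOLOGIES[topology]["dscp"]))
    return sock

print(tos_byte(TOPOLOGIES["market-data"]["dscp"]))  # 184
```

This marks traffic so the network can prioritise it; as noted next, it does nothing to isolate one topology from another.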

However, this does not provide strict isolation. So the question is:
Could we go even further and pass the topology boundaries to the
low-level services such as VLAN or MPLS?

I have no expertise in the area, so apologies if I'm being stupid here.

> One example of an abstract I like is the Tupe-Space concept "Object Space
> can be thought of as a virtual repository, shared amongst providers and
> accessors of network services, which are themselves abstracted as objects.
> Processes communicate among each other using these shared objects -- by
> updating the state of the objects as and when needed."[1]
>
> EX.
> connect (topo,"topo://marketdataservice/equity/options");
>
> send (topo, "GLD", 3, 0);
>
>
>
> The question is how do you "join" the topology, who manages the graph,
> where does the graph state reside?
>
> Certainly there are examples of this in other P2P based systems but they
> are highly complex and proprietary.

Once again, I know little about tuple-spaces. However, I believe the
concept of clearly separated "messaging patterns" makes it possible to simply
allocate a new messaging pattern (in addition to pub/sub, request/reply
et al.) and hand it to tuple-space experts to define its semantics.

Martin

Gary Berger

Jul 22, 2011, 10:33:32 AM7/22/11

to Martin Sustrik, sp-discu...@googlegroups.com, Kohei Honda, Tony Garnock-Jones

On 7/22/11 10:17 AM, "Martin Sustrik" <sus...@250bpm.com> wrote:

>Hi gary,
>
>> FWIW I also envision many topologies (abstract graphs).
>>
>> Today topologies can be defined by physical interconnection (internet),
>> broadcast domains (VLANS), Networks (Prefix), trees (multicast), or an
>> overlay (I.e. BitTorrent, Skype).
>>
>> There is certainly a better need for segmentation both below the network
>> level (I.e segmentation to avoid leaking packets across autonomous
>> boundaries) and above the network layer to provide capabilities such as
>> isolating data into certain geographies, sovereign boundaries but also
>> into different verticals or customer segments.
>>
>> The network could possibly be arranged so that a higher level topology
>> definition is all you need to provision proper segmentation and
>>isolation
>> if there was such a hint.
>
>That's an interesting line of thought. How it would work, I guess, is
>that SP layer could somehow communicate the topology boundaries to the
>underlying layers.

@gaberger: Possibly it's known a priori; the network topology is organized
based on a contract.

>
>To some extent it's possible using say different TCP port numbers for
>different topologies. That way the TCP/IP layer knows about the
>boundaries and options like ToS bits can be set based on the topology in
>question (= port number).

@gaberger: Possibly TCP port, possibly TOS or maybe an encapsulation
approach like MPLS TAG.

>
>However, this does not provide strict isolation. So the question is:
>Could we go even further and pass the topology boundaries to the
>low-level services such as VLAN or MPLS?
>
>I have no expertise in the area, so apologies if I'm being stupid here.

@gaberger: Exactly. It should be based on a contract which defines the
level of isolation needed. It would be great if you could isolate even
below the VLAN level, along the lines of what we call private VLANs today,
but somewhat dynamic in nature. This would certainly require programmability
in the network, but that is something the ONF is working hard to solve, so
it would be best to think about how these mechanics would work and what
APIs we would want for instructing the network to organize itself based on
application requirements.

>
>> One example of an abstract I like is the Tupe-Space concept "Object
>>Space
>> can be thought of as a virtual repository, shared amongst providers and
>> accessors of network services, which are themselves abstracted as
>>objects.
>> Processes communicate among each other using these shared objects -- by
>> updating the state of the objects as and when needed."[1]
>>
>> EX.
>> connect (topo,"topo://marketdataservice/equity/options");
>>
>> send (topo, "GLD", 3, 0);
>>
>>
>>
>> The question is how do you "join" the topology, who manages the graph,
>> where does the graph state reside?
>>
>> Certainly there are examples of this in other P2P based systems but they
>> are highly complex and proprietary.
>
>Once again, I know little about tuple-spaces, however, I believe the
>concept of clearly separated "messaging patterns" allows to simply
>allocate a new messaging pattern (in addition to pub/sub, request/reply
>et al.) and hand it to tuple-space experts to define its semantics.

@gaberger: Yes I wasn't necessarily promoting tuple-spaces but the ideas
are similar to organizing an abstract topology shared amongst providers
and accessors of network services.
>
>Martin

Martin Sustrik

Jul 22, 2011, 11:48:12 AM7/22/11

to Gary Berger, sp-discu...@googlegroups.com, Kohei Honda, Tony Garnock-Jones

On 07/22/2011 04:33 PM, Gary Berger wrote:

>> However, this does not provide strict isolation. So the question is:
>> Could we go even further and pass the topology boundaries to the
>> low-level services such as VLAN or MPLS?
>>
>> I have no expertise in the area, so apologies if I'm being stupid here.
>
> @gaberger: Exactly. It should be based on a contract which defines the
> level of isolation needed. It would be great if you could isolate even at
> the sub-VLAN level, something like what we call private VLANs today, but
> somewhat dynamic in nature. This would certainly require programmability
> in the network, but that is something the ONF is looking hard at solving,
> so it would be best to think about how these mechanics would work and what
> APIs we would want in order to instruct the network to organize itself
> based on application requirements.

Could you give a concrete example of how it could work?

I must admit I have little understanding of how things work on this
level. I guess the earlier we get this to IETF the better, there's much
more expertise there...

Martin

Gary Berger

Jul 22, 2011, 12:35:29 PM7/22/11

to Martin Sustrik, sp-discu...@googlegroups.com, Kohei Honda, Tony Garnock-Jones

On 7/22/11 11:48 AM, "Martin Sustrik" <sus...@250bpm.com> wrote:

>On 07/22/2011 04:33 PM, Gary Berger wrote:
>
>>> However, this does not provide strict isolation. So the question is:
>>> Could we go even further and pass the topology boundaries to the
>>> low-level services such as VLAN or MPLS?
>>>
>>> I have no expertise in the area, so apologies if I'm being stupid here.
>>
>> @gaberger: Exactly. It should be based on a contract which defines the
>> level of isolation needed. It would be great if you could isolate even at
>> the sub-VLAN level, something like what we call private VLANs today, but
>> somewhat dynamic in nature. This would certainly require programmability
>> in the network, but that is something the ONF is looking hard at solving,
>> so it would be best to think about how these mechanics would work and
>> what APIs we would want in order to instruct the network to organize
>> itself based on application requirements.
>
>Could you give a concrete example of how it could work?

@gaberger: In its simplest form it would be a kind of ACL based on
pattern-match/action rules. A controller could install this in the
access-layer switch, or possibly in a virtual switch inside the host. It's
managed at a higher layer, i.e. as an abstract graph G(V,E) whose members
belong to a set identified by a property.

Now that's a lot of hand-waving, because there are limitations in the
hardware lookup engines (TCAMs), but there are ideas along these lines
that provide network services such as decentralized firewalls and load
balancers. Please unicast me and I can point you towards some of the
research here.
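
To make the pattern-match/action idea a bit more concrete, here is a rough
sketch in Python. It is only an illustration under my own assumptions: the
names (Rule, compile_rules, decide), the "allow"/"drop" actions and the
default-drop policy are all made up for the example and do not correspond
to any real switch or controller API. The controller holds the abstract
graph G(V,E), picks the vertices that carry a given property, and compiles
the edges between them into match/action rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """One hypothetical match/action entry: match on (src, dst), apply action."""
    src: str
    dst: str
    action: str  # "allow" or "drop" (illustrative actions only)


def compile_rules(vertices, edges, prop):
    """Compile the abstract graph into rules for one isolation group.

    vertices: dict mapping host name -> set of properties
    edges:    iterable of (a, b) pairs, treated as undirected
    prop:     the property that identifies members of the group
    """
    members = {v for v, props in vertices.items() if prop in props}
    rules = []
    for a, b in edges:
        # Only edges with both endpoints in the group become "allow" rules.
        if a in members and b in members:
            rules.append(Rule(a, b, "allow"))
            rules.append(Rule(b, a, "allow"))
    return rules


def decide(rules, src, dst, default="drop"):
    """First matching rule wins; anything unmatched gets the default action."""
    for rule in rules:
        if rule.src == src and rule.dst == dst:
            return rule.action
    return default


# Hypothetical topology: two market-data hosts and one unrelated web host.
vertices = {
    "pub1": {"marketdata"},
    "sub1": {"marketdata"},
    "web1": {"web"},
}
edges = [("pub1", "sub1"), ("sub1", "web1")]
rules = compile_rules(vertices, edges, "marketdata")
```

With this topology, decide(rules, "pub1", "sub1") gives "allow", while
decide(rules, "sub1", "web1") falls through to "drop" because web1 does not
carry the property. A real switch would hold such rules in a TCAM, which is
where the table-size limits mentioned above come in.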
