Simplified layering

Martin Sustrik

Aug 28, 2011, 5:31:01 AM
to sp-discu...@googlegroups.com
Hi all,

Paul Colomiets has pointed out that the messaging patterns are the
primary concept in SP and that discussing labeling (which tries to
factor out some purely syntactic structures common to multiple patterns)
before discussing the patterns makes little sense.

Thus I would propose using a simplified model of the stack. Namely,
the standard OSI model with SP on L5:

L4: Transport Layer (SCTP, WebSockets, etc.)
L5: Session (Pattern) Layer (pub/sub, req/rep etc.)
L6: Presentation Layer (XML, JSON etc.)
L7: Application Layer

For example:

+------------+
|    FpML    |  L7 - Application Layer
+------------+
|    XML     |  L6 - Presentation Layer
+------------+
|  REQ/REP   |  L5 - Session (Pattern) Layer
+------------+
|    SCTP    |  L4 - Transport Layer
+------------+
|     IP     |  L3 - Network Layer
+------------+
|  Ethernet  |  L2 - Datalink Layer
+------------+
| 1000BASE-T |  L1 - Physical Layer
+------------+

A couple of comments:

1. When we have several patterns defined we can normalise out some
common syntactic structures to form a thin labeling layer between L4 and
L5. Or it may turn out that the patterns are so disparate that trying to
find a common denominator is just a waste of time.

2. The pattern layer requires the transport layer to deliver data in discrete
atomic messages. If that's not the case (TCP) we may need to define a
thin auxiliary framing (see SP framing I-D). However, this is
auxiliary scaffolding, not a core part of the pattern layer. It can be
placed into an appendix or similar.

3. In this simplified model where there's no intermediate layer between
the transport and the pattern and where different patterns are clearly
defined as separate protocols, it's much more obvious that different
patterns may have different requirements on the transport (req/rep
requires a bi-directional channel etc.).

4. One issue I can see with this approach is that there will be no
unified "SP protocol", which may be a bit of a problem at the IETF (the IETF
prefers to deal with protocols) as well as a branding/marketing problem.
Maybe it can be solved by simply referring to a set of protocols using a
single name, say "Scalability Protocols".
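To illustrate point 2: a minimal sketch of what such auxiliary framing could look like on top of a byte-stream transport like TCP. The 4-byte big-endian length prefix and the names `frame` and `Deframer` are assumptions made for illustration; this is not the wire format of the SP framing I-D.

```python
import struct
from typing import List

# Assumed wire format: 4-byte big-endian length, then the message body.
HEADER = struct.Struct("!I")


def frame(message: bytes) -> bytes:
    """Prefix one message with its length so it survives a byte stream."""
    return HEADER.pack(len(message)) + message


class Deframer:
    """Reassembles whole messages from arbitrary chunks of a byte stream."""

    def __init__(self) -> None:
        self._buffer = bytearray()

    def feed(self, chunk: bytes) -> List[bytes]:
        """Add received bytes; return every message that is now complete."""
        self._buffer += chunk
        messages = []
        while len(self._buffer) >= HEADER.size:
            (length,) = HEADER.unpack_from(self._buffer)
            end = HEADER.size + length
            if len(self._buffer) < end:
                break  # body not fully received yet
            messages.append(bytes(self._buffer[HEADER.size:end]))
            del self._buffer[:end]
        return messages
```

The pattern layer above would only ever see the complete messages returned by `feed`, regardless of how TCP happened to split the stream.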

Martin

Gary Berger

Aug 30, 2011, 8:48:33 PM
to sp-discu...@googlegroups.com
I am not sure this is helpful, but the OSI guys learned long ago that the
top three layers were really one layer. In modern practice, if you want to
look at this, you can look at PNA, which has been mentioned before. You
have an application which is split: half sits outside of the network domain
and is used for the core application logic, and half sits in the
network domain and is responsible for IPC communications.

If you look at all of the problems that exist in networking (IPC), a lot of
them are centered around IP. It would be best to try and look at the
problem by using the layering abstraction appropriately. In protocols you
typically have a multiplexing protocol and an error and flow control
protocol above it. These layers can be recursive, which means they operate
on different scopes. If the base operations for IPC services are (Read,
Write, Create, Delete, Start, Stop), then what would be the scope necessary
to provide for the different communication patterns, and how would you
represent this to applications?


-g

>--
>Note Well: This discussion group is meant to become an IETF working group
>in the future. Thus, the posts to this discussion should comply with IETF
>contribution policy as explained here:
>http://www.ietf.org/about/note-well.html


Martin Sustrik

Aug 31, 2011, 5:27:57 AM
to sp-discu...@googlegroups.com, Gary Berger
Hi Gary,

> I am not sure this is helpful, but the OSI guys learned long ago that the
> top three layers were really one layer. In modern practice, if you want to
> look at this, you can look at PNA, which has been mentioned before. You
> have an application which is split: half sits outside of the network domain
> and is used for the core application logic, and half sits in the
> network domain and is responsible for IPC communications.

I don't care much about OSI layering as such. The diagram is just meant to
say "There is a single scalability layer which can host multiple
protocols. The layer is above the transport and below the presentation
(i.e. the content is unstructured, not XML or similar)."

We can possibly define it that way without even mentioning the OSI model.

> If you look at all of the problems that exist in networking (IPC), a lot of
> them are centered around IP. It would be best to try and look at the
> problem by using the layering abstraction appropriately. In protocols you
> typically have a multiplexing protocol and an error and flow control
> protocol above it. These layers can be recursive, which means they operate
> on different scopes.

Are you suggesting there should be multiplexing and flow control on SP
level?

> If the base operations for IPC services are (Read,
> Write, Create, Delete, Start, Stop), then what would be the scope necessary
> to provide for the different communication patterns, and how would you
> represent this to applications?

From the client's perspective the operations are exactly the same IMO. You
can think of it as a simple connection-less transport: you send a
message, the network decides where it goes. The only difference is that UDP
uses the standard IP routing algorithm, while SP uses more sophisticated
routing algorithms.
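A rough sketch of that idea from the caller's side: the application only calls `send`, and the layer underneath picks the destination. `PatternSocket`, the `deliver` callback and the round-robin policy are all hypothetical stand-ins chosen for illustration; the thread does not specify any particular routing algorithm.

```python
from itertools import cycle


class PatternSocket:
    """Hypothetical SP-style endpoint: the caller sends, the layer routes."""

    def __init__(self, peers, deliver):
        # Round-robin is only an assumed example of a routing policy.
        self._next_peer = cycle(list(peers))
        # deliver(peer, message) stands in for the underlying transport.
        self._deliver = deliver

    def send(self, message):
        """Send one message; the destination is chosen here, not by the caller."""
        peer = next(self._next_peer)
        self._deliver(peer, message)
        return peer
```

The caller never names a destination address, which is the sense in which this resembles a connection-less transport.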

Martin

Gary Berger

Aug 31, 2011, 9:45:25 AM
to sp-discu...@googlegroups.com
Comments inline..

@gaberger: It seems already implicit, it's just a pattern: "There is a
single scalability layer which can host multiple protocols". So SP
mux/demux protocols. My question would be: is that the
right scope? We've talked about the concept of topology or, to be more
specific, graphs. So SP allows us to communicate with each topology when it
can be identified, else broadcast? As far as error and flow control, if
you assume that an ACK in the transport layer means "I got the data", that
might be wrong. It really is saying "Don't send me that data again". So
if you have to detect errors or even buffering situations, how does SP
respond back to the application?

>
>> If the base operations for IPC services are (Read,
>> Write, Create, Delete, Start, Stop), then what would be the scope
>> necessary to provide for the different communication patterns, and how
>> would you represent this to applications?
>
> From the client's perspective the operations are exactly the same IMO. You
> can think of it as a simple connection-less transport: you send a
> message, the network decides where it goes. The only difference is that UDP
> uses the standard IP routing algorithm, while SP uses more sophisticated
> routing algorithms.

@gaberger: Yes exactly, but "where" it goes is kinda broken today (i.e.
dealing with multiple paths). For instance, let's say you have two
interfaces on a device (one is Wi-Fi 802.11, the other is a 4G LTE
connection). If you send a message down to the network, which path does it
take? What if one of the interfaces is down, do you think the network
layer knows what to do? The answer is it doesn't; that's why guys are
trying to push MPTCP. Yes, SCTP can be rebound on another interface, but
you lose state.

Martin Sustrik

Sep 1, 2011, 2:44:30 AM
to sp-discu...@googlegroups.com, Gary Berger
Hi Gary,

>> Are you suggesting there should be multiplexing and flow control on SP
>> level?
>
> @gaberger: It seems already implicit, it's just a pattern: "There is a
> single scalability layer which can host multiple protocols". So SP
> mux/demux protocols.

Ah, right, that way. Yes.

It's important to state though that the actual act of passing several
independent streams of data should be delegated to the underlying layer
(TCP ports, SCTP streams etc.) rather than emulated on SP level.

Passing multiple streams on top of a single TCP connection -- say the
way AMQP does it -- results in head of line blocking, especially in
situations where pushback is applied (two streams are passed through a
single TCP connection, one stream applies pushback, other stream gets
stuck).
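A toy model of the scenario in parentheses, assuming the naive case where frames from both streams share one strictly ordered queue and there is no per-stream buffering at the receiver. The function name and the `(stream_id, payload)` frame shape are made up for the sketch.

```python
from collections import deque


def deliver(connection, blocked_streams):
    """Drain a shared FIFO connection until the head frame cannot be consumed.

    connection: deque of (stream_id, payload) tuples in send order.
    blocked_streams: streams whose receiver currently applies pushback.
    Returns the frames delivered before the connection stalled.
    """
    delivered = []
    while connection:
        stream_id, _payload = connection[0]
        if stream_id in blocked_streams:
            break  # head frame cannot be consumed; everything behind it waits
        delivered.append(connection.popleft())
    return delivered
```

With frames queued as A, B, A, A and stream B applying pushback, only the first A frame gets through; the later A frames are stuck behind B even though A's receiver is ready.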

We were no networking experts when writing the original AMQP specification,
so it's not surprising we got that wrong, but even the last WebSocket
I-D messes it up and actually suggests that WebSocket fragmentation can
be used for multiplexing :(

So the requirement to not emulate multiplexing on the SP layer is really
worth stating explicitly.

Maybe you can think of a better terminology and wording?

> My question would be: is that the
> right scope? We've talked about the concept of topology or, to be more
> specific, graphs. So SP allows us to communicate with each topology when it
> can be identified, else broadcast?

I would say, if you don't identify the topology, it's an error. No data
should be passed in such a case.

> As far as error and flow control, if
> you assume that an ACK in the transport layer means "I got the data", that
> might be wrong. It really is saying "Don't send me that data again". So
> if you have to detect errors or even buffering situations, how does SP
> respond back to the application?

That's kind of out of scope of the protocol specification AFAICS. It's more of
an API concern. The answer actually depends on the use case. The leaves
of the topology often want to get explicit errors (say a connection
timeout) while intermediary nodes want to follow the best-effort path,
ignoring the errors and trying to work around the failure.

>> From the client's perspective the operations are exactly the same IMO. You
>> can think of it as a simple connection-less transport: you send a
>> message, the network decides where it goes. The only difference is that UDP
>> uses the standard IP routing algorithm, while SP uses more sophisticated
>> routing algorithms.
>
> @gaberger: Yes exactly, but "where" it goes is kinda broken today (i.e.
> dealing with multiple paths). For instance, let's say you have two
> interfaces on a device (one is Wi-Fi 802.11, the other is a 4G LTE
> connection). If you send a message down to the network, which path does it
> take?

No idea :)

> What if one of the interfaces is down, do you think the network
> layer knows what to do? The answer is it doesn't; that's why guys are
> trying to push MPTCP. Yes, SCTP can be rebound on another interface, but
> you lose state.

Right. The question is: Is this in scope for SP work? In other words:
Should this problem be solved above L4?

Martin

Martin Sustrik

Sep 1, 2011, 4:45:11 AM
to sp-discu...@googlegroups.com, Gary Berger
On 09/01/2011 08:44 AM, Martin Sustrik wrote:

> Passing multiple streams on top of a single TCP connection -- say the
> way AMQP does it -- results in head of line blocking, especially in
> situations where pushback is applied (two streams are passed through a
> single TCP connection, one stream applies pushback, other stream gets
> stuck).
>
> We were no networking experts when writing the original AMQP specification,
> so it's not surprising we got that wrong, but even the last WebSocket
> I-D messes it up and actually suggests that WebSocket fragmentation can
> be used for multiplexing :(

Ugh! I was wrong. As the hybi guys have pointed out, multiplexing can be done
on top of TCP. It can be done using separate per-channel buffers,
per-channel flow control, message fragmentation and quotas expressed in
bytes rather than in messages.
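The ingredients listed above can be sketched as a sender-side multiplexer. This is only an illustrative model under assumed names (`Multiplexer`, `grant`, `next_frames`) and an assumed frame shape `(channel_id, fragment, is_last)`; it is not how hybi, AMQP or any other specification defines it.

```python
from collections import deque


class Channel:
    def __init__(self, credit):
        self.credit = credit    # bytes the receiver currently has room for
        self.pending = deque()  # per-channel buffer of unsent message bytes


class Multiplexer:
    """Sender side of a hypothetical mux over one ordered byte stream."""

    def __init__(self, max_fragment=4):
        self.max_fragment = max_fragment
        self.channels = {}

    def open(self, channel_id, credit):
        self.channels[channel_id] = Channel(credit)

    def send(self, channel_id, message):
        self.channels[channel_id].pending.append(message)

    def grant(self, channel_id, nbytes):
        """The receiver freed nbytes of buffer space on this channel."""
        self.channels[channel_id].credit += nbytes

    def next_frames(self):
        """One pass over all channels: at most one fragment per channel."""
        frames = []
        for channel_id, ch in self.channels.items():
            if not ch.pending or ch.credit == 0:
                continue  # a stalled channel is skipped, not waited on
            head = ch.pending[0]
            size = min(self.max_fragment, ch.credit, len(head))
            fragment, rest = head[:size], head[size:]
            ch.credit -= size  # quota is counted in bytes, not messages
            if rest:
                ch.pending[0] = rest
            else:
                ch.pending.popleft()
            frames.append((channel_id, fragment, not rest))
        return frames
```

Because a channel without credit is skipped and large messages are cut into fragments, one slow or oversized stream does not stop frames of the other channels from being written to the shared connection.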

Still, it's a question whether this kind of tunnelling is in scope of SP
work or whether it should be handled by the layers beneath it (eg. SCTP
or WebSocket/x-google-mux).

Martin

Gary Berger

Sep 1, 2011, 11:13:21 AM
to Martin Sustrik, sp-discu...@googlegroups.com
Check this Princeton initiative out.

http://www.serval-arch.org/

The flow is broken out under the transport layer and demultiplexed by FlowID.

Pieter Hintjens

Sep 1, 2011, 3:31:32 PM
to sp-discu...@googlegroups.com, Gary Berger
On Thu, Sep 1, 2011 at 10:45 AM, Martin Sustrik <sus...@250bpm.com> wrote:

>> We were no networking experts when writing the original AMQP specification,
>> so it's not surprising we got that wrong...

:-) As the author of that original AMQP specification, I'm pretty sure
we didn't get it wrong.

> Ugh! I was wrong. As the hybi guys have pointed out, multiplexing can be done on
> top of TCP. It can be done using separate per-channel buffers, per-channel
> flow control, message fragmentation and quotas expressed in bytes rather
> than in messages.

Indeed. This is how AMQP/0.9.1 did it.

Having said that, what we did discover is that it's seriously
over-engineered, difficult to use properly in APIs, and redundant
because TCP connections are cheap and there really aren't strong use
cases for multiplexing. Same reason that HTTP-NG never took off, the
benefits aren't worth the significant costs in overall complexity.

-Pieter

Paul Colomiets

Sep 1, 2011, 4:51:17 PM
to sp-discu...@googlegroups.com
Hi Pieter,

On Thu, Sep 1, 2011 at 10:31 PM, Pieter Hintjens <p...@imatix.com> wrote:
>
> Having said that, what we did discover is that it's seriously
> over-engineered, difficult to use properly in APIs, and redundant
> because TCP connections are cheap and there really aren't strong use
> cases for multiplexing. Same reason that HTTP-NG never took off, the
> benefits aren't worth the significant costs in overall complexity.
>

This depends very much on the domain. In the browser there is quite a low limit
on the number of connections (IIRC about 30 in current browsers). It's not
a problem for HTTP, but since some people tend to have 20-60 tabs
open, and some of those tabs would want several connections open, it will
quickly become a problem when WebSockets get wider usage.

Similar thing with WebSocket intermediaries. The current web application
world tends to use proxies between actual clients and backend servers.
And having the intermediary create another connection to the backend for
each client will render such intermediaries almost useless.

But yes. I think SP should not provide channels and per-channel
flow control, quotas, the ability to use several patterns within one
connection, and other bloat. But what I do want is to have a standard,
easy, well-defined protocol to aggregate feeds, mark messages
appropriately, forward, and split feeds back for any message pattern.
I've described my use case in more detail in the "Labeling layer -- some
additional thoughts" thread. Maybe this kind of tunnelling is another
pattern itself?

--
Paul

Gary Berger

Sep 1, 2011, 7:31:00 PM
to sp-discu...@googlegroups.com
Not sure if you guys have seen this. Worth a view especially the last part
where he talks about an interface not a socket.

http://www.youtube.com/watch?v=WVs7Pc99S7w


-g

Martin Sustrik

Sep 2, 2011, 1:53:24 AM
to sp-discu...@googlegroups.com, Pieter Hintjens, Gary Berger
On 09/01/2011 09:31 PM, Pieter Hintjens wrote:
> On Thu, Sep 1, 2011 at 10:45 AM, Martin Sustrik<sus...@250bpm.com> wrote:
>
>>> We were no networking experts when writing the original AMQP specification,
>>> so it's not surprising we got that wrong...
>
> :-) As the author of that original AMQP specification, I'm pretty sure
> we didn't get it wrong.

Yup. I think we've had this discussion before. You were right, I was
wrong. Multiplexing *can* be done on top of TCP.

>> Ugh! I was wrong. As the hybi guys have pointed out, multiplexing can be done on
>> top of TCP. It can be done using separate per-channel buffers, per-channel
>> flow control, message fragmentation and quotas expressed in bytes rather
>> than in messages.
>
> Indeed. This is how AMQP/0.9.1 did it.

I guess my feeling that the thing cannot be done was based on how AMQP
does it, which still doesn't work:

1. There are no publisher-side acks, meaning that a single stream can
deadlock all other streams on the way from the publisher to the broker.

2. While there is a prefetch-size field that advertises the space
available in the rx buffer in terms of bytes, there's a clause that
explicitly disables it when there's only one message being processed.
Meaning that a large message in one stream can deadlock other streams.

3. The ACK works in terms of messages, not bytes, which makes it
unsuitable for making sure that rx buffers won't overflow and apply
pushback to the TCP socket, deadlocking all other streams.

> Having said that, what we did discover is that it's seriously
> over-engineered, difficult to use properly in APIs, and redundant
> because TCP connections are cheap and there really aren't strong use
> cases for multiplexing. Same reason that HTTP-NG never took off, the
> benefits aren't worth the significant costs in overall complexity.

Ack.

The websocket scenario is interesting though. Let's see whether
websocket multiplexing takes off.

Martin

Martin Sustrik

Sep 2, 2011, 2:01:48 AM
to sp-discu...@googlegroups.com, Paul Colomiets
On 09/01/2011 10:51 PM, Paul Colomiets wrote:
>
>> Having said that, what we did discover is that it's seriously
>> over-engineered, difficult to use properly in APIs, and redundant
>> because TCP connections are cheap and there really aren't strong use
>> cases for multiplexing. Same reason that HTTP-NG never took off, the
>> benefits aren't worth the significant costs in overall complexity.
>>
>
> This depends very much on the domain. In the browser there is quite a low limit
> on the number of connections (IIRC about 30 in current browsers). It's not
> a problem for HTTP, but since some people tend to have 20-60 tabs
> open, and some of those tabs would want several connections open, it will
> quickly become a problem when WebSockets get wider usage.

It should be noted that Jim Gettys identified multiple connections from
the browser as one of the problems behind bufferbloat (talk at IETF80).
I don't feel competent to judge whether he's right or not, but if so,
it's possible we'll see a push for limiting the number of HTTP
connections in the future.

> Similar thing with WebSocket intermediaries. The current web application
> world tends to use proxies between actual clients and backend servers.
> And having the intermediary create another connection to the backend for
> each client will render such intermediaries almost useless.
>
> But yes. I think SP should not provide channels and per-channel
> flow control, quotas, the ability to use several patterns within one
> connection, and other bloat. But what I do want is to have a standard,
> easy, well-defined protocol to aggregate feeds, mark messages
> appropriately, forward, and split feeds back for any message pattern.
> I've described my use case in more detail in the "Labeling layer -- some
> additional thoughts" thread. Maybe this kind of tunnelling is another
> pattern itself?

Kind of, but it's a pattern on top of other patterns (e.g. pub/sub and
req/rep can be passed through the same tunnel), so more thinking is
needed about how to formalise it.

Also, there are tunnelling solutions on every level of the stack which
could possibly be used -- making the problem even more interesting.

Martin

Martin Sustrik

Sep 6, 2011, 5:52:10 AM
to sp-discu...@googlegroups.com, Gary Berger
Hi Gary,

> Check this Princeton initiative out.
>
> http://www.serval-arch.org/
>
>

> The flow is broken out under the transport layer and demultiplexed by
> FlowID.

I've finally got some time to check this out.

AFAICS such an architecture can solve some of the problems you've
mentioned (multihoming, mobility etc.) Also, the option to do this stuff
between L3 and L4 seems reasonable.

As for the communication patterns, which are at the center of SP, Serval
provides anycast routing of the request to the topologically closest
instance of the service.

That's not enough for messaging-style request/reply scenarios IMO.

What we need is rather a way to negotiate the capacity between providers
of the service and users of the service, so that requests are routed to
the instances that are able to handle them (as opposed to the
topologically closest instances).

One can think of it as a stock exchange, where providers of the service
place asks on the exchange, users place bids and the exchange (the SP
topology) matches the asks with the bids.

The pitfalls to avoid here are service overcommitment ("selling" more
service than you can actually provide) and request overcommitment (where
the user ends up "buying" more service than it actually needs).
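A very small sketch of the exchange analogy, assuming that an ask is simply "one free request slot at this provider" and a bid is "one request waiting to be served". The `Exchange` class and its FIFO matching are hypothetical; real capacity negotiation across a distributed topology would be considerably more involved.

```python
from collections import deque


class Exchange:
    """Hypothetical matcher: providers post asks, users post bids."""

    def __init__(self):
        self.asks = deque()  # one entry per request slot a provider offers
        self.bids = deque()  # requests waiting for a free slot

    def ask(self, provider, slots=1):
        """A provider advertises how many more requests it can handle."""
        for _ in range(slots):
            self.asks.append(provider)
        return self._match()

    def bid(self, request):
        """A user submits one request to be served."""
        self.bids.append(request)
        return self._match()

    def _match(self):
        # Pair the oldest waiting request with the oldest free slot.
        matches = []
        while self.asks and self.bids:
            matches.append((self.bids.popleft(), self.asks.popleft()))
        return matches
```

In this toy model a provider is never handed more requests than the slots it advertised, which addresses service overcommitment only; request overcommitment is not modelled.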

Martin

Gary Berger

Sep 6, 2011, 11:36:40 AM
to Martin Sustrik, sp-discu...@googlegroups.com
Martin,

I agree the communication patterns are underdeveloped in this model, but I
think it demonstrates how you can decouple application services
from network semantics. I can imagine, for instance:


ZMQ.Context context = ZMQ.context(1);
ZMQ.Socket publisher = context.socket(ZMQ.PUB, PF_SERVAL); // Select Serval transport

publisher.bind("ServiceID"); // Bind to ServiceID

publisher.send(message); // Send to Service.

Anyone bound to ServiceID (i.e. through a demux rule in the service table)
would accept the message.


Now to be clear, I don't think Serval is the total answer, but it does
demonstrate some important concepts.

1. Disconnecting the application service name from the networking address
semantics
2. Disconnecting the communication flow from the interface
3. Allowing for recursive service discovery


I think that looking at this, and at RINA's concepts of application entities
and CDAP, is an interesting way of evolving any discussion on scalability
protocols.

-g

Martin Sustrik

Sep 18, 2011, 7:00:03 AM
to sp-discu...@googlegroups.com, Gary Berger
Hi Gary,

Sorry for the delay. I've been doing some consulting and had no time for
the SP work.


> ZMQ.Context context = ZMQ.context(1);
> ZMQ.Socket publisher = context.socket(ZMQ.PUB, PF_SERVAL); // Select Serval transport
>
> publisher.bind("ServiceID"); // Bind to ServiceID
>
> publisher.send(message); // Send to Service.
>
>
>
> Anyone bound to ServiceID (i.e. through a demux rule in the service table)
> would accept the message.
>
>
> Now to be clear, I don't think Serval is the total answer, but it does
> demonstrate some important concepts.
>
> 1. Disconnecting the application service name from the networking address
> semantics
> 2. Disconnecting the communication flow from the interface
> 3. Allowing for recursive service discovery
>
>
> I think that looking at this, and at RINA's concepts of application entities
> and CDAP, is an interesting way of evolving any discussion on scalability
> protocols.

Right. What I think it boils down to is name resolution. The name at the
SP level (service name or topology name if you wish) should be resolved
to an address for the underlying layer (be it Serval or whatever else).

I have to say here that I haven't even got as far as thinking seriously
about it...

However, there are some obvious problems there. For example: Given that
a topology has N nodes, if you say "connect to topology", which node should
you be connected to?
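To make the question concrete, here is a minimal sketch of such a resolver. The `TopologyResolver` class, its in-memory registry and the random choice among nodes are all assumptions made for illustration; picking a node at random is just one possible answer to "which node?", not a proposal from this thread.

```python
import random


class TopologyResolver:
    """Hypothetical registry mapping a topology name to node addresses."""

    def __init__(self, rng=None):
        self._nodes = {}
        self._rng = rng or random.Random()

    def register(self, topology, address):
        """Record that a node of the given topology is reachable at address."""
        self._nodes.setdefault(topology, []).append(address)

    def resolve(self, topology):
        """Return the address of one node of the topology."""
        nodes = self._nodes.get(topology)
        if not nodes:
            # An unidentified topology is treated as an error; no data flows.
            raise LookupError("unknown topology: " + topology)
        return self._rng.choice(nodes)  # assumed policy: any node will do
```

Other policies (closest node, least loaded node, sticky choice per client) would slot into `resolve` without changing what the caller sees.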

This problem seems to have a lot in common with the mobility problem, so
maybe the existing work in the area can be re-used or at least learnt from.

Martin
