Why use padding when sending data using CDR ?

Rahul Rastogi

May 31, 1999

Hello !

I have a doubt with regard to the way data is marshalled using CDR - if
both client and server object know the data layout, why is padding
required ?

Also, if I send a long followed by a double, I'll consume 16 bytes (4
for long + 4 padding to align double on a 8-byte boundary + 8 for
double). If, however, I send a double followed by a long, I use only 12
bytes (a saving of 25%) - 8 for double + 4 for long (there's no
padding). Is this a possible optimization that can be used ?

Martin v.Loewis

May 31, 1999

In article <37526DBC...@eil.co.in>,

Rahul Rastogi <rahul....@eil.co.in> wrote:
>I have a doubt with regard to the way data is marshalled using CDR - if
>both client and server object know the data layout, why is padding
>required ?

There is a simple answer: Because the spec says you have to do padding.

An ORB is free to use any on-the-wire representation to communicate,
but it won't be IIOP if CDR is not used.

For example, ILU implements the HTTP-NG protocol, in addition to IIOP
and a number of other protocols. Bill Janssen likes to argue that HTTP-ng
gives you much smaller messages than IIOP, for a number of reasons
(marshalling, transmitting operation numbers instead of operation names,
...)

>Also, if I send a long followed by a double, I'll consume 16 bytes (4
>for long + 4 padding to align double on a 8-byte boundary + 8 for
>double). If, however, I send a double followed by a long, I use only 12
>bytes (a saving of 25%) - 8 for double + 4 for long (there's no
>padding).

It depends on what follows after the long. If a double follows (eg. because
you have a sequence of these structs) nothing is gained.
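The point about sequences can be checked with a little arithmetic. The following is a minimal sketch, assuming that each CDR primitive is aligned to its own size, that a struct adds no padding of its own, and that the first element starts at an 8-byte aligned offset; the function names are made up for illustration.

```c
#include <stddef.h>

/* Round offset up to the next multiple of align (align must be a power of two). */
static size_t cdr_align(size_t offset, size_t align)
{
    return (offset + align - 1) & ~(align - 1);
}

/* Bytes used by n consecutive {double, long} structs, starting at offset 0. */
static size_t size_double_long(size_t n)
{
    size_t off = 0;
    for (size_t i = 0; i < n; i++) {
        off = cdr_align(off, 8) + 8;   /* double */
        off = cdr_align(off, 4) + 4;   /* long */
    }
    return off;
}

/* Bytes used by n consecutive {long, double} structs, starting at offset 0. */
static size_t size_long_double(size_t n)
{
    size_t off = 0;
    for (size_t i = 0; i < n; i++) {
        off = cdr_align(off, 4) + 4;   /* long */
        off = cdr_align(off, 8) + 8;   /* double */
    }
    return off;
}
```

Under these assumptions a single struct takes 12 versus 16 bytes, but 1000 of them take 15,996 versus 16,000 bytes: the reordering saves four bytes in total, not 25%.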

> Is this a possible optimization that can be used ?

In most cases, the answer is no. You seem to assume small messages here,
consisting of only a double and a long, e.g.

void an_operation(in double d, in long l);

In that case, the vast majority of the data comes from:
- the GIOP header and GIOP invoke header (the latter carrying
the object key, the operation name (16 bytes), and possibly service
contexts)
- the TCP and IP headers
- the MAC framing
In this context, four bytes saved are not measurable. In addition,
for small messages the latency is much more significant than the
transmission time.

In large messages, it might be occasionally possible to save a significant
number of bytes by clever re-arrangement of data.

Regards,
Martin

P.S. If you draw the conclusion that IIOP is not particularly efficient -
yes, I agree. Choose an ORB that offers a more efficient alternative.

Elliot Lee

May 31, 1999

On Mon, 31 May 1999 16:38:44 +0530, Rahul Rastogi
<rahul....@eil.co.in> wrote:

>I have a doubt with regard to the way data is marshalled using CDR - if
>both client and server object know the data layout, why is padding
>required ?

Padding is required because computer architectures have alignment rules
for data, and if you have the CDR stream aligned even roughly like the
architecture expects, you can often do things like:
*outparam = *(CORBA_double *)(cdrbuf->curposition);
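As a slightly fuller illustration of the same idea, here is a minimal sketch of an aligned read. The `cdr_buf` type and `cdr_read_double` function are hypothetical, not taken from any ORB, and the sketch assumes that sender and receiver share the same byte order and double format. It uses memcpy() rather than a pointer cast so that it stays valid even if the buffer itself is not suitably aligned in memory; when it is, compilers commonly reduce the copy to a single load.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical read cursor over a received CDR buffer (illustrative names). */
typedef struct {
    const unsigned char *base; /* start of the marshalled data */
    size_t pos;                /* current read offset from base */
    size_t len;                /* number of valid bytes at base */
} cdr_buf;

/* Skip padding up to the next 8-byte offset, then copy out one double.
 * Returns 0 on success, -1 if the buffer is too short.
 * Assumes both ends use the same byte order and double representation. */
static int cdr_read_double(cdr_buf *b, double *out)
{
    size_t aligned = (b->pos + 7u) & ~(size_t)7u;

    if (aligned > b->len || b->len - aligned < sizeof *out)
        return -1;
    memcpy(out, b->base + aligned, sizeof *out);
    b->pos = aligned + sizeof *out;
    return 0;
}
```

Because the padding rule is fixed, the reader never has to inspect the padding bytes; it only rounds the cursor up and copies.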

If you are really having _major_ problems with alignment overhead, your
best bet may be to use a custom marshalling algorithm to put the data into
a sequence<octet>, and then pass that as the parameter of the operation. I
think, however, that compared to the overhead of this custom marshalling
pass, the extra size used by alignment is a minimal cost.

-- Elliot
"We're sorry, we didn't know it was supposed to be invisible."
- Sign carried outside US embassy.

Bill Beckwith

May 31, 1999

Rahul Rastogi <rahul....@eil.co.in> wrote in
<37526DBC...@eil.co.in>:

>Hello !

>
>I have a doubt with regard to the way data is marshalled using CDR - if
>both client and server object know the data layout, why is padding
>required ?

The padding helps with the speed of marshalling. RISC architectures
require that data is aligned according to its size. The CDR padding
allows the marshalling code to take advantage of this optimization.

>Also, if I send a long followed by a double, I'll consume 16 bytes (4
>for long + 4 padding to align double on a 8-byte boundary + 8 for
>double).

Not necessarily. The padding occurs only if you started
on an 8 byte boundary.
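To make the dependence on the starting offset concrete, here is a minimal sketch, assuming each CDR primitive is aligned to its own size relative to the start of the marshalled data; the helper names are illustrative only.

```c
#include <stddef.h>

/* Number of padding bytes needed to bring offset up to a multiple of align. */
static size_t cdr_padding(size_t offset, size_t align)
{
    return (align - offset % align) % align;
}

/* Bytes consumed by a long followed by a double when the long is
 * written at the given starting offset. */
static size_t long_then_double_size(size_t start)
{
    size_t off = start;

    off += cdr_padding(off, 4) + 4;   /* long */
    off += cdr_padding(off, 8) + 8;   /* double */
    return off - start;
}
```

With these assumptions, starting at offset 0 or 8 costs 16 bytes, while starting at offset 4 or 12 costs only 12, because the long itself ends on an 8-byte boundary.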

>If, however, I send a double followed by a long, I use only 12
>bytes (a saving of 25%) - 8 for double + 4 for long (there's no
>padding). Is this a possible optimization that can be used ?

Unless your bandwidth is limited to a 28.8 modem (and some
systems are) the wire time of the extra four bytes isn't
significant (an interceptor could preinitialize the skipped
bytes to zero for good compression though). However, the CPU
time for writing data across cache and memory alignment
boundaries can be very significant in higher bandwidth
situations.

If you are writing the IDL then you have control of the record
layout and can avoid any penalty at all.

-- Bill

------------------------------------------------------------------
email: rwb-...@ois.com Main: 703-295-6500 | CORBA
Objective Interface Systems, Inc. | for
1892 Preston White Drive FAX: 703-295-6501 | fast
Reston, Virginia 20191-5448 http://www.ois.com | real-time

Michi Henning

Jun 1, 1999

On Mon, 31 May 1999, Rahul Rastogi wrote:

> I have a doubt with regard to the way data is marshalled using CDR - if
> both client and server object know the data layout, why is padding
> required ?

The padding permits unmarshaling of the data by doing a simple memcpy()
(at least in some cases, namely, when the data alignment matches the
alignment restrictions of the CPU).

> Also, if I send a long followed by a double, I'll consume 16 bytes (4
> for long + 4 padding to align double on a 8-byte boundary + 8 for
> double). If, however, I send a double followed by a long, I use only 12
> bytes (a saving of 25%) - 8 for double + 4 for long (there's no
> padding). Is this a possible optimization that can be used ?

I don't see what you could optimize in this case. Do you mean that in the
first case, the struct could only be 12 bytes long? If so, the answer
is yes, you could make it shorter. However, in that case, the receiving
ORB couldn't simply set a pointer to the struct into the marshaling
buffer and pretend that it is the structure value, because the structure
would not be aligned with the requirements of most CPUs. Basically,
padding bytes trade off bandwidth on the wire against the cost of
additional data copies during marshaling.

Cheers,

Michi.
Copyright 1999 Michi Henning. All rights reserved.
--
Michi Henning +61 7 3236 1633
Triodia Technologies +61 4 1118 2700 (mobile)
PO Box 372 +61 7 3211 0047 (fax)
Annerley 4103 mi...@triodia.com
AUSTRALIA http://www.triodia.com/staff/michi-henning.html

Bill Janssen

Jun 2, 1999

In article <7iuf04$ms$1...@news.cs.tu-berlin.de> loe...@cs.tu-berlin.de (Martin v.Loewis) writes:

Bill Janssen likes to argue that HTTP-ng
gives you much smaller messages than IIOP, for a number of reasons
(marshalling, transmitting operation numbers instead of operation names,
...)

Well, it's perhaps more than just an argument. In our tests, running
the ILU example "test1" with IIOP, then with HTTP-ng, shows that
HTTP-ng uses fewer than half the bytes used by IIOP. A simple "ping"
call with HTTP-ng uses 1/7 the bytes the same call uses with IIOP.

Anyone can download ILU and run these tests themselves. "test1", of
course, is only one approximation of a CORBA job mix; I'd be
interested in seeing other typical job mixes established for this kind
of benchmarking purpose.

Bill
--
Bill Janssen <jan...@parc.xerox.com> (650) 812-4763 FAX: (650) 812-4777
Xerox Palo Alto Research Center, 3333 Coyote Hill Rd, Palo Alto, CA 94304
URL: ftp://ftp.parc.xerox.com/pub/ilu/misc/janssen.html

Michi Henning

Jun 3, 1999

On 2 Jun 1999, Bill Janssen wrote:

> Well, it's perhaps more than just an argument. In our tests, running
> the ILU example "test1" with IIOP, then with HTTP-ng, shows that
> HTTP-ng uses fewer than half the bytes used by IIOP. A simple "ping"
> call with HTTP-ng uses 1/7 the bytes the same call uses with IIOP.

No argument here. However, a more interesting point would be to know how
that affects efficiency of marshaling. Doug's results indicate that, at
least with high-speed networks, marshaling cost is the dominant
cost factor. IIOP, at least in some cases, avoids byte-for-byte processing
during marshaling because it lays out data on the wire in a format that
coincides with the alignment requirements of many CPUs and, in that respect,
is more efficient. So, how much difference does this make in terms
of marshaling performance over high-speed networks? Is the more bloated
IIOP representation actually more efficient than HTTP-NG? (Of course,
over slower networks, I would expect HTTP-NG to win hands-down, especially
for messages with large amounts of data in them.)

On a related note, many people advocate to simply use a fixed byte order
on the wire and to not require the receiver-makes-it-right approach. The
argument is that the extra complexity in the receiver isn't worth it.
From a simplicity point of view, I agree. However, again, Doug's results
seem to indicate that the extra complexity would be worth it if you are
running over a high-speed network.

Bill, have you done some benchmarks over high-speed networks to see how
much the more compact representation on the wire affects marshaling? It
would be interesting to know the results, if any.

Bill Beckwith

Jun 3, 1999

Michi Henning <mi...@triodia.com> wrote in
<Pine.HPX.4.05.990603...@bobo.triodia.com>:

>On 2 Jun 1999, Bill Janssen wrote:
>
>> Well, it's perhaps more than just an argument. In our tests, running
>> the ILU example "test1" with IIOP, then with HTTP-ng, shows that
>> HTTP-ng uses fewer than half the bytes used by IIOP. A simple "ping"
>> call with HTTP-ng uses 1/7 the bytes the same call uses with IIOP.

...

>On a related note, many people advocate to simply use a fixed byte order
>on the wire and to not require the receiver-makes-it-right approach. The
>argument is that the extra complexity in the receiver isn't worth it.
>From a simplicity point of view, I agree. However, again, Doug's results
>seem to indicate that the extra complexity would be worth it if you are
>running over a high-speed network.

The network-byte-order scheme is only simpler on systems that match
the byte order picked for the network. The systems that don't match
have twice the complexity to adjust both reads and writes. I like
spreading the work and complexity of any necessary endian switching
to both systems.
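For readers unfamiliar with the receiver-makes-it-right scheme, here is a minimal sketch of the receiving side. It assumes the sender's byte order arrives as a flag alongside the data (GIOP carries such a flag in its message header); the function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Detect the host byte order at run time. */
static int host_is_little_endian(void)
{
    uint16_t probe = 1;
    unsigned char first;

    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Reverse the byte order of a 32-bit value. */
static uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Read an unsigned 32-bit value written in the sender's byte order.
 * The swap happens only when sender and receiver disagree, so two
 * machines with the same byte order never pay for it. */
static uint32_t read_ulong(const unsigned char *p, int sender_little_endian)
{
    uint32_t v;

    memcpy(&v, p, sizeof v);
    if (sender_little_endian != host_is_little_endian())
        v = swap32(v);
    return v;
}
```

With a fixed network byte order, by contrast, a machine of the other endianness would swap on every write as well as on every read.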

>Bill, have you done some benchmarks over high-speed networks to see how
>much the more compact representation on the wire affects marshaling? It
>would be interesting to know the results, if any.

Yes, I have. The faster the network the lower the proportion the
physical transfer time (I call this media time) represents. My test
results were consistent with a simple numeric analysis.

Imagine two transfer sizes, 64 bytes (close to the lower limits
of a message size for most ORBs) and 50 megabytes (maybe a
half-hour MPEG 3 TV show). We'll think about four network
media speeds:

* a modem running at 28.8K BAUD (277,777 nsec/byte),
* 10 Mbit/sec ethernet ( 800 nsec/byte),
* 100 Mbit/sec ethernet ( 80 nsec/byte), and
* 800 MByte/sec Fibre Channel ( 1.25 nsec/byte).

The media times of our two message sizes are as follows:

Msg Size     28.8K BAUD         10 Mb          100 Mb        800 MB
---------    ---------------    -----------    ----------    ---------
64 bytes     17,777,778 nsec    51,200 nsec    5,120 nsec    80 nsec
50 MBytes    13,888,889 msec    40,000 msec    4,000 msec    62.5 msec

M = 1,000,000 (for simplicity)

nsec = .000000001 sec
usec = .000001 sec
msec = .001 sec

[Note that this analysis leaves out any fixed latency the network
introduces beyond the media time. A fixed non-zero latency would
not affect the comparison of marshalling schemes since both would
incur this time.]
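The media times above follow directly from the link speeds. Here is a minimal sketch of the arithmetic, assuming 8 bits per byte on the wire and no framing overhead; the function names are illustrative.

```c
/* Nanoseconds needed to put one byte on a link of the given bit rate. */
static double nsec_per_byte(double bits_per_sec)
{
    return 8.0e9 / bits_per_sec;
}

/* Media time in nanoseconds for a message of the given size. */
static double media_time_nsec(double bytes, double bits_per_sec)
{
    return bytes * nsec_per_byte(bits_per_sec);
}
```

For example, media_time_nsec(64, 10e6) gives 51,200 nsec, and media_time_nsec(50e6, 6.4e9) gives 62,500,000 nsec (62.5 msec) for the 800 MByte/sec link.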

A highly tuned ORB on a reasonably fast machine can add as little
as 30,000 nsec of latency to each request. Most ORBs add at
least 1,000,000 nsec of latency. Even with the highly tuned ORB
the media time of the small message is not significant if you are
using 10 Mb or faster. With most ORBs the small message media time
on a LAN is a very small percentage.

With the large request the media time is a larger factor. But
large requests do not typically incur the additional CDR size
overhead. It is possible to create a situation where the extra
CDR overhead is significant. Imagine:

struct S
{
octet o1;
double d1;
octet o2;
double d2;
octet o3;
double d3;
octet o4;
double d4;
};

typedef S S_Array[1000000];

// 64,000,000 bytes to send 36,000,000 bytes of user data

vs.:

struct S
{
octet o1;
octet o2;
octet o3;
octet o4;
double d1;
double d2;
double d3;
double d4;
};

typedef S S_Array[1000000];

// 40,000,000 bytes to send 36,000,000 bytes of user data

or better yet:

struct S1
{
octet o1;
octet o2;
octet o3;
octet o4;
};

struct S2
{
double d1;
double d2;
double d3;
double d4;
};

typedef S1 S1_Array[1000000];

typedef S2 S2_Array[1000000];

struct S
{
S1_Array s1a;
S2_Array s2a;
};

// 36,000,000 bytes to send 36,000,000 bytes of user data

or best of all:

struct S
{
octet o1[1000000];
octet o2[1000000];
octet o3[1000000];
octet o4[1000000];
double d1[1000000];
double d2[1000000];
double d3[1000000];
double d4[1000000];
};

// 36,000,000 bytes to send 36,000,000 bytes of user data

I'll also mention that the last IDL example above will
marshal between 10 and 100 times faster than the other
two in most ORBs.

Thus, in the case of well designed IDL, the CDR marshalling
for the large message size is compact and very fast.
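The byte counts quoted for the IDL layouts above can be reproduced with a short calculation. This is a minimal sketch, assuming each primitive is aligned to its own size, that structs and fixed-size arrays add no length prefix or padding of their own, and that the data starts at an 8-byte aligned offset; all names are illustrative.

```c
#include <stddef.h>

/* Marshal `repeat` copies of a field list starting at `start`; each field
 * is a primitive whose alignment equals its size. Returns the end offset. */
static size_t cdr_end(size_t start, const size_t *sizes, size_t count,
                      size_t repeat)
{
    size_t off = start;

    for (size_t r = 0; r < repeat; r++) {
        for (size_t i = 0; i < count; i++) {
            size_t a = sizes[i];
            off = (off + a - 1) / a * a;   /* pad to the field's alignment */
            off += a;                      /* the field itself */
        }
    }
    return off;
}

/* n structs of {octet, double, octet, double, octet, double, octet, double}. */
static size_t size_interleaved(size_t n)
{
    static const size_t f[] = { 1, 8, 1, 8, 1, 8, 1, 8 };
    return cdr_end(0, f, 8, n);
}

/* n structs of {octet x 4, double x 4}. */
static size_t size_grouped(size_t n)
{
    static const size_t f[] = { 1, 1, 1, 1, 8, 8, 8, 8 };
    return cdr_end(0, f, 8, n);
}

/* All 4n octets first, then all 4n doubles. */
static size_t size_split(size_t n)
{
    static const size_t oct[] = { 1 };
    static const size_t dbl[] = { 8 };
    return cdr_end(cdr_end(0, oct, 1, 4 * n), dbl, 1, 4 * n);
}
```

For n = 1,000,000 these give 64,000,000, 40,000,000 and 36,000,000 bytes respectively, matching the figures in the IDL comments.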

-----

I think Bill Janssen would agree that the savings that HTTP-ng
offers is typically a fixed reduction of the message size. If
his 1/7 figure is accurate he can send an HTTP-ng CORBA request
in a 9 byte message. This probably means (I don't know the
details of the HTTP-ng to CORBA mapping he is using) that there
is a fixed reduction of the request header by about 50 bytes.

It would be hard to get most ORB's object keys in 9 bytes
much less some identification of the operation being invoked.
I'd like to hear from Bill Janssen as to the exact numbers
used to compute his 1/7 figure.

Even if HTTP-ng can shrink the message and request header
to 9 bytes the savings is as follows:

Savings     28.8K BAUD    10 Mb       100 Mb       800 MB
--------    ----------    --------    ---------    -----------
55 bytes    15 msec       .04 msec    .004 msec    .00007 msec

Thus I hope this shows how little this savings helps on even
a medium speed network. On a high speed network the time
is lost if even a few more function calls are required to
create the compressed header.

Douglas C. Schmidt

Jun 3, 1999

Hi Michi and Bill,

> On a related note, many people advocate to simply use a fixed byte
> order on the wire and to not require the receiver-makes-it-right
> approach. The argument is that the extra complexity in the receiver
> isn't worth it. From a simplicity point of view, I agree. However,
> again, Doug's results seem to indicate that the extra complexity
> would be worth it if you are running over a high-speed network.

Right, plus the fixed byte order approach, such as the canonical
ordering used in Sun RPC's XDR, penalizes one type of "endianness,"
which may be a "bad thing" depending on what types of machines you run
on. Naturally, the "right thing"[tm] is for middleware to allow
applications to configure the "endianness" in such a way that it can
be disabled entirely for homogeneous endsystems. TAO supports this
optimization, BTW, and it has a small, but measurable impact on
performance.

> Bill, have you done some benchmarks over high-speed networks to see
> how much the more compact representation on the wire affects
> marshaling? It would be interesting to know the results, if any.

I would also be interested to see these results. Our tests on Fast
Ethernet and ATM show that shaving off unnecessary bytes in GIOP
doesn't have much impact on high-speed networks, though it does have
an impact on low-speed links, such as 2nd generation wireless
networks.

Naturally, the right approach here is to also standardize a pluggable
protocol framework into CORBA, which then makes the whole HTTP-ng
vs. GIOP debate completely moot. Check out

http://www.cs.wustl.edu/~schmidt/pluggable_protocols.ps.gz

for more details.

Thanks,

Doug
--
Dr. Douglas C. Schmidt, Associate Professor
Department of Computer Science, Washington University
St. Louis, MO 63130. Work #: (314) 935-4215; FAX #: (314) 935-7302
sch...@cs.wustl.edu, www.cs.wustl.edu/~schmidt/

Bill Janssen

Jun 3, 1999

In article <Pine.HPX.4.05.990603...@bobo.triodia.com> Michi Henning <mi...@triodia.com> writes:

IIOP, at least in some cases, avoids byte-for-byte processing
during marshaling because it lays out data on the wire in a format that
coincides with the alignment requirements of many CPUs and, in that respect,
is more efficient. So, how much difference does this make in terms
of marshaling performance over high-speed networks? Is the more bloated
IIOP representation actually more efficient than HTTP-NG? (Of course,
over slower networks, I would expect HTTP-NG to win hands-down, especially
for messages with large amounts of data in them.)

The ILU implementation of HTTP-NG, of course, also does this memory
layout vs. wire layout comparison and exploits it where possible.
IIOP has a slight edge here, in that it is either-endian, whereas
HTTP-NG, targeted as an Internet protocol, uses `network byte order',
which is only big-endian.

There seem to be some claims that having the wire representation match
the memory representation isn't especially a win for IIOP, since it
doesn't go whole hog, the way that DCE RPC's NDR does -- we've gone
over this a couple of months ago in this very newsgroup, I believe.
Certainly for many applications, such as the Web, fixed-length data
elements (the only thing that IIOP data structures match) are a tiny
proportion of the data exchanged.

Bill, have you done some benchmarks over high-speed networks to see how
much the more compact representation on the wire affects marshaling? It
would be interesting to know the results, if any.

We have indeed. Well, it depends on what you call high-speed. We've
measured HTTP-NG against IIOP 1.0 on two big-endian machines (Sun
Ultra-30s running Solaris 2.6) connected via a private 100Mb switched
Ethernet, for a particular application, fetching a Web page, and found
NG to be about 3 times faster than IIOP. I'm planning on comparing it
to IIOP 1.1 as well -- I'm *guessing* that it will be about twice as
fast.

Again, I'd love to have some representative CORBA benchmark tests to
run, to see how normal everyday use of CORBA would be affected.
Perhaps the benchmarking effort in the OMG will produce such a suite.

Bill Janssen

Jun 3, 1999

In article <7j62me$n...@tango.cs.wustl.edu> sch...@tango.cs.wustl.edu (Douglas C. Schmidt) writes:

Naturally, the right approach here is to also standardize a pluggable
protocol framework into CORBA.

Yes, I agree. We've had pluggable protocols in ILU from the start
(1991) and it's been a big help. I'd be interested in seeing what an
implementation of HTTP-NG would be like in the TAO pluggable protocols
framework. Any TAO hackers out there who'd like to try it?

Douglas C. Schmidt

Jun 3, 1999

Hi Bill,

> We've measured HTTP-NG against IIOP 1.0 on two big-endian machines
> (Sun Ultra-30s running Solaris 2.6) connected via a private 100Mb
> switched Ethernet, for a particular application, fetching a Web
> page, and found NG to be about 3 times faster than IIOP.

As someone who's spent most of my professional career benchmarking
networking protocols and middleware I'm always cautious about claiming
to have measured the performance of a *spec* (i.e., "IIOP 1.0")
vs. the performance of an *implementation* of a spec (i.e., "the ILU
version XYZ implementation").

As I'm sure you know, Bill, when it comes to performance the devil is
always in the details, and there's not much to be gained if you don't
hold "implementation details" constant, i.e., comparing a poor
implementation of any protocol against a good implementation of a
different protocol can be used to prove just about anything (which is
why everyone always beat up on Sun RPC in the 80's ;-)).

Therefore, it would be particularly useful if you could explain (1)
which implementation of CORBA IIOP 1.0 were you testing with, (2) what
kind of data were you sending, (3) what kind of performance numbers
did you get, and (4) where we could get a copy of your IDL to compare
with other ORBs, such as TAO.

In particular, I find it hard to believe that transmitting a Web page
as an octet sequence over a 100 Mbps Ethernet using NG could possibly
be 3 times faster than a well-tuned CORBA IIOP implementation, simply
because the IIOP performance of a well-tuned ORB should be just
about as fast as TCP over sockets. For instance, TAO easily gets 120
Mbps throughput for octet sequences over 155 Mbps ATM, so unless
you're computing throughput using a different metric I can't believe
that you'll get 3 times faster performance.

Having said all this, I'll repeat my earlier comment that once ORBs
support a standard pluggable protocols framework this whole line of
discussion will become moot since it'll be possible to plug in
whatever protocol is most appropriate for the requirements.

Bill Janssen

Jun 3, 1999

In article <8DDA59EE3rw...@sigma.ois.com> rwb-...@ois.com (Bill Beckwith) writes:

I think Bill Janssen would agree that the savings that HTTP-ng
offers is typically a fixed reduction of the message size. If
his 1/7 figure is accurate he can send a HTTP-ng CORBA request
in a 9 byte message. This probably means (I don't know the
details of the HTTP-ng to CORBA mapping he is using) that there
is fixed reduction of the request header by about 50 bytes.

12 bytes, actually -- 4 for the transport information (chunking, which
virtual connection to use, bi-directionality, packet size info), 4 for
the request header, and 4 or more for the marshalled parameters.

But you're absolutely right. The big payoff is shrinking three
things: message headers, object representations, and pickle (oops,
any) representations. HTTP-NG does memo-ization of both method
request identifiers and method discriminants. See
http://www.w3.org/Protocols/HTTP-NG/ for more information on how it
does it.

The "1/7" figure I cite (actually, I just looked at my slides, and
it's really 1/8) is from a single specific (but representative)
request message, and used mainly for shock value; as I've said
earlier, the more typical figure is about 1/2 for an average over a
mixture of calls.

It would be hard to get most ORB's object keys in 9 bytes
much less some identification of the operation being invoked.
I'd like to hear from Bill Janssen as to the exact numbers
used to compute his 1/7 figure.

Sure. There's a Postscript dump of some Powerpoint slides under
ftp://ftp.parc.xerox.com/transient/janssen/omg-sept-98.ps. They're
from a talk I gave to the Internet SIG of the OMG in Seattle. Pages
26-30 contain real packet dumps from a simple example, the cube_long
call from the old cubit example. The IIOP Request message on page 29
takes 104 bytes, the HTTP-NG request message takes 12 bytes (including
the 4 bytes which implement record-marking, multiple virtual
connections, and bi-directionality). Feel free to check them both
against the specs to validate them.

Thus I hope this shows how little this savings helps on even
a medium speed network. On a high speed network the time
is lost if even a few more function calls are required to
create the compressed header.

Yep. The bandwidth savings really only help for things like wireless
connections, like Palm VII or cell phone use.

Michi Henning

Jun 4, 1999

On 3 Jun 1999, Douglas C. Schmidt wrote:

> Right, plus the fixed byte order approach, such as the canonical
> ordering used in Sun RPC's XDR, penalizes one type of "endianness,"
> which may be a "bad thing" depending on what types of machines you run
> on. Naturally, the "right thing"[tm] is for middleware to allow
> applications to configure the "endianness" in such a way that it can
> be disabled entirely for homogeneous endsystems. TAO supports this
> optimization, BTW, and it has a small, but measurable impact on
> performance.

Hmmm... I just thought about all this some more. OK, granted, the answer
appears to be that the faster the network, the more important it is
to use the receiver-makes-it-right approach because of the lower marshaling
cost if both client and server have the same byte order. By the same
argument, having a network representation that matches the CPU alignment
and padding requirements is better for high-speed networks than a
compact representation.

Now, looking a bit further than just raw benchmark figures...

Brian and Rob argue in the "Practice of Programming" (page 206-207) that
a fixed byte order (a'la XDR) is preferable to receiver-makes-it-right
because of reduced complexity. Doug's benchmark results fly in the face of
that. So, who is right? For low-bandwidth networks, I think we all agree that
receiver-makes-it-right isn't very attractive, because it adds complexity
with negligible performance improvement.

For high-speed networks, receiver-makes-it-right and padded data wins out for
large messages. Now, where do these large messages go? For
receiver-makes-it-right to win, the server must be CPU bound. That's because,
if the server is I/O bound (such as a file server), the gains of receiver-
makes-it-right don't matter. (They'll be swamped by the I/O cost at the
server end.)

So, let's assume that the server is CPU bound then. For
receiver-makes-it-right, all these bytes have to go somewhere and do something,
otherwise we wouldn't bother to send them in the first place. For the server
to be CPU bound then, it would have to perform computations on those bytes,
such as matrix inversions. Closer to everyday computing, it could be a
graphics server for remote terminals. Either way, it seems to me that in
this case, the overall cost will be dominated by the computations performed
on the data in the server, not by the cost of marshaling those bytes (even
on gigabit networks).

So, again, is there any point in receiver-makes-it-right at all? From the
thoughts I just outlined, the answer would appear to be no. For slow
networks, it doesn't matter, for fast networks, it doesn't matter for short
messages. For fast networks and long messages, I can't see how the cost
of marshaling could ever be significant compared to the cost of actually
processing the data.

Overall, the conclusion would appear to be that:

- receiver-makes-it-right is never worth it

- marshaling in native CPU alignment and padding is not worth it,
except in the extreme case of a slow CPU on a very fast network.
But in that case, processing the data (as opposed to sending and
receiving it) will also be slow, so savings during marshaling
are irrelevant.

Overall, this seems to favor the HTTP-NG approach: use a fixed byte order
for simplicity, and pack the data as much as possible to save bandwidth.
For low-speed networks that must carry large messages, you get savings
that are almost linear with the reduction in the size of the on-the-wire
representation. For high-speed networks, it doesn't matter, regardless
of whether the CPU is slow or fast. However, given the fixed bandwidth
of the network (whatever its speed), a compact representation always
wins because it linearly increases the overall throughput of the network.

One exception I can see is that of large streaming data, such as video.
In that case, the server may actually be CPU bound during marshaling.
For example, you could imagine an architecture where most of a movie
is cached in memory, and the server feeds it to a number of consumers
over the network. (The consumers may each be looking at different parts of the
movie at the same time.) In that case, a less compact representation wins
if we assume that the network bandwidth is very large so that the server
is completely CPU bound. (Compressing the data for transmission
in this case would make matters worse instead of better, because it would
increase CPU time.) However, the video will be stored and transmitted
in compressed form anyway and will be decoded in hardware, so things like
byte order and padding are completely moot. In other words, the byte order
and padding arguments do not apply at all.

Besides, this last example is quite a long way from what applies to CORBA
or RPC in general. Now we are talking about a very special-purpose
application, where byte order and padding are completely irrelevant anyway.

All of this would appear to support the HTTP-NG approach. A more compact
representation is better for RPC, no matter what.

So, any holes in that reasoning?

Steve Vinoski

Jun 4, 1999

Douglas C. Schmidt wrote:

> Naturally, the right approach here is to also standardize a pluggable
> protocol framework into CORBA, which then makes the whole HTTP-ng
> vs. GIOP debate completely moot. Check out
>
> http://www.cs.wustl.edu/~schmidt/pluggable_protocols.ps.gz
>
> for more details.

I've been pushing pluggable protocols in CORBA for at least six years now.
However, in that time I've worked on at least three completely different
ORBs that support pluggable protocols, each with radically different
designs and implementations for doing so. I must say that because of this
experience I'm convinced that such standardization would be a Very Bad
Idea. One issue is that OMG specs are intended to tell me *what* to
implement, not *how* to implement. The main problem is that all of the
published pluggable protocol papers I've seen make huge assumptions about
internal ORB architectures, and in all cases none of those assumptions
held for any of my ORBs. Frankly, I don't want someone else -- even you,
Doug -- telling me how to architect and design my ORB. That's simply not a
space that OMG specifications should be in.

--steve

Polar Humenn

Jun 4, 1999

to Bill Janssen

Bill Janssen wrote:
>
> In article <7j62me$n...@tango.cs.wustl.edu> sch...@tango.cs.wustl.edu (Douglas C. Schmidt) writes:
>

> Naturally, the right approach here is to also standardize a pluggable

> protocol framework into CORBA.
>
> Yes, I agree. We've had pluggable protocols in ILU from the start
> (1991) and it's been a big help. I'd be interested in seeing what an
> implementation of HTTP-NG would be like in the TAO pluggable protocols
> framework. Any TAO hackers out there who'd like to try it?

I am a big fan of pluggable protocols, since the OMG designed SECIOP, but
didn't design a way to get it into an ORB. This forces Security services
to be ORB specific, alienating third party service vendors. However,
ORBacus gave pluggable protocols to us, allowing us to create ORBAsec SL2,
an ORB with authentication and security.

However, according to the political reasoning explained to me by
some OMG architecture board members, pluggable protocols will not
fly as a standard because of a branding and marketing issue.
Basically, the OMG markets CORBA as "IIOP".
That is the definitive standard, and that's why
it's so successful, they said.

However, if they continue that way, i.e. insert head in sand,
I fear for the future of CORBA, since
faster, more robust processing and communication is *always* needed
no matter how fast the machines are.

So the problem is to get the pluggable protocol thing past the OMG.

-Polar

-------------------------------------------------------------------
Polar Humenn Adiron, LLC
President 2-212 Center for Science & Technology
mailto:po...@adiron.com CASE Center/Syracuse University
Phone: 315-443-3171 Syracuse, NY 13244-4100
Fax: 315-443-4745 http://www.adiron.com

Bill Janssen

Jun 7, 1999, 3:00:00 AM

In article <3757CB9E...@iona.com> Steve Vinoski <vin...@iona.com> writes:

The main problem is that all of the
published pluggable protocol papers I've seen make huge assumptions about
internal ORB architectures, and in all cases none of those assumptions
held for any of my ORBs. Frankly, I don't want someone else -- even you,
Doug -- telling me how to architect and design my ORB. That's simply not a
space that OMG specifications should be in.

I have to agree with Steve on this one. I can't see a useful OMG standard
for pluggable protocols ever doing anything but cause trouble.

Bill Janssen

Jun 7, 1999, 3:00:00 AM

In article <7j7h1t$9...@tango.cs.wustl.edu> sch...@tango.cs.wustl.edu (Douglas C. Schmidt) writes:

As someone who's spent most of my professional career benchmarking
networking protocols and middleware I'm always cautious about claiming
to have measured the performance of a *spec* (i.e., "IIOP 1.0")
vs. the performance of an *implementation* of a spec (i.e., "the ILU
version XYZ implementation").

Absolutely, Doug. I was summarizing to the newsgroup. It is indeed
the ILU 2.0alpha14 implementation of both IIOP 1.0 and HTTP-NG 1.0,
all done with the ILU C bindings, using the testing framework to be
found in ILUSRC/examples/ngtest/. We're working on a paper to
document all of this a tad more formally than in PowerPoint or even
newsgroup postings :-). The test I mentioned involves fetching
a ~20K HTML file with links to 42 embedded images, all of which
have to be fetched in turn.

As I'm sure you know, Bill, when it comes to performance the devil is
always in the details, and there's not much to be gained if you don't
hold "implementation details" constant, i.e., comparing a poor
implementation of any protocol against a good implementation of a
different protocol can be used to prove just about anything (which is
why everyone always beat up on Sun RPC in the 80's ;-)).

Yep.

In particular, I find it hard to believe that transmitting a Web page
as an octet sequence over a 100 Mbps Ethernet using NG could possibly
be 3 times faster than a well-tuned CORBA IIOP implementation simply
because the IIOP performance of a well-tuned ORB should be just
about as fast as TCP over sockets. For instance, TAO easily gets 120
Mbps throughput for octet sequences over 155 Mbps ATM, so unless
you're computing throughput using a different metric I can't believe
that you'll get 3 times faster performance.

Web page fetching might be a bit more complicated than you are
thinking it is. The Microscape page (see
ILUSRC/examples/ngtest/url-test-material/serverdocs/microscape/) is about
20K of HTML, with forty-odd embedded images. The extra overhead with
IIOP seems to be coming mainly from the absence of chunking in IIOP
1.0 and the presence of batching and pipelining in HTTP-NG; I'd expect
IIOP 1.1 to be much closer to NG performance, as I mentioned.
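
A rough way to see why batching and pipelining matter for this kind of
workload is to count round trips. The sketch below is only a
back-of-envelope model, not the ILU benchmark: the round-trip latency, the
average image size, and the function names are all assumptions chosen for
illustration, and it ignores marshalling cost, chunking, and TCP effects.

```python
def serial_fetch_time(n_objects, total_bytes, rtt_s, bandwidth_bps):
    """Each object is requested only after the previous reply has arrived,
    so one full round trip is paid per object."""
    return n_objects * rtt_s + total_bytes * 8 / bandwidth_bps


def pipelined_fetch_time(n_objects, total_bytes, rtt_s, bandwidth_bps):
    """The page is fetched first, then all embedded objects are requested
    back to back, so at most two round trips are paid in total."""
    round_trips = min(n_objects, 2)
    return round_trips * rtt_s + total_bytes * 8 / bandwidth_bps


# Hypothetical workload: one ~20K HTML page plus 42 embedded images.
N_OBJECTS = 1 + 42
TOTAL_BYTES = 20_000 + 42 * 3_000   # assumed 3 KB average image size
RTT_S = 0.001                       # assumed 1 ms round trip on a LAN
BANDWIDTH_BPS = 100_000_000         # 100 Mbps Ethernet

serial = serial_fetch_time(N_OBJECTS, TOTAL_BYTES, RTT_S, BANDWIDTH_BPS)
pipelined = pipelined_fetch_time(N_OBJECTS, TOTAL_BYTES, RTT_S, BANDWIDTH_BPS)
```

In this model the two cases move exactly the same number of bytes; the
whole difference is the 41 round trips that pipelining avoids, which is why
raw octet-sequence throughput says little about a many-small-objects test.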

Elliot Lee

Jun 8, 1999, 3:00:00 AM

On 07 Jun 1999 20:22:35 -0700, Bill Janssen <jan...@parc.xerox.com> wrote:
>In article <3757CB9E...@iona.com> Steve Vinoski <vin...@iona.com> writes:
>
> The main problem is that all of the
> published pluggable protocol papers I've seen make huge assumptions about
> internal ORB architectures, and in all cases none of those assumptions
> held for any of my ORBs. Frankly, I don't want someone else -- even you,
> Doug -- telling me how to architect and design my ORB. That's simply not a
> space that OMG specifications should be in.
>
>I have to agree with Steve on this one. I can't see a useful OMG standard
>for pluggable protocols ever doing anything but cause trouble.

While I agree, having the OMG provide some clarifications in the following
areas might be conducive to multi-protocol implementations:

. Instructions on how to get new protocols, and existing protocols over
new transports, added as widely-adopted extensions to the IOR spec.
(There's an e-mail address in the 2.2 spec to e-mail for the assignment
of a new ORB-specific profile number, but I got no response when I
tried, and I think it would be nicer if usage of the UNIX sockets
and IPv6 transports was standardized anyways.)
. An API to allow the app developer to select a preferred
protocol & transport for the ORB as a whole, and for specific
object references.

The ORB could then implement protocols and transports (pluggable or not) as
best fit its design.
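
To make the second point concrete, here is a minimal sketch of the kind of
selection logic such an API might drive. Everything in it is hypothetical:
the function name, the string tags, and the rule that a per-reference
preference overrides the ORB-wide one are assumptions for illustration,
not anything taken from an OMG specification or an existing ORB.

```python
def select_profile(ior_profiles, orb_preference, ref_preference=None):
    """Return the first tag from the applicable preference list that the
    object reference actually offers, or None if nothing matches.

    A per-reference preference, when given, replaces the ORB-wide one.
    """
    preference = ref_preference if ref_preference is not None else orb_preference
    offered = set(ior_profiles)
    for tag in preference:
        if tag in offered:
            return tag
    return None


# Hypothetical tags for the profiles carried in one object reference.
ior = ["IIOP", "UNIX_SOCKET", "HTTP_NG"]
orb_default = ["IIOP"]

chosen_default = select_profile(ior, orb_default)
chosen_local = select_profile(ior, orb_default,
                              ref_preference=["UNIX_SOCKET", "IIOP"])
```

The application only states an ordered preference; how each protocol and
transport is actually implemented stays entirely inside the ORB.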

-- Elliot
"how smy riting? male dme...@fqptmf.com"

Jonathan Biggar

Jun 8, 1999, 3:00:00 AM

Elliot Lee wrote:
> . Instructions on how to get new protocols, and existing protocols over
> new transports, added as widely-adopted extensions to the IOR spec.
> (There's an e-mail address in the 2.2 spec to e-mail for the assignment
> of a new ORB-specific profile number, but I got no response when I
> tried, and I think it would be nicer if usage of the UNIX sockets
> and IPv6 transports was standardized anyways.)

The OMG has just published a list of assignments for profiles, minor
error codes, etc. as ptc/99-05-02. I don't know how long ago you tried
to get the tags, but they are supposed to be more responsive now.
Requests go to tag-r...@omg.org.

I would also like to see more standardized transports, but someone has
to do the work and push it through the OMG process.

> . An API to allow the app developer to select a preferred
> protocol & transport for the ORB as a whole, and for specific
> object references.

This is part of the new messaging specification. Part of the QoS
(Quality of Service) policies that you can set on the client side allow
you to specify which protocols you want to use at both the ORB and
object reference levels.

--
Jon Biggar
Floorboard Software
j...@floorboard.com
j...@biggar.org

Christopher Browne

Jun 9, 1999, 3:00:00 AM

On Tue, 08 Jun 1999 09:15:17 -0700, Jonathan Biggar <j...@floorboard.com> wrote:
>Elliot Lee wrote:
>> . An API to allow the app developer to select a preferred
>> protocol & transport for the ORB as a whole, and for specific
>> object references.
>
>This is part of the new messaging specification. Part of the QoS
>(Quality of Service) policies that you can set on the client side allow
>you to specify which protocols you want to use at both the ORB and
>object reference levels.

This looks to me like a Service That Wants To Become Ubiquitous,
*almost* as much so as the Naming and Trading Services, and fairly
comparable with the Event Service.
--
Trivialize a user's bug report by pointing out that it was fixed
independently long ago in a system that hasn't been released yet.
-- from the Symbolics Guidelines for Sending Mail
cbbr...@ntlug.org- <http://www.ntlug.org/~cbbrowne/corba.html>