Question: Big/Little Endian, sockets, ntohs, ntohl, htons, htonl..

Asdf Jackal

Jul 13, 2003, 10:41:17 PM
Being that the byte-ordering of sockets is natively Big-Endian, and that
most PC's out there are Little-Endian due to Windows/Intel, I was curious
why aren't the socket libraries updated to natively match that of the
majority? My logic here is that the translation functions (ntohs, ntohl,
htons, htonl) are going to exist anyway, so the internal structure shouldn't
matter (provided developers are actually using the functions).

My theories why things haven't changed:
* Sockets originate from BSD, so the Endian is a carry-over from Unix
* Making such a change may cause a flood of apps to break (those that are
working without the translation functions)
* Standards shouldn't be changed, unless there's a damn good reason
* A discussion about Big vs Little Endian is akin to religious/political
debate?

But honestly, I don't really know...
Your thoughts?

Please go easy on me, this is my first post to this newsgroup :)

Thx,
jc


pmc...@vbjzww.com.hd

Jul 13, 2003, 11:40:29 PM
|Being that the byte-ordering of sockets is natively Big-Endian, and that
|most PC's out there are Little-Endian due to Windows/Intel, I was curious
|why aren't the socket libraries updated to natively match that of the
|majority? My logic here is that the translation functions (ntohs, ntohl,
|htons, htonl)
|are going to exist anyway, so its internal structure shouldn't matter anyway
|(provided developers are actually using the functions)

It's more than just updating the libraries. If you send things out in
anything but network order, you can't communicate with big endian
machines. If you produce native variants of the functions, then your
code is not portable.

|My theories why things haven't changed:
|* Sockets originate from BSD, so the Endian is a carry-over from Unix
|* Making such a change may cause a flood of apps to break (those that are
|working without the translation functions)
|* Standards shouldn't be changed, unless there's a damn good reason
|* A discussion about Big vs Little Endian is akin to religious/political
|debate?

Because it ain't broke so don't "fix" it?
--

Steve Horsley

Jul 14, 2003, 3:49:24 AM

It is not the socket whose byte order is big-endian, it is the network's
byte order - the order in which bytes are sent and received over the wire.
This cannot be changed without breaking every existing application.

And you cannot (for instance) modify socket code to re-order bytes during
read() or write() calls because it doesn't know whether you are thinking
in terms of bytes, short, int or what. The socket simply reads and writes
sequences of bytes in the order that they are given to it. In fact I would
argue that a socket has no endian-ness because it does not deal in any
structure bigger than a byte (or byte[]).

It is the application that projects higher-level meanings onto byte
sequences. It is purely the application's responsibility to convert
between the network byte order of these high-level structures and its own
internal memory representation of them.
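
To make that concrete, here is a minimal C sketch of an application doing the conversion itself for a 32-bit payload field. The helper names put_u32_be and get_u32_be are made up for illustration; they are not part of any sockets API. The socket would only ever see the four bytes in buf.

```c
#include <stdint.h>

/* Store a 32-bit value into buf in network (big-endian) byte order.
 * The shifts work on the value, not on memory, so the result is the
 * same on little-endian and big-endian hosts. */
static void put_u32_be(unsigned char *buf, uint32_t v)
{
    buf[0] = (unsigned char)(v >> 24);
    buf[1] = (unsigned char)(v >> 16);
    buf[2] = (unsigned char)(v >> 8);
    buf[3] = (unsigned char)v;
}

/* Rebuild the host value from four bytes received in network order. */
static uint32_t get_u32_be(const unsigned char *buf)
{
    return ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
           ((uint32_t)buf[2] << 8)  |  (uint32_t)buf[3];
}
```

A sender would call put_u32_be() and hand the buffer to write() or send(); the receiver would call get_u32_be() on what read() or recv() returned.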

Steve

Arkady Frenkel

Jul 14, 2003, 5:26:59 AM
You forgot about servers where MS have only 40% of market IIRC.
Arkady

"Asdf Jackal" <asd...@kypsoft.com> wrote in message
news:bLoQa.1780$Je.1318@fed1read04...

Fritz M

Jul 14, 2003, 11:27:17 AM
"Arkady Frenkel" <ark...@hotmail.com> wrote:

> You forgot about servers where MS have only 40% of market IIRC.

That's immaterial; network byte order is still big endian even if MS
controls 90% of the servers on the Net.

RFM
--
To reply, translate domain from l33+ 2p33|< to alpha.
4=a 0=o 3=e +=t

Fernando Gont

Jul 14, 2003, 2:37:47 PM
On Sun, 13 Jul 2003 19:41:17 -0700, "Asdf Jackal"
<asd...@kypsoft.com> wrote:

>Being that the byte-ordering of sockets is natively Big-Endian, and that
>most PC's out there are Little-Endian due to Windows/Intel,

PC's are little-endian due to Intel, *not* Windows.


>I was curious
>why aren't the socket libraries updated to natively match that of the
>majority?

It's the *protocol* *headers* that care about the endianness. What do
you mean by "why aren't the socket libraries updated to natively match
that of the majority?"?

Besides that, what would you do if, some time later, some big-endian
architecture takes over the market? Would you change the protocol
specification again?


>My theories why things haven't changed:
>* Sockets originate from BSD, so the Endian is a carry-over from Unix

It's usually the hardware that dictates the byte-order in use, *not*
the OS. Intel microprocessors are little-endian (most of them), while
Motorola's are big-endian.

However, it's not the API that dictates the byte-order, but the
protocol specification.


>* Making such a change may cause a flood of apps to break (those that are
>working without the translation functions)

The protocol itself would break.

--
Fernando Gont
e-mail: fern...@ANTISPAM.gont.com.ar

[To send a personal reply, please remove the ANTISPAM tag]

Barry Margolin

Jul 14, 2003, 2:39:40 PM
In article <3f12d1a1...@News.CIS.DFN.DE>,

Fernando Gont <fg...@softhome.net> wrote:
>However, it's not the API that dictates the byte-order, but the
>protocol specification.

Well, there are a few places in the sockets API where it requires data in
network byte order, like the port field in the sockaddr_in structure.
Maybe that's what the OP is wondering about.
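
For readers who have not filled one in, a rough sketch of the sockaddr_in case follows. The function name make_loopback_addr is hypothetical; htons(), htonl(), struct sockaddr_in, AF_INET and INADDR_LOOPBACK are the standard ones. Note that the family field stays in host order while the port and address fields do not.

```c
#include <string.h>
#include <stdint.h>
#include <sys/socket.h>   /* AF_INET */
#include <netinet/in.h>   /* struct sockaddr_in, INADDR_LOOPBACK */
#include <arpa/inet.h>    /* htons, htonl */

/* Fill in an IPv4 address for 127.0.0.1 and the given port.
 * The port argument is an ordinary host-order number. */
static void make_loopback_addr(struct sockaddr_in *sa, uint16_t port)
{
    memset(sa, 0, sizeof *sa);
    sa->sin_family = AF_INET;                     /* host order */
    sa->sin_port = htons(port);                   /* network order */
    sa->sin_addr.s_addr = htonl(INADDR_LOOPBACK); /* network order */
}
```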

--
Barry Margolin, barry.m...@level3.com
Level(3), Woburn, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Please DON'T copy followups to me -- I'll assume it wasn't posted to the group.

p...@icke-reklam.ipsec.nu

Jul 14, 2003, 8:23:24 PM
In comp.protocols.tcp-ip Fritz M <ne...@m4s0n3r.n3+> wrote:
> "Arkady Frenkel" <ark...@hotmail.com> wrote:

>> You forgot about servers where MS have only 40% of market IIRC.

> That's immaterial; network byte order is still big endian even if MS
> controls 90% of the servers on the Net.

If MS controlled 90% of server market that would make sense. But according
to netcraft ( http://news.netcraft.com/archives/2003/07/02/july_2003_web_server_survey.html)
they control 25% of web servers; for SMTP their share is probably smaller still.

> RFM
> --
> To reply, translate domain from l33+ 2p33|< to alpha.
> 4=a 0=o 3=e +=t

--
Peter Håkanson
IPSec Sverige ( At Gothenburg Riverside )
Sorry about my e-mail address, but i'm trying to keep spam out,
remove "icke-reklam" if you feel for mailing me. Thanx.

Le Chaud Lapin

Jul 14, 2003, 10:07:18 PM
"Asdf Jackal" <asd...@kypsoft.com> wrote in message news:<bLoQa.1780$Je.1318@fed1read04>...

My thoughts? Same as your thoughts.

By the way, there is a quasi-optimal solution to this problem.
Someone noted it a couple of weeks ago in this same group. Rather than
blab it out, I will wait for others to say what it is if they know it.

-Chaud Lapin-

Fernando Gont

Jul 15, 2003, 4:40:06 AM
On Mon, 14 Jul 2003 18:39:40 GMT, Barry Margolin
<barry.m...@level3.com> wrote:

>>However, it's not the API that dictates the byte-order, but the
>>protocol specification.
>Well, there are a few places in the sockets API where it requires data in
>network byte order, like the port field in the sockaddr_in structure.

There are a few places in the Sockets API that require data in network
byte order just because a decision was taken in the past to make the
programmer cope with the byte order, instead of having the kernel
making the conversions when/where appropriate.


>Maybe that's what the OP is wondering about.

I don't see any reason for making the changes the OP proposed to the
Sockets API .

I suspect the OP would change the Sockets API, so that those values
that are stored in network byte order (big-endian), would be stored in
little-endian byte order, instead.

However, even if that change were made, the byte ordering functions
would still exist, for portability reasons (for example, you'd still
call a htons()-like function to convert port numbers from host byte
order to little-endian, as on big-endian systems the two bytes should
have to be swapped, in order to convert the 16 bit value from
big-endian to little-endian byte order).


On little-endian systems, you'd have the *kernel* converting those
values you stored in little-endian byte order, to big-endian (network
byte order), in order to store them in the protocol headers, while
*still* having to call the byte ordering functions, for portability
reasons (as explained above).

On big-endian systems, you'd be converting those values from
big-endian to little-endian, and then backwards... ie., twice!
The first conversion would be made by the byte ordering functions, in
order to store the values in little-endian byte order. Then the kernel
would have to take these values, and convert them to big-endian byte
order, as that's the byte order the protocol specification requires.

IMO, regarding the byte ordering functions, the only change that
might help the programmer would be to let the programmer store values
in host byte order, and have the kernel make the conversions to
network byte order when/where appropriate. If this change were made, I
think the byte ordering functions would not be needed.

Arkady Frenkel

Jul 15, 2003, 11:16:58 AM
Exactly. BTW, that's why in Windows CE the only subsystem which works with
bytes and not Unicode wbytes is Winsock.
Arkady

<p...@icke-reklam.ipsec.nu> wrote in message
news:bevhhs$212a$2...@nyheter.ipsec.se...

Fernando Gont

Jul 15, 2003, 2:31:41 PM
On 14 Jul 2003 19:07:18 -0700, unorigina...@yahoo.com (Le Chaud
Lapin) wrote:

>> Please go easy on me, this is my first post to this newsgroup :)
>My thoughts? Same as your thoughts.
>By the way, there is a quasi-optimal solution to this problem.

What problem?


>Someone noted it a couple of weeks ago in this same group. Rather than
>blab it out, I will wait for others to say what it is if they know it.

What was the "Subject" of the thread?

Steve Watt

Jul 15, 2003, 3:30:29 PM
In article <3f1354f9...@News.CIS.DFN.DE>,

Fernando Gont <fg...@softhome.net> wrote:
>On Mon, 14 Jul 2003 18:39:40 GMT, Barry Margolin
><barry.m...@level3.com> wrote:
>
>>>However, it's not the API that dictates the byte-order, but the
>>>protocol specification.
>>Well, there are a few places in the sockets API where it requires data in
>>network byte order, like the port field in the sockaddr_in structure.
>
>There are a few places in the Sockets API that require data in network
>byte order just because a decision was taken in the past to make the
>programmer cope with the byte order, instead of having the kernel
>making the conversions when/where appropriate.
>
>
>>Maybe that's what the OP is wondering about.
>
>I don't see any reason for making the changes the OP proposed to the
>Sockets API .
>
>I suspect the OP would change the Sockets API, so that those values
>that are stored in network byte order (big-endian), would be stored in
>little-endian byte order, instead.

Actually, I've thought about the same thing more than once, and it's
not little-endian that's desired, but rather host order. The sockets
API can (and should, in a number of people's opinion) flip them as
needed -- it's just a wart in the API. Those happen, usually due
to historical accidents.

>However, even if that change were made, the byte ordering functions
>would still exist, for portability reasons (for example, you'd still
>call a htons()-like function to convert port numbers from host byte
>order to little-endian, as on big-endian systems the two bytes should
>have to be swapped, in order to convert the 16 bit value from
>big-endian to little-endian byte order).

You'd still need those functions, but only for payload -- all port
numbers could be presented to the application in host byte order. The
kernel would change port numbers (and IP addresses, but with a weaker
argument) to/from network byte order as appropriate.

>On little-endian systems, you'd have the *kernel* converting those
>values you stored in little-endian byte order, to big-endian (network
>byte order), in order to store them in the protocol headers, while
>*still* having to call the byte ordering functions, for portability
>reasons (as explained above).
>
>On big-endian systems, you'd be converting those values from
>big-endian to little-endian, and then backwards... ie., twice!

I think you're being silly here -- nobody would propose passing
port numbers in little-endian order. The only sensible choices
are host byte order or network byte order. Forcing the application
programmer to remember htons() for port numbers is mostly an accident
waiting to happen. OK, it has happened: I think every beginning
socket programmer has forgotten at least one htons call.
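
A small sketch of what that forgotten call costs on a little-endian host: the kernel reads the two stored bytes as if they were already in network order, so the port it sees is the byte-swapped one. The helper name swap16 is invented here for illustration.

```c
#include <stdint.h>

/* Swap the two bytes of a 16-bit value. On a little-endian host this is
 * the port the kernel effectively sees when htons() was left out. */
static uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}
```

For example, 8080 is 0x1F90, and swap16(8080) is 0x901F, i.e. 36895, so the server ends up listening somewhere nobody expects. On a big-endian host the omission goes unnoticed, which is part of why the bug survives testing.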

>IMO, regarding the byte ordering functions, the only change that
>might help the programmer would be to let the programmer store values
>in host byte order, and have the kernel make the conversions to
>network byte order when/where appropriate. If this change were made, I
>think the byte ordering functions would not be needed.

They would still be needed, but, as I said above, only for payload,
not for API setup functionality.
--
Steve Watt KD6GGD PP-ASEL-IA ICBM: 121W 56' 57.8" / 37N 20' 14.9"
Internet: steve @ Watt.COM Whois: SW32
Free time? There's no such thing. It just comes in varying prices...

Glen Herrmannsfeldt

Jul 15, 2003, 11:45:27 PM

"Fernando Gont" <fg...@softhome.net> wrote in message
news:3f12d1a1...@News.CIS.DFN.DE...

> PC's are little-endian due to Intel, *not* Windows.

Well, windows on Alpha was little endian, where the alpha is bi-endian, and
big endian for most other OS.

-- glen


Fernando Gont

Jul 16, 2003, 10:54:39 AM
On Tue, 15 Jul 2003 19:30:29 GMT, st...@nospam.Watt.COM (Steve Watt)
wrote:

>>I don't see any reason for making the changes the OP proposed to the
>>Sockets API .
>>I suspect the OP would change the Sockets API, so that those values
>>that are stored in network byte order (big-endian), would be stored in
>>little-endian byte order, instead.
>Actually, I've thought about the same thing more than once, and it's
>not little-endian that's desired, but rather host order.

Of course. But that's not what the OP said (read my reply to his
post). He wanted to store values (port numbers, IP addresses, etc.) in
little-endian byte order, which not only would make the programmer
still cope with the byte order, but would make things "worse" (as
explained in one of my posts).


>>On big-endian systems, you'd be converting those values from
>>big-endian to little-endian, and then backwards... ie., twice!
>I think you're being silly here -- nobody would propose passing
>port numbers in little-endian order.

The Sockets API needs port numbers in network byte order (ie,
big-endian). The OP "proposed" to change this, so that port numbers
would be stored in *little-endian* byte order (*not* in *host* byte
order). Then I explained what would happen if that change (the one the
OP proposed) was made.


>The only sensible choices
>are host byte order or network byte order.

That's what I said. At the end of my post (the one you replied to), I
said:

"IMO, regarding the byte ordering functions, the only change that
might help the programmer would be to let the programmer store values
in host byte order, and have the kernel make the conversions to
network byte order when/where appropriate. If this change were made, I
think the byte ordering functions would not be needed."

--

pmup...@ddjfsv.com.ed

Jul 16, 2003, 11:13:24 AM
|That's what I said. At the end of my post (the one you replied to), I
|said:
|
|"IMO, regarding the byte ordering functions, the only change that
|might help the programmer would be to let the programmer store values
|in host byte order, and have the kernel make the conversions to
|network byte order when/where appropriate. If this change were made, I
|think the byte ordering functions would not be needed."

That's no good either. The kernel can't know all the places in network
packets that contain byte order sensitive data, and there's a good
argument that it shouldn't. Imagine if you had to update your kernel
every time a new IP protocol is to be supported. And how would the kernel
know, for example, that you decided to run TFTP on port 10069 instead of
69? Another system call? Get real. That's why byte order handling
belongs in user space.
--

Le Chaud Lapin

Jul 16, 2003, 12:11:18 PM
fg...@softhome.net (Fernando Gont) wrote in message news:<3f1444e...@News.CIS.DFN.DE>...

> On 14 Jul 2003 19:07:18 -0700, unorigina...@yahoo.com (Le Chaud
> Lapin) wrote:
>
> >> Please go easy on me, this is my first post to this newsgroup :)
> >My thoughts? Same as your thoughts.
> >By the way, there is a quasi-optimal solution to this problem.
>
> What problem?

Problem: What to do about endianess in distributed communication.

> >Someone noted it a couple of weeks ago in this same group. Rather than
> >blab it out, I will wait for others to say what it is if they know it.
>
> What was the "Subject" of the thread?

I do not remember, but the solution is "receiver-makes-right":

Let the source of the data, application data that is, send the data in
its native format. If the target gets it, and the two machines are of
the same endianness (often the case), then the target has nothing to do.
If the endiannesses differ, then the target simply has to invert.
In both cases, this is an optimal solution.
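
A rough C sketch of receiver-makes-right, under one loud assumption: the sender prefixes each value with a one-byte tag saying which order it used. That tag, the ORDER_* constants and the rmr_* names are all invented for this illustration; the scheme as described above does not say how the order is signalled.

```c
#include <stdint.h>
#include <string.h>

enum { ORDER_LITTLE = 0, ORDER_BIG = 1 };  /* hypothetical tag values */

/* Report the byte order of the machine this code runs on. */
static int host_order(void)
{
    const uint16_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1 ? ORDER_LITTLE : ORDER_BIG;
}

/* Reverse the four bytes of a 32-bit value. */
static uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000FF00U) |
           ((v << 8) & 0x00FF0000U) | (v << 24);
}

/* Sender: one tag byte, then the value copied in native order. */
static void rmr_encode(unsigned char out[5], uint32_t v)
{
    out[0] = (unsigned char)host_order();
    memcpy(out + 1, &v, 4);
}

/* Receiver: swap only when the sender's order differs from ours. */
static uint32_t rmr_decode(const unsigned char in[5])
{
    uint32_t v;
    memcpy(&v, in + 1, 4);
    return in[0] == host_order() ? v : swap32(v);
}
```

Between two like-endian machines no swap ever happens; the price is the extra tag and a branch on every receive.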

Like many design choices in engineering, historical decisions have
more to do with human personality than objective reasoning. I think
that is what happened here.

If we were all robots, little-endian would make more sense, as the
byte order is entirely congruent with concept of increasing addresses,
etc. There is a certain mechanistic "purity" in the mode of thought -
robots would be entirely happy with little-endian because they would
not be subject to the lexical constraints of English or other
languages.

But we are not robots. We are human, and if you write data in hex
format using a left-to-right language (most common), the high-order
nibbles appear first, and because we have learned that things that
come first in a sequence are typically of lowest order, the high-order
nibbles should therefore go at the lowest-order address....you get the
idea....the humans are doing the tango with the machines and the
humans are winning.

This was the "historical" dilemma faced by the guardians of byte-order
in distributed communication. There were two options:

1. We can be robots and do it in more or less Draconian
way...little-endian...pretending that what English-people humans
prefer does not matter

2. We can be people and make it convenient for some who really really
like their paper. big-endian would work better in this case

They chose 2. The crypto people did the same thing with SHA-x, and
anyone who has implemented these cryptographic primitives knows how
annoying it is to work with constructs whose specification prescribes
big-endian order.

Finally, I argue that there is an unignorable correlation between
certain personalities and design choices. In matters of style,
certain individuals are predisposed to choosing one mode of thought
over the other, so much so that you can almost predict who likes what
on a given axis simply by observing what they like on other axes. I have no
supporting data for this observation, but I would bet that "balanced"
curly brace people are more likely to go with #1, whereas "skewed"
curly brace people are more likely to go with #2. Don't ask me why,
it's just a hunch.

-Chaud Lapin-

bri...@encompasserve.org

Jul 16, 2003, 1:01:57 PM
In article <fc2e0ade.03071...@posting.google.com>, unorigina...@yahoo.com (Le Chaud Lapin) writes:
> fg...@softhome.net (Fernando Gont) wrote in message news:<3f1444e...@News.CIS.DFN.DE>...
> But we are not robots. We are human, and if you write data in hex
> format using a left-to-right language (most common), the high-order
> nibbles appear first,

Who told you that the high order digit is on the left in hex format?

That's a _convention_. It is not an intrinsic feature of the
universe.

Yes, it is true that if you write hex with the high order digits
on the left and you read from left to right, you end up reading
the high order digits first.

Like *duh*!

> and because we have learned that things that
> come first in a sequence are typically of lowest order, the high-order
> nibbles should therefore go at the lowest-order address....you get the
> idea....the humans are doing the tango with the machines and the
> humans are winning.

I've never observed that sequences have any such innate preference.

In English the low order terms come first: "big cat"
In Spanish the high order terms come first: "gato grande"

It's a matter of convention, not the war of the worlds.

For what it's worth, German in a minor way is the PDP-11 of endian-ness:

"Ein hundert drei und zwanzig" (One hundred, three and twenty = 123)

John Briggs
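
For anyone who has not met the PDP-11 convention alluded to above, here is a small C sketch of how one 32-bit value is laid out in memory under the three orderings. The layout_* names are made up for illustration. The PDP-11 layout shown is the commonly documented one: two 16-bit words, high word first, each word stored little-endian.

```c
#include <stdint.h>

/* Big-endian: most significant byte at the lowest address. */
static void layout_big(uint32_t v, unsigned char b[4])
{
    b[0] = (unsigned char)(v >> 24);
    b[1] = (unsigned char)(v >> 16);
    b[2] = (unsigned char)(v >> 8);
    b[3] = (unsigned char)v;
}

/* Little-endian: least significant byte at the lowest address. */
static void layout_little(uint32_t v, unsigned char b[4])
{
    b[0] = (unsigned char)v;
    b[1] = (unsigned char)(v >> 8);
    b[2] = (unsigned char)(v >> 16);
    b[3] = (unsigned char)(v >> 24);
}

/* PDP-11 style: high 16-bit word first, bytes swapped within each word. */
static void layout_pdp(uint32_t v, unsigned char b[4])
{
    b[0] = (unsigned char)(v >> 16);
    b[1] = (unsigned char)(v >> 24);
    b[2] = (unsigned char)v;
    b[3] = (unsigned char)(v >> 8);
}
```

For 0x0A0B0C0D the three functions give 0A 0B 0C 0D, 0D 0C 0B 0A and 0B 0A 0D 0C respectively.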

Fernando Gont

Jul 16, 2003, 1:31:58 PM
On Wed, 16 Jul 2003 15:13:24 GMT, pmup...@ddjfsv.com.ed wrote:

>|"IMO, regarding the byte ordering functions, the only change that
>|might help the programmer would be to let the programmer store values
>|in host byte order, and have the kernel make the conversions to
>|network byte order when/where appropriate. If this change were made, I
>|think the byte ordering functions would not be needed."
>That's no good either. The kernel can't know all the places in network
>packets that contain byte order sensitive data, and there's a good
>argument that it shouldn't.

I didn't mean "all the places in network packets that contain byte
order sensitive data". I just meant "any parameters passed to the
Sockets API".
Instead of having to pass values to the Sockets API in network byte
order, programmers could pass values in host byte order, and let the
kernel make the conversions where/when appropriate. I'm not talking
about the protocols themselves, I'm talking about just the API.


>Imagine if you had to update your kernel
>every time a new IP protocol is to be supported. And how would the kernel
>know, for example, that you decided to run TFTP on port 10069 instead of
>69? Another system call? Get real.
>That's why byte order handling belongs in user space.

There's no real reason for that (at least that I know of).

Fernando Gont

Jul 16, 2003, 2:23:41 PM
On 16 Jul 2003 09:11:18 -0700, unorigina...@yahoo.com (Le Chaud
Lapin) wrote:

>Problem: What to do about endianess in distributed communication.

Choose one, and stick to it.


>I do not remember, but the solution is "receiver-makes-right":
>Let the source of the data, application data that is, send the data in
>its native format. If the target gets it, and the two machines are of
>same endianess (often the case), then the target has nothing to do.
>If the endianess's are disparate, then target simply has to invert.
>In both cases, this is an optimal solution.

Optimal? How would you "signal" the byte order to the other end?
Include a "byte order" bit in some type of header bit for every block
of data you send?????


>Like many design choices in engineering, historical decisions have
>more to do with human personality than objective reasoning.

I don't agree with that. If so, we'd have psychologists designing
communication protocols and life would be harder than it is. :)


>I think that is what happened here.

I think we're talking about different things.
One of them is whether the programmer should cope with the byte order,
instead of having the kernel cope with it.
The other one is what byte order should be used in distributed
systems.

For the former, I think we should let the kernel cope with the byte
order, instead of dealing ourselves with it (as in the Sockets API).

For the latter, I'd choose one byte order, and stick to it.
Which one? It doesn't matter.


>But we are not robots. We are human, and if you write data in hex
>format using a left-to-right language (most common), the high-order
>nibbles appear first, and because we have learned that things that
>come first in a sequence are typically of lowest order, the high-order
>nibbles should therefore go at the lowest-order address....

Being a native Spanish speaker, I must say that "concept" does not
apply to Spanish.

gvar...@quytdy.com.wp

Jul 16, 2003, 2:26:34 PM
|>That's no good either. The kernel can't know all the places in network
|>packets that contain byte order sensitive data, and there's a good
|>argument that it shouldn't.
|
|I didn't mean "all the places in network packets that contain byte
|order sensitive data". I just meant "any parameters passed to the
|Sockets API".
|Instead of having to pass values to the Sockets API in network byte
|order, programmers could pass values in host byte order, and let the
|kernel make the conversions where/when appropriate. I'm not talking
|about the protocols themselves, I'm talking about just the API.

But if that's all you want, what's to stop you from defining some
wrappers to the Berkeley socket API, called my_bind, my_accept,
my_connect, etc. Why does the kernel have to do it? With shared
libraries, it would be just as space efficient as putting it in the
kernel.
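
Such a wrapper could be as small as the following sketch. my_bind is the hypothetical name suggested above, and my_fill_addr is a further made-up helper; this version only handles IPv4. Both arguments are taken in host byte order and converted in user space before the real bind() is called.

```c
#include <string.h>
#include <stdint.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: build a sockaddr_in from host-order values. */
static void my_fill_addr(struct sockaddr_in *sa,
                         uint32_t host_addr, uint16_t host_port)
{
    memset(sa, 0, sizeof *sa);
    sa->sin_family = AF_INET;
    sa->sin_addr.s_addr = htonl(host_addr);
    sa->sin_port = htons(host_port);
}

/* Hypothetical wrapper: bind() that accepts host-order address and port.
 * Returns whatever bind() returns (0 on success, -1 on error). */
static int my_bind(int fd, uint32_t host_addr, uint16_t host_port)
{
    struct sockaddr_in sa;
    my_fill_addr(&sa, host_addr, host_port);
    return bind(fd, (struct sockaddr *)&sa, sizeof sa);
}
```

A caller would then write my_bind(fd, INADDR_ANY, 8080) and never touch htons() for API setup.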

Bear in mind too that the Berkeley socket API handles not just IP
sockets but also other types of transport, e.g. x25.
--

Barry Margolin

Jul 16, 2003, 3:11:09 PM
In article <3f15935e...@News.CIS.DFN.DE>,

Fernando Gont <arie...@softhome.net> wrote:
>For the latter, I'd choose one byte order, and stick to it.
>Which one? It doesn't matter.

Ideally it should be the most common one, so that the [hn]to[nh][sl]
functions are no-ops more often.
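
To illustrate the no-op point, here is one way an htons()-like function could be written portably. This is a sketch under the name my_htons, not how any particular C library implements htons(). It writes the big-endian byte sequence explicitly and reinterprets it, so on a big-endian host the result equals the input, while on a little-endian host the bytes come out swapped.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for htons(): host order in, network order out. */
static uint16_t my_htons(uint16_t host)
{
    unsigned char be[2];
    uint16_t net;

    be[0] = (unsigned char)(host >> 8);    /* high byte goes first */
    be[1] = (unsigned char)(host & 0xFF);  /* low byte goes second */
    memcpy(&net, be, sizeof net);
    return net;
}
```

Real implementations typically decide at compile time and reduce to either nothing or a single byte-swap instruction.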

But since architecture preferences change over time, there's no guarantee
that this decision will remain optimal. When TCP/IP was being designed,
there was no way to know that the Wintel architecture would eventually
dominate the industry. At that time, I don't think either endianness was a
clear-cut winner, so the choice was relatively arbitrary.

Steve Horsley

Jul 16, 2003, 4:28:51 PM
On Wed, 16 Jul 2003 17:31:58 +0000, Fernando Gont wrote:

> On Wed, 16 Jul 2003 15:13:24 GMT, pmup...@ddjfsv.com.ed wrote:
>
> I didn't mean "all the places in network packets that contain byte
> order sensitive data". I just meant "any parameters passed to the
> Sockets API".
> Instead of having to pass values to the Sockets API in network byte
> order, programmers could pass values in host byte order, and let the
> kernel make the conversions where/when appropriate. I'm not talking
> about the protocols themselves, I'm talking about just the API.

Actually, I think this is very sensible, and I guess that the reason the
existing API doesn't do the byte swapping itself where appropriate is that
the socket programmers at the time had other things to worry about. In
fact the Java socket API _does_ do any byte reordering itself.

I think it is probably too late in the day to get the existing APIs
changed though.

Steve

lo...@fsoyyk.com.yk

Jul 16, 2003, 6:46:59 PM
|> I didn't mean "all the places in network packets that contain byte
|> order sensitive data". I just meant "any parameters passed to the
|> Sockets API".
|> Instead of having to pass values to the Sockets API in network byte
|> order, programmers could pass values in host byte order, and let the
|> kernel make the conversions where/when appropriate. I'm not talking
|> about the protocols themselves, I'm talking about just the API.
|
|Actually, I think this is very sensible, and I guess that the reason the
|existing API doesn't do the byte swapping itself where appropriate is that
|the socket programmers at the time had other things to worry about. In
|fact the Java socket API _does_ do any byte reordering itself.

Sure it's sensible. For example, Perl also presents a nicer interface to
sockets than "fill in the fields yourself". And you can also work at a
higher level than sockets. Many Java programmers use http objects
without having to deal with sockets.

|I think it is probably too late in the day to get the existing APIs
|changed though.

As long as you have your higher level APIs, there's no need to touch the
existing socket API. Sure I did once forget the htons for the port
number but I've made all sorts of other programming mistakes too, this
is just one.
--

Fernando Gont

Jul 17, 2003, 7:54:23 AM
On Wed, 16 Jul 2003 18:26:34 GMT, gvar...@quytdy.com.wp wrote:

>But if that's all you want, what's to stop you from defining some
>wrappers to the Berkeley socket API, called my_bind, my_accept,
>my_connect, etc. Why does the kernel have to do it? With shared
>libraries, it would be just as space efficient as putting it in the
>kernel.

It's not that I "want" it. It's just that there's no reason for making
programmers use byte ordering functions, when the kernel could do the
conversions itself.


>Bear in mind too that the Berkeley socket API handles not just IP
>sockets but also other types of transport, e.g. x25.

?

Fernando Gont

Jul 17, 2003, 7:54:25 AM
On Wed, 16 Jul 2003 19:11:09 GMT, Barry Margolin
<barry.m...@level3.com> wrote:

>>For the latter, I'd choose one byte order, and stick to it.
>>Which one? It doesn't matter.
>Ideally it should be the most common one, so that the [hn]to[nh][sl]
>functions are no-ops more often.

Ideally, the same byte order would be used in the protocol headers,
too, so that for those "most common" systems, no conversions would be
needed at all.

gpb...@mawvyu.com.xd

Jul 17, 2003, 10:00:20 AM
|>Bear in mind too that the Berkeley socket API handles not just IP
|>sockets but also other types of transport, e.g. x25.
|
|?

So you didn't realise that the Berkeley designers had more foresight
than you give them credit for? Why do you think there is a sockaddr type
and then a sockaddr_in type, a sockaddr_atmsvc type and now a
sockaddr_in6 type, and that you have to tell socket how large the
sockaddr_* structure is? Sockets were designed to work with various
transport protocols.
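
A short sketch of why the length has to be passed: the generic struct sockaddr only carries the family, and the concrete structure behind it differs in size per family. The function name addr_len is invented for this example, and it only knows the two IP families.

```c
#include <sys/socket.h>
#include <netinet/in.h>

/* Return the size of the concrete address structure behind a generic
 * struct sockaddr, or 0 for families this sketch does not handle. */
static socklen_t addr_len(const struct sockaddr *sa)
{
    switch (sa->sa_family) {
    case AF_INET:
        return (socklen_t)sizeof(struct sockaddr_in);
    case AF_INET6:
        return (socklen_t)sizeof(struct sockaddr_in6);
    default:
        return 0;
    }
}
```

Calls such as bind() and connect() take that length as an explicit argument, since the kernel is handed only a pointer to the generic structure.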
--

Barry Margolin

Jul 17, 2003, 11:11:36 AM
In article <UZxRa.884$OM3...@news-server.bigpond.net.au>,

None of this is at all related to the issue we're discussing. This glitch
in the API only occurs when filling in a sockaddr_in structure. It's very
counterintuitive that a structure used for communicating with the local
TCP/IP stack, not a remote system, would use network byte ordering instead
of host byte ordering.

What's even weirder is that some of these same folks at Berkeley also
implemented the original Unix Talk program, and in that case they *didn't*
use network byte ordering in the application data. The result was that Sun
users couldn't talk to Vax users, and replacement applications like ntalk
and ytalk had to be created.

rez...@cfaehe.com.qq

Jul 17, 2003, 11:33:55 AM
|>So you didn't realise that the Berkeley designers had more foresight
|>than you give them credit for? Why do you think there is a sockaddr type
|>and then a sockaddr_in type, a sockaddr_atmsvc type and now a
|>sockaddr_in6 type, and that you have to tell socket how large the
|>sockaddr_* structure is? Sockets were designed to work with various
|>transport protocols.
|
|None of this is at all related to the issue we're discussing. This glitch
|in the API only occurs when filling in a sockaddr_in structure. It's very
|counterintuitive that a structure used for communicating with the local
|TCP/IP stack, not a remote system, would use network byte ordering instead
|of host byte ordering.

Not an awful lot, but it shows that the details of the addressing were
not considered to be the kernel's concern. It is a thin interface.

I still don't see what the need is for the *kernel* to be involved. If
one thinks that a host byte ordering API is a good thing, one just needs
to write some wrappers to the socket calls and do the swap or not in
userspace. Only thing is you would have to call it something other than
socket, bind, connect etc. That will satisfy some people's need for
"intuitiveness". Personally I don't see it as a big win. When you learn
an API, there are all sorts of things you have to note. Ok, this
argument is a ushort int, so it has to be 0-65535, and 0-1023 require
privilege. This area has to be zeroed. This one's a string, null
terminated. And so on. Endianness is just another characteristic of the
datum one has to learn.

That's why I consider such a piddling improvement not worth the trouble;
a higher level API is more useful.
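
For what it's worth, such a wrapper might look roughly like this. It is only a sketch for IPv4; the names to_network_addr and connect_host_order are invented here, and error handling is left to the caller.

```c
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: build a sockaddr_in from host-order values. */
struct sockaddr_in to_network_addr(uint32_t ip_host, uint16_t port_host)
{
    struct sockaddr_in sin;
    memset(&sin, 0, sizeof sin);
    sin.sin_family = AF_INET;
    sin.sin_port = htons(port_host);       /* swap (or not) done here */
    sin.sin_addr.s_addr = htonl(ip_host);
    return sin;
}

/* Hypothetical wrapper around connect() taking host-order arguments. */
int connect_host_order(int fd, uint32_t ip_host, uint16_t port_host)
{
    struct sockaddr_in sin = to_network_addr(ip_host, port_host);
    return connect(fd, (const struct sockaddr *)&sin, sizeof sin);
}
```

The caller then writes connect_host_order(fd, 0x7F000001, 80) and never touches htons() directly.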
--

Le Chaud Lapin

Jul 17, 2003, 11:51:26 AM
bri...@encompasserve.org wrote in message news:<XccCRw...@eisner.encompasserve.org>...

> In article <fc2e0ade.03071...@posting.google.com>, unorigina...@yahoo.com (Le Chaud Lapin) writes:
> > fg...@softhome.net (Fernando Gont) wrote in message news:<3f1444e...@News.CIS.DFN.DE>...
> > But we are not robots. We are human, and if you write data in hex
> > format using a left-to-right language (most common), the high-order
> > nibbles appear first,
>
> Who told you that the high order digit is on the left in hex format?
>
> That's a _convention_. It is not an intrinsic feature of the
> universe.
> Yes, it is true that if you write hex with the high order digits
> on the left and you read from left to right, you end up reading
> the high order digits first.
>
> Like *duh*!

If you would read my message carefully, you would see that we are
saying the same thing.

> > and because we have learned that things that
> > come first in a sequence are typically of lowest order, the high-order
> > nibbles should therefore go at the lowest-order address....you get the
> > idea....the humans are doing the tango with the machines and the
> > humans are winning.
>
> I've never observed that sequences have any such innate preference.

For those who dominated the decision-making process, the language was
English.



> In English the low order terms come first: "big cat"
> In Spanish the high order terms come first: "gato grande"

Not always. In many languages, including Spanish, there are cases
when it is more appropriate to invert the inversion. Take my name,
for example, which means "hot bunny" in French. Both orders are
acceptable. The appropriate choice depends on the mood and intent of
the speaker:

le chaud lapin - The Hot Rabbit
un lapin chaud - A Hot Rabbit

The same thing applies to Spanish:

el toro loco - the crazy bull
el loco toro - the crazy bull

> It's a matter of convention, not the war of the worlds.

Yes, it's a matter of convention influenced by language.


>
> For what it's worth, German in a minor way is the PDP-11 of endian-ness:
>
> "Ein hundred drei und zwanzich" (One hundred, three and twenty = 123)

This is not entirely correct. Not only does German have both
"endians", they are actually used within the same expression. You can
see this from your example:

123 spoken in German yields the order 1-3-2.

If you had made the number 1234, then when spoken in German, the order
would have been 1-2-4-3. So you can see that for most of the number,
the order is as it is in English, but for 20-99, it inverts.

-Chaud Lapin-

Mark Crispin

Jul 17, 2003, 11:45:26 AM
On Thu, 17 Jul 2003, Barry Margolin wrote:
> It's very
> counterintuitive that a structure used for communicating with the local
> TCP/IP stack, not a remote system, would use network byte ordering instead
> of host byte ordering.

Indeed. The design of the sockets interface is a matter of complete
bewilderment for those of us who came from other operating system
environments who had a much cleaner interface. I've programmed with
sockets for 15 years, and still can't believe that anyone would consider
it reasonable.

On TOPS-20, there was no distinction between a "file" and a "socket".
You used the same calls for both. Everything that you encode in the
sockaddr_in structure in sockets was a component of a filename on TOPS-20.
Among other things, this meant that you could copy to a network printer
since it was a file name like everything else. Hasn't anyone on UNIX ever
wished that he could pipe to the network from the shell without having to
write a helper program?

> What's even weirder is that some of these same folks at Berkeley also
> implemented the original Unix Talk program, and in that case they *didn't*
> use network byte ordering in the application data.

Yet another example of the foolishness of using binary in an application
protocol when text will work perfectly well.

-- Mark --

http://staff.washington.edu/mrc
Science does not emerge from voting, party politics, or public debate.
Si vis pacem, para bellum.

Le Chaud Lapin

Jul 17, 2003, 12:01:29 PM
arie...@softhome.net (Fernando Gont) wrote in message news:<3f15935e...@News.CIS.DFN.DE>...

> On 16 Jul 2003 09:11:18 -0700, unorigina...@yahoo.com (Le Chaud
> Lapin) wrote:
>
> >Problem: What to do about endianess in distributed communication.
>
> Choose one, and stick to it.
>
>
> >I do not remember, but the solution is "receiver-makes-right":
> >Let the source of the data, application data that is, send the data in
> >its native format. If the target gets it, and the two machines are of
> >same endianess (often the case), then the target has nothing to do.
> >If the endianess's are disparate, then target simply has to invert.
> >In both cases, this is an optimal solution.
>
> Optimal? How would you "signal" the byte order to the other end?
> Include a "byte order" bit in some type of header bit for every block
> of data you send?????

Yes.

>
> >Like many design choices in engineering, historical decisions have
> >more to do with human personality than objective reasoning.
>
> I don't agree with that. If so, we'd have psychologists designing
> communication protocols and life would be harder than it is. :)
>
>
> >I think that is what happened here.
>
> I think we're talking about different things.
> One of them is whether the programmer should cope with the byte order,
> instead of having the kernel cope with it.
> The other one is what byte order should be used in distributed
> systems.

???? This does not make sense.


>
> For the former, I think we should let the kernel cope with the byte
> order, instead of dealing ourselves with it (as in the Sockets API).
> For the latter, I'd choose one byte order, and stick to it.
> Which one? It doesn't matter.

These two situations are the same. If you transmit data, it goes into
a buffer, and the kernel will not have any notion of byte order. The
kernel is not aware that you might have put an 'int' in the buffer.

The solution to this problem is to indicate the byte order of the
payload of every packet sent (along with other information). The
receiver will determine if an inversion is necessary. The programmer
should be relieved of thinking about byte order, as you have said, but
then again, this *is* distributed systems.
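
A rough sketch of what such tagging could look like for a single 32-bit value. This is not my actual code: the struct tagged_u32, the ORDER_* constants and the function names are all invented for illustration, and only the two common byte orders are considered.

```c
#include <stdint.h>
#include <string.h>

enum { ORDER_BIG = 0, ORDER_LITTLE = 1 };

/* Hypothetical message: one tag byte plus the sender's raw bytes. */
struct tagged_u32 {
    uint8_t order;     /* byte order of value[] as written by the sender */
    uint8_t value[4];  /* sender's in-memory representation, untouched */
};

/* Detect this host's byte order by looking at the first byte of 1. */
int host_order(void)
{
    uint16_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1 ? ORDER_LITTLE : ORDER_BIG;
}

/* Sender: no conversion at all, just tag and copy. */
void tag_u32(struct tagged_u32 *msg, uint32_t v)
{
    msg->order = (uint8_t)host_order();
    memcpy(msg->value, &v, 4);
}

/* Receiver: reverse the bytes only if the sender's order differs. */
uint32_t untag_u32(const struct tagged_u32 *msg)
{
    uint8_t b[4];
    uint32_t v;
    memcpy(b, msg->value, 4);
    if (msg->order != host_order()) {
        uint8_t t;
        t = b[0]; b[0] = b[3]; b[3] = t;
        t = b[1]; b[1] = b[2]; b[2] = t;
    }
    memcpy(&v, b, 4);
    return v;
}
```

When both ends share a byte order, neither side does any swapping; otherwise exactly one swap happens, on the receiving side.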

>
> >But we are not robots. We are human, and if you write data in hex
> >format using a left-to-right language (most common), the high-order
> >nibbles appear first, and because we have learned that things that
> >come first in a sequence are typically of lowest order, the high-order
> >nibbles should therefore go at the lowest-order address....
>
> Being a Spanish-native speaker, I must say that "concept" does not
> apply to Spanish.

"el toro loco" versus "el loco toro"

Each is valid under certain context. (see my other post)

-Chaud Lapin-

Le Chaud Lapin

Jul 17, 2003, 12:14:30 PM
"Steve Horsley" <steve.h...@virgin.NO_SPAM.net> wrote in message news:<pan.2003.07.14....@virgin.NO_SPAM.net>...
> On Sun, 13 Jul 2003 19:41:17 -0700, Asdf Jackal wrote:
> It is not the socket whose byte order is big-endian, it is the network's
> byte order - the order in which bytes are sent and received over the wire.
> This cannot be changed without breaking every existing application.
>
> And you cannot (for instance) modify socket code to re-order bytes during
> read() or write() calls because it doesn't know whether you are thinking
> in terms of bytes, short, int or what. The socket simply reads and writes
> sequences of bytes in the order that they are given to it. In fact I would
> argue that a socket has no endian-ness because it does not deal in any
> structure bigger than a byte (or byte[]).
>
> It is the application that projects higher-level meanings onto byte
> sequences. It is purely the application's responsibility to convert
> between the network byte order of these high-level structures and its own
> internal memory representation of them.

All True. That's the way it is now.
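
The way it is now can be sketched like this: the application itself turns a 32-bit value into four bytes in network order before handing them to write(), and reverses that after read(). The function names put_u32_be and get_u32_be are just illustrative. Using shifts rather than casts means the code behaves the same on any host.

```c
#include <stdint.h>

/* Store v in buf, most significant byte first (network byte order). */
void put_u32_be(unsigned char buf[4], uint32_t v)
{
    buf[0] = (unsigned char)(v >> 24);
    buf[1] = (unsigned char)(v >> 16);
    buf[2] = (unsigned char)(v >> 8);
    buf[3] = (unsigned char)v;
}

/* Rebuild the value from four network-order bytes. */
uint32_t get_u32_be(const unsigned char buf[4])
{
    return ((uint32_t)buf[0] << 24) |
           ((uint32_t)buf[1] << 16) |
           ((uint32_t)buf[2] << 8)  |
            (uint32_t)buf[3];
}
```

The socket only ever sees buf, a plain sequence of bytes.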

Perhaps it would have been better to let the application "blast to
buffer" on sending, then let the receiver invert *but*!!!! - and this
is the important point - *not* have the programmer on the receiving
end be responsible for the inversion.

The way to achieve this is to tag each packet to indicate the nature
of byte ordering (and some other things too) of the payload. The
sockets middleware would be responsible for doing an inversion if
necessary. This is an optimal solution. It even covers the case of
Barry's suggestion: let the byte order be that of the majority of
architectures.

It's been more than 5 years since I have worried about byte order
between disparate architectures, and any conversion that needs to be
done is optimally efficient. I am not doing any type of ordering
overlay on top of UDP or TCP - the data is left "in the raw".

It's quite nice to write in C++

{
Socket-Type-Thing s;
int i = 98;
s << i;
}

and not worry about whether the source is a PDA and the target is a
Cray. The '98' will become '98' at the target end. Again, this is
done with maximal global efficiency.

-Chaud Lapin-

Fritz M

Jul 17, 2003, 4:04:23 PM
"Glen Herrmannsfeldt" <g...@ugcs.caltech.edu> wrote:


> Well, windows on Alpha was little endian, where the alpha is
> bi-endian, and big endian for most other OS.

Ditto MIPS and PowerPC. If I recall correctly, those chips were
architected bi-endian specifically to accommodate Windows code.

Hank Oredson

Jul 17, 2003, 4:16:35 PM

"Barry Margolin" <barry.m...@level3.com> wrote in message
news:I0zRa.241$0z4...@news.level3.com...

> In article <UZxRa.884$OM3...@news-server.bigpond.net.au>,
> <gpb...@mawvyu.com.xd> wrote:
> >|>Bear in mind too that the Berkeley socket API handles not just IP
> >|>sockets but also other types of transport, e.g. x25.
> >|
> >|?
> >
> >So you didn't realise that the Berkeley designers had more foresight
> >than you give them credit for? Why do you think there is a sockaddr type
> >and then a sockaddr_in type, a sockaddr_atmsvc type and now a
> >sockaddr_in6 type, and that you have to tell socket how large the
> >sockaddr_* structure is? Sockets were designed to work with various
> >transport protocols.
>
> None of this is at all related to the issue we're discussing. This glitch
> in the API only occurs when filling in a sockaddr_in structure. It's very
> counterintuitive that a structure used for communicating with the local
> TCP/IP stack, not a remote system, would use network byte ordering instead
> of host byte ordering.

Naw, it occurs other places as well, such as sockaddr_nr where
the virtual circuit number consists of two octets "wrong endian"
to be considered an int in x86. There are probably some places in
sockaddr_x25 or sockaddr_ax25 as well, but I'm too lazy to look
at my code to find out.

It's just part of the API. If you don't like the API, create your own
cover API that works the way you want.

> What's even weirder is that some of these same folks at Berkeley also
> implemented the original Unix Talk program, and in that case they *didn't*
> use network byte ordering in the application data. The result was that Sun
> users couldn't talk to Vax users, and replacement applications like ntalk
> and ytalk had to be created.

--

... Hank

Hank: http://horedson.home.att.net
W0RLI: http://w0rli.home.att.net


Hank Oredson

Jul 17, 2003, 4:20:09 PM

"Le Chaud Lapin" <unorigina...@yahoo.com> wrote in message
news:fc2e0ade.03071...@posting.google.com...

And we call this a "protocol" and publish a specification.
Just as with TCP over IP, etc.

> > >But we are not robots. We are human, and if you write data in hex
> > >format using a left-to-right language (most common), the high-order
> > >nibbles appear first, and because we have learned that things that
> > >come first in a sequence are typically of lowest order, the high-order
> > >nibbles should therefore go at the lowest-order address....
> >
> > Being a Spanish-native speaker, I must say that "concept" does not
> > apply to Spanish.
>
> "el toro loco" versus "el loco toro"
>
> Each is valid under certain context. (see my other post)

--

Peter Pentchev

Jul 18, 2003, 3:55:09 AM
unorigina...@yahoo.com (Le Chaud Lapin) wrote in message news:<fc2e0ade.03071...@posting.google.com>...
[snip]

> >
> > For what it's worth, German in a minor way is the PDP-11 of endian-ness:
> >
> > "Ein hundred drei und zwanzich" (One hundred, three and twenty = 123)
>
> This is not entirely correct. Not only does German have both
> "endians", they are actually used within the same expression. You can
> see this from your example:
>
> 123 spoken in German yields the order 1-3-2.

Well, isn't this exactly what he meant by 'the PDP-11 of endian-ness'? :)
E.g. <URL:http://mail-index.netbsd.org/port-pdp10/2002/06/15/0015.html>

> If you had made the number 1234, then when spoken in German, the order
> would have been 1-2-4-3. So you can see that for most of the number,
> the order is as it is in English, but for 20-99, it inverts.

Well, ok, so not entirely PDP-11-ish, but close...
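
For anyone who hasn't met it, the PDP-11 layout for a 32-bit value puts the high 16-bit word first, with each word stored little-endian. A tiny sketch (put_u32_pdp is a made-up name):

```c
#include <stdint.h>

/* Store v the way a PDP-11 lays out a 32-bit integer:
 * high 16-bit word first, and each word with its low byte first. */
void put_u32_pdp(unsigned char buf[4], uint32_t v)
{
    buf[0] = (unsigned char)((v >> 16) & 0xff);  /* high word, low byte  */
    buf[1] = (unsigned char)((v >> 24) & 0xff);  /* high word, high byte */
    buf[2] = (unsigned char)(v & 0xff);          /* low word, low byte   */
    buf[3] = (unsigned char)((v >> 8) & 0xff);   /* low word, high byte  */
}
```

So 0x0A0B0C0D ends up in memory as 0B 0A 0D 0C, much like saying the tens and units in swapped order while keeping the bigger groups in front.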

G'luck,
Peter

Fernando Gont

Jul 18, 2003, 1:01:30 PM
On 17 Jul 2003 09:14:30 -0700, unorigina...@yahoo.com (Le Chaud
Lapin) wrote:

>Perhaps it would have been better to let the application "blast to
>buffer" on sending, then let the receiver invert *but*!!!! - and this
>is the important point - *not* have the programmer on the receiving
>end be responsible for the inversion.

Why shouldn't the programmer cope with the byte order at the
*application* layer protocol if binary information is being sent, and
no custom API for that specific application layer protocol is being
used????


>The way to achieve this is to tag each packet to indicate the nature
>of byte ordering (and some other things too) of the payload. The
>sockets middleware would be responsible for doing an inversion if
>necessary. This is an optimal solution. It even covers the case of
>Barry's suggestion: let the byte order be that of the majority of
>architectures.

For sending binary information, there's not only the issue of byte
ordering, but that of how many bits are used for each C datatype, etc.
If you don't want to choose a convention, you should store all this
information for each "block" of data you send to the other end, which,
IMHO, is nonsense.
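
As an illustration of choosing a convention instead, here is a sketch with a made-up record type. The wire format is fixed by agreement (4-byte id, 2-byte flags, both big-endian, no padding), so nothing about the sender's int size or struct padding leaks onto the wire. The names struct record, WIRE_RECORD_SIZE and encode_record are all hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Agreed wire size: 4 bytes of id followed by 2 bytes of flags. */
enum { WIRE_RECORD_SIZE = 6 };

/* In-memory form; the compiler may pad this, which is why it is
 * never copied to the wire as a whole. */
struct record {
    uint32_t id;
    uint16_t flags;
};

/* Encode field by field into the agreed layout; returns bytes written. */
size_t encode_record(unsigned char out[WIRE_RECORD_SIZE],
                     const struct record *r)
{
    out[0] = (unsigned char)(r->id >> 24);
    out[1] = (unsigned char)(r->id >> 16);
    out[2] = (unsigned char)(r->id >> 8);
    out[3] = (unsigned char)r->id;
    out[4] = (unsigned char)(r->flags >> 8);
    out[5] = (unsigned char)r->flags;
    return WIRE_RECORD_SIZE;
}
```

Note that sizeof(struct record) is commonly 8 because of padding, while the wire form is always 6 bytes.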


>It's been more than 5 years since I have worried about byte order
>between disparate architectures, and any conversion that needs to be
>done is optimally efficient. I am not doing any type of ordering
>overlay on top of UDP or TCP - the data is left "in the raw".

[....]


>and not worry about whether the source is a PDA and the target is a
>Cray. The '98' will become '98' at the target end. Again, this is
>done with maximal global efficiency.

I don't understand. What would happen if your app is running in two
different architectures????

Fernando Gont

Jul 18, 2003, 1:01:33 PM
On Thu, 17 Jul 2003 15:33:55 GMT, rez...@cfaehe.com.qq wrote:

>|None of this is at all related to the issue we're discussing. This glitch
>|in the API only occurs when filling in a sockaddr_in structure. It's very
>|counterintuitive that a structure used for communicating with the local
>|TCP/IP stack, not a remote system, would use network byte ordering instead
>|of host byte ordering.
>Not an awful lot, but it shows that the details of the addressing were
>not considered to be the kernel's concern. It is a thin interface.

It's not about the details of addressing: it's about the byte order.


>I still don't see what the need is for the *kernel* to be involved. If
>one thinks that a host byte ordering API is a good thing, one just needs
>to write some wrappers to the socket calls and do the swap or not in
>userspace. Only thing is you would have to call it something other than
>socket, bind, connect etc. That will satisfy some people's need for
>"intuitiveness".

Why should you write a wrapper for something that could have been done
right from the beginning?

Fernando Gont

Jul 18, 2003, 1:01:36 PM
On 17 Jul 2003 09:01:29 -0700, unorigina...@yahoo.com (Le Chaud
Lapin) wrote:

>> Optimal? How would you "signal" the byte order to the other end?
>> Include a "byte order" bit in some type of header bit for every block
>> of data you send?????
>Yes.

Nonsense for me.
Would you put that type of information in TCP and IP headers, too, so
that you could store IP addresses and port numbers in your host byte
order? Would you have each router analyzing in which byte order the
destination IP address is stored, in order to know where it should
forward a packet to? - I wouldn't.


>> For the former, I think we should let the kernel cope with the byte
>> order, instead of dealing ourselves with it (as in the Sockets API).
>> For the latter, I'd choose one byte order, and stick to it.
>> Which one? It doesn't matter.
>These two situations are the same.

They are two totally different issues. One of them is about talking to
a *local* API. The other is about talking to a *remote* system.


>If you transmit data, it goes into
>a buffer, and the kernel will not have any notion of byte order. The
>kernel is not aware that you might have put an 'int' in the buffer.

We're talking about structure fields, of which the kernel *does* know
their datatypes.


>The solution to this problem is to indicate the byte order of the
>payload of every packet sent (along with other information). The
>receiver will determine if an inversion is necessary. The programmer
>should be relieved of thinking about byte order, as you have said, but
>then again, this *is* distributed systems.

Would you make the kernel know about every existing application layer
protocol, in order to have it cope with byte order (and other
portability issues) at the application layer protocol? Nonsense for
me.


>> Being a Spanish-native speaker, I must say that "concept" does not
>> apply to Spanish.
>"el toro loco" versus "el loco toro"
>Each is valid under certain context. (see my other post)

Yes. But it does not follow your rule that "and because we have
learned that things that come first in a sequence are typically of
lowest order, the high-order nibbles should therefore go at the
lowest-order address".

--

Glen Herrmannsfeldt

Jul 18, 2003, 2:53:38 PM

"Barry Margolin" <barry.m...@level3.com> wrote in message
news:I0zRa.241$0z4...@news.level3.com...

(snip)

> None of this is at all related to the issue we're discussing. This glitch
> in the API only occurs when filling in a sockaddr_in structure. It's very
> counterintuitive that a structure used for communicating with the local
> TCP/IP stack, not a remote system, would use network byte ordering instead
> of host byte ordering.

Well, if the structure is just copied into the outgoing data stream it does.
Though it doesn't have to work that way.

> What's even weirder is that some of these same folks at Berkeley also
> implemented the original Unix Talk program, and in that case they *didn't*
> use network byte ordering in the application data. The result was that Sun
> users couldn't talk to Vax users, and replacement applications like ntalk
> and ytalk had to be created.

So that is why there are so many? Though I have had ones that wouldn't talk
on the same machine, so there must have been other problems, too.

-- glen


Barry Margolin

Jul 18, 2003, 3:05:30 PM
In article <SmXRa.79559$wk6....@rwcrnsc52.ops.asp.att.net>,

Glen Herrmannsfeldt <g...@ugcs.caltech.edu> wrote:
>
>"Barry Margolin" <barry.m...@level3.com> wrote in message
>news:I0zRa.241$0z4...@news.level3.com...
>
>(snip)
>
>> None of this is at all related to the issue we're discussing. This glitch
>> in the API only occurs when filling in a sockaddr_in structure. It's very
>> counterintuitive that a structure used for communicating with the local
>> TCP/IP stack, not a remote system, would use network byte ordering instead
>> of host byte ordering.
>
>Well, if the structure is just copied into the outgoing data stream it does.
>Though it doesn't have to work that way.

I suspect the reason the API is the way it is is because that's what the
original sockets implementers did. The API glitch didn't become apparent
until it was ported to other architectures.

grct...@qslmfm.com.xe

Jul 18, 2003, 6:45:58 PM
|>I still don't see what the need is for the *kernel* to be involved. If
|>one thinks that a host byte ordering API is a good thing, one just needs
|>to write some wrappers to the socket calls and do the swap or not in
|>userspace. Only thing is you would have to call it something other than
|>socket, bind, connect etc. That will satisfy some people's need for
|>"intuitiveness".
|
|Why should you write a wrapper for something that could have been done
|right from the beginning?

Ok, you go back in your time machine and fix it. While you're doing
that, there are lots of other things in history I would like you to fix.
:-)
--

mel...@drpqzf.com.tb

Jul 18, 2003, 6:58:38 PM
|>Well, if the structure is just copied into the outgoing data stream it does.
|>Though it doesn't have to work that way.
|
|I suspect the reason the API is the way it is is because that's what the
|original sockets implementers did. The API glitch didn't become apparent
|until it was ported to other architectures.

Maybe, but the Berkeley people wouldn't have been unaware of the endian
issue. The protocol specs existed before BSD was asked to create a
TCP/IP implementation for Unix. So they might have decided it wasn't in
their bailiwick to "correct" the problem.
--

Barry Margolin

Jul 18, 2003, 8:00:17 PM
In article <yY_Ra.2832$OM3...@news-server.bigpond.net.au>,

The glitch is in the API, not the protocol, and the Berkeley people
invented that themselves. They decided to copy the bytes from the
sockaddr_in structure verbatim into the IP header rather than call htons()
in the IP stack.
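
A toy model of that verbatim copy, not actual BSD source. The destination port sits in bytes 2-3 of a TCP header, and the function name set_dest_port_verbatim is invented for this sketch.

```c
#include <string.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Copy sin_port byte for byte into the destination-port slot of a
 * TCP-style header. No conversion happens here, so the caller must
 * already have stored the port in network byte order. */
void set_dest_port_verbatim(unsigned char tcp_hdr[4],
                            const struct sockaddr_in *sin)
{
    memcpy(tcp_hdr + 2, &sin->sin_port, 2);
}
```

If the caller stored htons(80), the header gets 00 50 on every architecture, which is why the conversion burden landed on the application.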

gdb...@ygjgxi.com.lo

Jul 18, 2003, 8:20:14 PM
|>Maybe, but the Berkeley people wouldn't have been unaware of the endian
|>issue. The protocol specs existed before BSD was asked to create a
|>TCP/IP implementation for Unix. So they might have decided it wasn't in
|>their bailiwick to "correct" the problem.
|
|The glitch is in the API, not the protocol, and the Berkeley people
|invented that themselves. They decided to copy the bytes from the
|sockaddr_in structure verbatim into the IP header rather than call htons()
|in the IP stack.

Which is not an reasonable decision, I think. I mean, how far do you go
to hide the details of the underlying protocol? Should they have zeroed
the 8 byte area (I've been caught by that too)? In retrospect it's easy
to see that some things could have been done differently. But the fact
that nobody wants to spend any effort on a thin wrapper to "correct" it
(with insignificant loss of space and speed) means that the whole
discussion is about perceptions about what's "intuitive". So we read
phrases like "done right from the beginning". I take the point but am
also amused by the subtext.
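
For the zeroing part, the usual defensive habit is to clear the whole structure before filling it in. A sketch, with a made-up helper name init_sockaddr_in, and assuming your platform's sockaddr_in has the sin_zero padding member (the common BSD-derived headers do):

```c
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: zero everything first, then set the fields. */
void init_sockaddr_in(struct sockaddr_in *sin, uint16_t port_host)
{
    memset(sin, 0, sizeof *sin);   /* also clears the sin_zero padding */
    sin->sin_family = AF_INET;
    sin->sin_port = htons(port_host);
    sin->sin_addr.s_addr = htonl(INADDR_ANY);
}
```

Clearing the whole struct rather than just sin_zero also covers any other padding the platform adds.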

There are more serious glitches with many of the existing protocols. The
word alignment could have been done much better if they had foresight
about CPUs with longer words. Netbios is one of my favourite poster bad
boys.
--

wr...@dwheqh.com.lf

Jul 18, 2003, 8:21:19 PM
|Which is not an reasonable decision, I think. I mean, how far do you go

Oops, should be "not an unreasonable". Serves me right for using the
double negative. :-)
--

Le Chaud Lapin

Jul 20, 2003, 9:26:31 PM
arie...@softhome.net (Fernando Gont) wrote in message news:<3f16fc90...@News.CIS.DFN.DE>...

> >The solution to this problem is to indicate the byte order of the
> >payload of every packet sent (along with other information). The
> >receiver will determine if an inversion is necessary. The programmer
> >should be relieved of thinking about byte order, as you have said, but
> >then again, this *is* distributed systems.
>
> Would you make the kernel know about every existing application layer
> protocol, in order to have it cope with byte order (and other
> portability issues) at the application layer protocol? Nonsense for
> me.

Non-sense for me too. You keep mentioning the "kernel". As a device
driver developer, this has a very specific meaning for me (IOCTL's,
DMA, Interrupts, etc.), so I will start using "stack" instead.

There is a very elegant solution to the whole problem of portability
and it involves data marshalling by software associated with the
stack. It's too long to explain in detail here, so I will summarize:

1. Packet headers should have a predetermined byte-order.
2. Data placed in buffers should *not* have predetermined byte-order.
3. Receiver reorders data buffer as necessary.

This solution is optimal, both from an efficiency point of view and an
elegance point of view.

> >> Being a Spanish-native speaker, I must say that "concept" does not
> >> apply to Spanish.
> >"el toro loco" versus "el loco toro"
> >Each is valid under certain context. (see my other post)
>
> Yes. But it does not follow your rule that "and because we have
> learned that things that come first in a sequence are typically of
> lowest order, the high-order nibbles should therefore go at the
> lowest-order address".

I only use this example because you invoked the "being a native
Spanish speaker" clause. The fact of the matter is that big-endian
was chosen because it works very well for hexadecimal (and octal,
etc.) printouts. If numbers had been written in the opposite
direction, little-endian would have ruled the day, as there are some
subtle, but still somewhat significant technical issues why a
"computer would prefer little endian".

I wrote a very long post about this about 6 months ago, and cannot
remember what group it is in, but know that the whole endian thing
does in fact have to do with the way numbers are written in English
and many other languages.

-Chaud Lapin-

Alun Jones [MS MVP]

Jul 21, 2003, 5:57:35 PM
In article <fc2e0ade.03072...@posting.google.com>,
unorigina...@yahoo.com (Le Chaud Lapin) wrote:
>There is a very elegant solution to the whole problem of portability
>and it involves data marshalling by software associated with the
>stack. It's too long to explain in detail here, so I will summarize:
>
>1. Packet headers should have a predetermined byte-order.

Which is as it currently stands. The predetermined byte-order being
"network byte order". MSByte first, LSByte last. [I can never remember
what "big-endian" or "little-endian" are - perhaps a reminder that the terms
are poorly chosen, since they convey no meaning.]
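
If you'd rather ask the machine than remember the names, a quick sketch (function names are my own invention) that checks which byte comes first in memory and whether htonl() has any work to do on this host:

```c
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>

/* Returns 1 if the least significant byte sits at the lowest address
 * ("little-endian"), 0 otherwise. */
int host_is_little_endian(void)
{
    uint32_t probe = 0x01020304u;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 0x04;
}

/* Returns 1 if host order already equals network byte order, i.e.
 * htonl() leaves values unchanged (MSByte first, "big-endian"). */
int host_matches_network_order(void)
{
    uint32_t probe = 0x01020304u;
    return htonl(probe) == probe;
}
```

On an x86 PC the first returns 1 and the second 0; on a big-endian machine it is the other way round.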

>2. Data placed in buffers should *not* have predetermined byte-order.
>3. Receiver reorders data buffer as necessary.
>
>This solution is optimal, both from an efficiency point of view and an
>elegance point of view.

And there's absolutely nothing to tell you that you can't do this.

I might note, however, that one possible fly in that ointment comes from the
graphics community, where it's long been realised that the conversion from
one format to another generally involves an intermediate format. It doesn't
give much in the way of benefits where there are only two end formats
(especially when you choose one of them to be the intermediate format), but
if there are several end formats, it drastically reduces the amount of
translation that needs to be done.

So, for instance, if you want to add processors that (for whatever reason)
have a new way of representing numbers (a different format, or a new size),
then all receivers would have to be modified under your scheme. Under a
scheme where everyone talks the same order, only those applications written
on the new processor need to be changed. Now, that's obviously not so much
of an issue with integer values, where you're essentially limited to two
orderings, and two or three sizes (32 bits, 16 bits, and maybe 8) - and new
sizes require significant alterations. But with floating point numbers,
there are a number of different representations. Defining a fixed ('fixed'
as in 'set in stone', not as in 'opposite of floating point') representation
allows you to write considerably fewer conversion routines in your
application.
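
To put rough numbers on that, here is a back-of-the-envelope sketch (the function names are mine, and it assumes the intermediate format is a separate one, not one of the end formats). It counts the conversion routines needed for n representations with and without an agreed intermediate format.

```c
/* Direct conversion: every format must be able to read each of the
 * other n-1 formats, so n * (n - 1) routines in total. */
unsigned long direct_converters(unsigned long n)
{
    return n * (n - 1);
}

/* With a fixed intermediate format: each format needs one routine to
 * write it and one to read it, so 2 * n routines in total. */
unsigned long canonical_converters(unsigned long n)
{
    return 2 * n;
}
```

With two formats the intermediate step buys nothing (2 routines versus 4), but with five floating point representations it is 20 versus 10, and the gap keeps widening.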

>I only use this example because you invoked the "being a native
>Spanish speaker" clause. The fact of the matter is that big-endian
>was chosen because it works very well for hexadecimal (and octal,
>etc.) printouts. If numbers had been written in the opposite
>direction, little-endian would have ruled the day, as there are some
>subtle, but still somewhat significant technical issues why a
>"computer would prefer little endian".

"The fact of the matter" is nothing of the sort. Endianness was essentially
randomly chosen.

>I wrote a very long post about this about 6 months ago, and cannot
>remember what group it is in, but know that the whole endian thing
>does in fact have to do with the way numbers are written in English
>and many other languages.

Those who write their numbers vertically must invent interesting processors,
that's all I can say.

No, given that English speakers make up large portions of both the rabidly
pro-big-endian and the fanatically pro-little-endian camps, support for
endianness clearly has nothing to do with language, and everything to do
with personal preference, unburdened by any reason.

Alun.
~~~~

[Please don't email posters, if a Usenet response is appropriate.]
--
Texas Imperial Software | Find us at http://www.wftpd.com or email
1602 Harvest Moon Place | al...@texis.com.
Cedar Park TX 78613-1419 | WFTPD, WFTPD Pro are Windows FTP servers.
Fax/Voice +1(512)258-9858 | Try our NEW client software, WFTPD Explorer.

Le Chaud Lapin

Jul 21, 2003, 10:26:13 PM
al...@texis.com (Alun Jones [MS MVP]) wrote in message news:<jlZSa.2792$fo2....@newssvr22.news.prodigy.com>...
[abridged]


> "The fact of the matter" is nothing of the sort. Endianness was essentially
> randomly chosen.

[abridged]

> No, given that English speakers make up large portions of both the rabidly
> pro-big-endian and the fanatically pro-little-endian camps, support for
> endianness clearly has nothing to do with language, and everything to do
> with personal preference, unburdened by any reason.

There seems to be a contradiction here. How could the choice have been
random if there was a preference for one endian over the other?

There was in fact some debate before the choice was made.

In the case where big-endian was chosen, the big endian proponents "won".

In the case where little-endian was chosen, the little endian
proponents "won".

-Chaud Lapin-

Alun Jones [MS MVP]

Jul 23, 2003, 10:19:57 AM
In article <fc2e0ade.03072...@posting.google.com>,
unorigina...@yahoo.com (Le Chaud Lapin) wrote:
>There seems to be a contradiction here. How could the choice have been
>random if there was a preference for one endian over the other?

You appear to have been confused by my use of the English language.

Here goes:

A choice was made. That choice is set in stone. When the choice was made,
prior to it being set in stone, there was no reason other than personal
preference / random chance to make that particular choice. Now that it is
set in stone, there is no good reason to choose something different, because
you'd be breaking everything that currently works.

>There was in fact some debate before the choice was made.
>
>In the case where big-endian was chosen, the big endian proponents "won".
>
>In the case where little-endian was chosen, the little endian
>proponents "won".

And yet, in both cases, the language of the proponents of each scheme was
the same - English, in the case of most of them. How, then, can endianness
derive from the language spoken by the proponent, when the language spoken
is the same among both camps?

You might as well say "right handed people all speak English - it's because
they speak English that they are right handed - and of course, the same goes
for the left handed people, because they all speak English, too."

Le Chaud Lapin

Jul 23, 2003, 10:09:53 PM
al...@texis.com (Alun Jones [MS MVP]) wrote in message news:<hQwTa.1037$ad7.19...@newssvr12.news.prodigy.com>...

> In article <fc2e0ade.03072...@posting.google.com>,
> unorigina...@yahoo.com (Le Chaud Lapin) wrote:
> >There seems to be a contradiction here. How could the choice have been
> >random if there was a preference for one endian over the other?
>
> You appear to have been confused by my use of the English language.
>
> Here goes:
>
> A choice was made. That choice is set in stone. When the choice was made,
> prior to it being set in stone, there was no reason other than personal
> preference / random chance to make that particular choice. Now that it is
> set in stone, there is no good reason to choose something different, because
> you'd be breaking everything that currently works.
>
> >There was in fact some debate before the choice was made.
> >

There are two separate topics discussed in your preceding paragraph.
The first topic is whether there was any reasoning behind choosing
big-endian over little-endian. The second is whether we should do
something about it, today, in 2003. For this discussion, I am only
referring to the first topic, whether the choice was random or if
there was debate.

> >In the case where big-endian was chosen, the big endian proponents "won".
> >
> >In the case where little-endian was chosen, the little endian
> >proponents "won".
>
> And yet, in both cases, the language of the proponents of each scheme was
> the same - English, in the case of most of them. How, then, can endianness
> derive from the language spoken by the proponent, when the language spoken
> is the same among both camps?

The "how-it-is-written-in-English" argument was invoked by the
big-endian people. Naturally, the little-endian people would not
invoke this concept, as it would undermine their position.



> You might as well say "right handed people all speak English - it's because
> they speak English that they are right handed - and of course, the same goes
> for the left handed people, because they all speak English, too."

Non sequitur.

I never said that the English language should dictate the choice of
endian. As a matter of fact, I have purposely refrained from stating
my personal opinion on whether I prefer big endian over little endian.

What I am trying to point out is that the choice of endianness was not
at all "random".

One need only imagine what goes on in the design room of a major
microprocessor house. Those of us who have experience in
microprocessor construction know that the choice of endian heavily
influences the ALU (how could it not), and on microprocessors that are
split-word cognizant, the choice is even more significant. If you do
not believe that choice of endian influences the very heart of a
microprocessor, I can find a colleague over at AMD or Intel who
designs CPUs to put up a post to the contrary.

Knowing that there was considerable debate, and there still is, one
has to ask what are the reasons proffered for choosing one endian over
the other. This is what I was getting at in my previous post. I
stated that one of the reasons given by the big-endian proponents has
to do with the way the memory dumps are written in English. There are
many other reasons given by them (not by me). Whether those reasons
have merit is a different matter, but rest assured, people who are
"rabid" or "fanatic" about a concept, to quote a previous poster, will
certainly have something to say when they are sitting in a design room
discussing that concept while drinking Perrier.

Choice of endian, in each major design work, is not done by a flip of
the coin. There is always consideration, and it still continues
today.

-Chaud Lapin-

Fernando Gont

Jul 24, 2003, 10:02:43 AM
On 23 Jul 2003 19:09:53 -0700, unorigina...@yahoo.com (Le Chaud
Lapin) wrote:

>> And yet, in both cases, the language of the proponents of each scheme was
>> the same - English, in the case of most of them. How, then, can endianness
>> derive from the language spoken by the proponent, when the language spoken
>> is the same among both camps?
>The "how-it-is-written-in-English" argument was invoked by the
>big-endian people. Naturally, the little-endian people would not
>invoke this concept, as it would undermine their position.

You didn't read all this from IEN 137, did you?

Hank Oredson

Jul 24, 2003, 2:14:22 PM
Google for:
"ON HOLY WARS AND A PLEA FOR PEACE"
If that does not work try "IEN137" and also "Danny Cohen".
Reading this paper will help you understand :-)

--

... Hank

"Alun Jones [MS MVP]" <al...@texis.com> wrote in message
news:hQwTa.1037$ad7.19...@newssvr12.news.prodigy.com...

Hank Oredson

Jul 24, 2003, 2:15:33 PM

"Fernando Gont" <fg...@softhome.net> wrote in message
news:3f1fe5ca...@News.CIS.DFN.DE...

> On 23 Jul 2003 19:09:53 -0700, unorigina...@yahoo.com (Le Chaud
> Lapin) wrote:
>
> >> And yet, in both cases, the language of the proponents of each scheme was
> >> the same - English, in the case of most of them. How, then, can endianness
> >> derive from the language spoken by the proponent, when the language spoken
> >> is the same among both camps?
> >The "how-it-is-written-in-English" argument was invoked by the
> >big-endian people. Naturally, the little-endian people would not
> >invoke this concept, as it would undermine their position.
>
> You didn't read all this from IEN 137, did you?


Ah ... should have read further down the thread before posting the
hint about "ON HOLY WARS AND A PLEA FOR PEACE" :-)

Le Chaud Lapin

Jul 24, 2003, 3:39:03 PM
fg...@softhome.net (Fernando Gont) wrote in message news:<3f1fe5ca...@News.CIS.DFN.DE>...

> On 23 Jul 2003 19:09:53 -0700, unorigina...@yahoo.com (Le Chaud
> Lapin) wrote:
>
> >> And yet, in both cases, the language of the proponents of each scheme was
> >> the same - English, in the case of most of them. How, then, can endianness
> >> derive from the language spoken by the proponent, when the language spoken
> >> is the same among both camps?
> >The "how-it-is-written-in-English" argument was invoked by the
> >big-endian people. Naturally, the little-endian people would not
> >invoke this concept, as it would undermine their position.
>
> You didn't read all this from IEN 137, did you?

No.

-Chaud Lapin-

Le Chaud Lapin

Jul 24, 2003, 4:17:27 PM
fg...@softhome.net (Fernando Gont) wrote in message news:<3f1fe5ca...@News.CIS.DFN.DE>...

> On 23 Jul 2003 19:09:53 -0700, unorigina...@yahoo.com (Le Chaud
> Lapin) wrote:
>
> >> And yet, in both cases, the language of the proponents of each scheme was
> >> the same - English, in the case of most of them. How, then, can endianness
> >> derive from the language spoken by the proponent, when the language spoken
> >> is the same among both camps?
> >The "how-it-is-written-in-English" argument was invoked by the
> >big-endian people. Naturally, the little-endian people would not
> >invoke this concept, as it would undermine their position.
>
> You didn't read all this from IEN 137, did you?

I just looked that up on the net to see what it is. Interesting
paper.

Not to gloat, but it supports my assertion that the
big-endian/little-endian proponents used order of print in English as
a point of argument.

In all honesty, a long time ago, before I had ever engaged in any
conversation with anyone about big-endian/little-endian, I had
inferred that the big-endians' chief argument (no pun intended) was
probably based on how numbers are written in English. I inferred this
knowing that certain classes of people with certain sociological
traits interact in a distinctly recognizable way with their world.

This brings us back to the point I was trying to make:

Reason, in a pure, mathematical, Draconian sense, is often *not* what
guides our design choices, even as engineers. If it did, there would
not be such strong correlations between seemingly unrelated design
axes. Instead, each of us has an innate affinity for certain
modes of thought (which is not necessarily equivalent to "reasoning").

For example,

1. There are people who prefer big-endian over little-endian.
2. There are people who prefer "skewed" curly braces over "balanced"
curly braces.
3. There are people who like underbars in their variable names, and
some who do not.
4. There are people who like to indicate type by adding "_t" to type
names, and some who do not.
5. There are people who think C++ is superior to Java, and some who do
not.
6. There are people who prefer abbreviated variable names over verbose ones.

The list goes on and on.

What is interesting to note is that there are correlations between
these choices. I suspect that, if someone were to do a behavioral
study of engineers on, say, 50 of these axes, at least 15 would be
strongly correlated.

Under the assumption that the engineers under scrutiny are equally
insightful, and that the distribution of preference on each axis is
roughly equivalent, it would seem that the data should be
uncorrelated.

But in my experience, this is not the case. This can be exemplified
by taking a colleague, having an extended conversation with him/her
about his/her personal life, discussing things that have nothing to do
with science or technology, and after the conversation is over, trying
to decide if s/he is big-endian or little-endian, balanced curly
brace or skewed curly brace, etc. I've done this on several
occasions with notable success.

Of course, this is nothing new. Sociologists have been trying to tell
us for decades that scientists, too, are selectively objective.

-Chaud Lapin-

Barry Margolin

Jul 24, 2003, 4:27:16 PM
In article <fc2e0ade.03072...@posting.google.com>,

Le Chaud Lapin <unorigina...@yahoo.com> wrote:
>What is interesting to note is that there are correlations between
>these choices. I suspect that, if someone were to do a behavioral
>study of engineers on, say, 50 of these axes, at least 15 would be
>strongly correlated.

I suspect that the reason for this is sociological. Many people develop
their styles by copying others. If a particular development organization
has programming style standards (either formal or informal) that specify
some of these choices, members of that organization will adopt that style.
In the future they may take positions that allow them to dictate some
choices to new groups, so some of the correlations will spread to other
organizations.
