
Question about Endian-ness and file transfers


Miriam Jaffe

Dec 7, 1993, 3:59:41 PM
Can one of you experts on endian-ness help me with a
question I have been asked to research which is quite
outside of my area of expertise?

If graphic images are stored on a Mac, and someone on
a machine of opposite endian-ness uses FTP to retrieve
the binary data, how can the data be used on the second
machine? Argh.. I don't think I'm asking the question
correctly. This is the problem: Person 1 has image data,
on a Mac. Person 2 has a PC and needs the images. Person
2 uses FTP to suck up the binary image data from Person 1's
Mac. The files appear to transfer correctly, since I understand
that FTP doesn't care what it is looking at or what
type of machine it is getting it from (Is that right? that's
what I was told.) However, once on Person 2's machine, the
files can't be read as the images they originally were.

If the files are transferred through email, there's no
problem; it's just the FTP transfers that are causing trouble.
Are we doing something wrong? With image data stored on
FTP archives all over the world, surely someone has encountered
this situation before.

I was led to believe by the person who instructed me to find
the answer that this is a problem resulting from the differences
in byte order between the Macs and PCs, which is why I have
posed it to people discussing Big- vs Little-endian-ness.
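
A minimal sketch of what "byte order" means here, in Python purely for
illustration (the helper name read_u32 is made up). The same four bytes
decode to different integers depending on the order the reader assumes.
A binary-mode transfer copies the bytes unchanged, so byte order only
matters to a program that interprets multi-byte values in the file.

```python
import struct

# Four bytes as they might sit in a file; identical on a Mac or a PC disk.
raw = bytes([0x00, 0x00, 0x01, 0x02])

def read_u32(data, byte_order):
    """Decode a 4-byte unsigned integer; byte_order is 'big' or 'little'."""
    fmt = ">I" if byte_order == "big" else "<I"
    return struct.unpack(fmt, data)[0]

big = read_u32(raw, "big")        # 0x00000102 = 258
little = read_u32(raw, "little")  # 0x02010000 = 33619968
```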

Thanks for any help you can provide.

-----------
Miriam Jaffe
Information Services, University of Maryland at Baltimore
mja...@trout.ab.umd.edu mja...@umbc8.umbc.edu

Miriam Jaffe

Dec 7, 1993, 7:51:38 PM

Ok, I am following up to my own post because I guess
I was not very clear in the original (judging from the
responses I have been getting)...

In article <2e2qrt...@umbc8.umbc.edu>,
Miriam Jaffe <mja...@gl.umbc.edu> wrote:

>If the files are transferred through email, there's no
>problem; it's just the FTP transfers that are causing trouble.
>Are we doing something wrong? With image data stored on
>FTP archives all over the world, surely someone has encountered
>this situation before.

The files transfer in email ok because the person mailing them
knows to use MacBinary or whatever to turn the files into
ASCII characters so they can be sent in email messages. But
in their original format they are just binary data (or so
this is what I am told)... and when someone goes to FTP the
data off the source machine, the FTPer is just getting a stream
of binary bits. I wish I were explaining better. We are going
to test this out again and I will observe for myself the exact
nature of the problem, when it occurs and when it doesn't..
and post again if I have new insights.

Meanwhile, I appreciate the prompt and helpful replies I
have already received. I am impressed with the responsiveness
of this newsgroup!

Herman Rubin

Dec 7, 1993, 9:23:07 PM
In article <2e2qrt...@umbc8.umbc.edu> mja...@gl.umbc.edu (Miriam Jaffe) writes:
>Can one of you experts on endian-ness help me with a
>question I have been asked to research which is quite
>outside of my area of expertise?

>If graphic images are stored on a Mac, and someone on
>a machine of opposite endian-ness uses FTP to retrieve
>the binary data, how can the data be used on the second
>machine? Argh.. I don't think I'm asking the question
>correctly. This is the problem: Person 1 has image data,
>on a Mac. Person 2 has a PC and needs the images. Person
>2 uses FTP to suck up the binary image data from Person 1's
>Mac. The files appear to transfer correctly, since I under-
>stand that FTP doesn't care what it is looking at or what
>type of machine it is getting it from (Is that right? that's
>what I was told.) However, once on Person 2's machine, the
>files can't be read as the images they originally were.

The problem is much worse than you have indicated, and endianness
is not the sole problem. In any case it is necessary to know the
data format; even with the same machine, it might be a problem, as
compression methods and other means may be used to tweak data.

>If the files are transferred through email, there's no
>problem; it's just the FTP transfers that are causing trouble.

If the data are transferred as integers, either decimal or octal or
hex, or decimal floats or fixed-point reals, there is little problem.
But for binary fixed-point or floating-point numbers, we as yet do
not have a convention of any kind. The Crays, the ETA 10, and the
RS/6000 are all big-endian with 64 bits for words on the Crays,
full precision numbers on the ETA 10, and "double precision" floats
on the 6000, and conversion has to be done to transfer numbers
between these machines.

>Are we doing something wrong? With image data stored on
>FTP archives all over the world, surely someone has encountered
>this situation before.

Yes and no. As long as there is any difference in the representation
in different machines, some sort of conversion will have to be made.
But some sort of a "binary" standard of communication for "real"
numbers will help the process.

>I was led to believe by the person who instructed me to find
>the answer that this is a problem resulting from the differences
>in byte order between the Macs and PCs, which is why I have
>posed it to people discussing Big- vs Little-endian-ness.
>
>Thanks for any help you can provide.
>
>-----------
>Miriam Jaffe
>Information Services, University of Maryland at Baltimore
>mja...@trout.ab.umd.edu mja...@umbc8.umbc.edu
>


--
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907-1399
Phone: (317)494-6054
hru...@snap.stat.purdue.edu (Internet, bitnet)
{purdue,pur-ee}!snap.stat!hrubin(UUCP)

Ketil Albertsen,TIH

Dec 9, 1993, 5:42:25 AM
In article <CHp2M...@mentor.cc.purdue.edu>, hru...@snap.stat.purdue.edu (Herman Rubin) writes:

[...]


>As long as there is any difference in the representation
>in different machines, some sort of conversion will have to be made.
>But some sort of a "binary" standard of communication for "real"
>numbers will help the process.

Notice that there IS such a standard; it is used by all OSI application
protocols to which it is relevant, and is called the BER - the Basic
Encoding Rules. (It covers "all" data types - not only numeric formats.)

True enough: BER belongs in a family of standards that, as a whole, is
rather complex, to say the least. Also true: Converting to and from BER
is not cheap (compared to e.g. the XDR transmission format on a 68K machine,
on which you more or less dump the 68K's native binary format to the line
and let machines based on all other processors take responsibility for
converting. Gives SUN machines some very nice benchmark results, though!).
BER does not fit into any processor's native format; it will always require
conversion. But done properly, it isn't *that* costly.
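
To make the comparison concrete, here is a rough sketch (Python, for
illustration only) of how one integer goes on the wire as a BER INTEGER
versus a fixed-width big-endian format in the style of XDR. The helper
names are hypothetical, and only BER short-form lengths are handled.

```python
import struct

def ber_integer(value):
    """BER-encode an ASN.1 INTEGER: tag 0x02, short-form length, content."""
    # Minimal two's-complement length in bytes (sign bit included).
    magnitude = value if value >= 0 else ~value
    length = magnitude.bit_length() // 8 + 1
    if length > 127:
        raise ValueError("long-form lengths are not handled in this sketch")
    content = value.to_bytes(length, "big", signed=True)
    return bytes([0x02, length]) + content

def xdr_int(value):
    """XDR-style 32-bit signed integer: always 4 bytes, big-endian."""
    return struct.pack(">i", value)

small_ber = ber_integer(5)    # 02 01 05
small_xdr = xdr_int(5)        # 00 00 00 05
```

The BER form is variable-length and self-describing, so a receiver has to
parse it; the fixed-width form can be read in place on a big-endian host.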

ASN.1, the language to specify "what" to transmit, accompanying the BER,
which specifies "how" to transmit it, is slowly creeping into the Internet
protocol specs. (In OSI protocols, it is The Way to do specs). Unless the
Internet community once more runs into the Not Invented Here syndrome,
we can hope for ASN.1/BER to become a common standard for data representation
in all major networks over the next few years. (Maybe I am overly optimistic...)

Kenneth Ekman

Dec 9, 1993, 10:17:59 AM
In article 2e2qrt...@umbc8.umbc.edu, mja...@gl.umbc.edu (Miriam Jaffe) writes:
>Can one of you experts on endian-ness help me with a
>question I have been asked to research which is quite
>outside of my area of expertise?
>
>If graphic images are stored on a Mac, and someone on
>a machine of opposite endian-ness uses FTP to retrieve
>the binary data, how can the data be used on the second
>machine?

If the data is stored in one of the standard forms (e.g. GIF, PCX,
JPEG, etc.) you shouldn't have a problem, as those formats are
the same on all (I hope :-) machines.
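
As an illustration of why such formats travel well: the GIF header stores
the image width and height as 16-bit little-endian values, so a reader
that decodes them explicitly gets the same answer on any host. A small
Python sketch (gif_size and the sample bytes are made up for this example):

```python
import struct

def gif_size(header):
    """Return (width, height) from the first 10 bytes of a GIF file."""
    if header[:6] not in (b"GIF87a", b"GIF89a"):
        raise ValueError("not a GIF header")
    # "<HH": two unsigned 16-bit integers, little-endian, whatever the host is.
    return struct.unpack("<HH", header[6:10])

# Hand-built header for a 640x480 image (illustrative bytes, not a real file).
sample = b"GIF89a" + bytes([0x80, 0x02, 0xE0, 0x01])
size = gif_size(sample)
```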

What I am not sure about, however, is how ftp works between different
machines. I would find it very strange, though, if there were no standard
byte order specified in the ftp protocol. One common mistake when
using ftp (I know, I've been there :-) is not typing 'binary' before
transmission of 8-bit files. In that case, ftp might use a 7-bit connection,
which would certainly corrupt your pictures.
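
A rough simulation of the kind of damage a text-mode transfer can do,
sketched in Python. Exact behaviour depends on the client and server; this
hypothetical helper assumes end-of-line translation (LF to CR LF) and,
optionally, a 7-bit path as described above.

```python
def ascii_mode_damage(data, strip_high_bit=False):
    """Simulate what a text-mode transfer may do to bytes (hypothetical helper)."""
    out = data.replace(b"\n", b"\r\n")       # end-of-line translation
    if strip_high_bit:
        out = bytes(b & 0x7F for b in out)   # 7-bit path: top bit lost
    return out

original = bytes([0x89, 0x0A, 0x00, 0xFF])   # arbitrary "image" bytes
damaged = ascii_mode_damage(original, strip_high_bit=True)
```

Either change alone is enough to break an image file: the byte count no
longer matches, and any value above 0x7F is altered.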

>I was led to believe by the person who instructed me to find
>the answer that this is a problem resulting from the differences
>in byte order between the Macs and PCs, which is why I have
>posed it to people discussing Big- vs Little-endian-ness.

I doubt that this would cause the problem, as I often transfer
pictures between an Amiga (680xx) and a PC (though not by ftp).

Kenneth


Herman Rubin

Dec 10, 1993, 5:16:12 AM
In article <1993Dec9.10...@lumina.edb.tih.no> ke...@edb.tih.no (Ketil Albertsen,TIH) writes:
>In article <CHp2M...@mentor.cc.purdue.edu>, hru...@snap.stat.purdue.edu (Herman Rubin) writes:

>[...]
>>As long as there is any difference in the representation
>>in different machines, some sort of conversion will have to be made.
>>But some sort of a "binary" standard of communication for "real"
>>numbers will help the process.

>Notice that there IS such a standard; it is used by all OSI application
>protocols to which it is relevant, and is called the BER - the Basic
>Encoding Rules. (It covers "all" data types - not only numeric formats.)

>True enough: BER belongs in a family of standards that, as a whole, is
>rather complex, to say the least. Also true: Converting to and from BER
>is not cheap (compared to eg. XDR transmission format on a 68K machine,
>on which you more or less dump the 68K's native binary format to the line
>and let machines based on all other processors take responsibility for
>converting. Gives SUN machines some very nice benchmark results, though!).
>BER does not fit into any processors' native format, it will always require
>conversion. But done properly, it isn't *that* costly.

I have seen such standards based on decimal conversion and other such
horrors. I have NOT seen in any proposed standard a real binary, except
for standards based on IEEE, which are totally unacceptable as being too
limited.

What is needed is a standard for binary fixed-point numbers and floats
which is as easy and as powerful to use as the 0x notation for hex integers,
and which requires no use whatever of the idea of decimals. We may even
want to allow full-byte characters in the transmission, but it should be
as straightforward for someone who grew up using any other base as for
the decimal overenthusiasts.

If such a standard is produced, it will be relatively cheap to use.
At this time, all machines use binary for all except a few purposes,
so a binary protocol will be machine friendly, and at least to people
like me, user friendly as well.
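
One notation in this spirit is the hexadecimal floating-point string,
which writes the significand in hex, much like 0x for integers. A short
Python sketch using the built-in float.hex() and float.fromhex(); note
that the exponent after "p" is still written in decimal, so it only
partly meets the "no decimals" requirement.

```python
x = 0.1
text = x.hex()                   # '0x1.999999999999ap-4'
back = float.fromhex(text)       # exact round trip of the binary value
three = float.fromhex("0x1.8p+1")  # 1.5 * 2**1 = 3.0
```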

David Gay

Dec 10, 1993, 8:49:12 AM

In article <1993Dec9.10...@lumina.edb.tih.no> ke...@edb.tih.no (Ketil Albertsen,TIH) writes:
>>As long as there is any difference in the representation
>>in different machines, some sort of conversion will have to be made.
>>But some sort of a "binary" standard of communication for "real"
>>numbers will help the process.
>
>Notice that there IS such a standard; it is used by all OSI application
>protocols to which it is relevant, and is called the BER - the Basic
>Encoding Rules. (It covers "all" data types - not only numeric formats.)

It only handles "traditional" types (records, arrays, etc) however.
And its syntax is downright bad in places. I don't want to go into
details, but a few keywords such as 'tags' (which are completely
useless), 'choices', 'values' (interesting syntax if you want to
exercise your parser writing skills) should ring bells with those
familiar with ASN.1.

>True enough: BER belongs in a family of standards that, as a whole, is
>rather complex, to say the least. Also true: Converting to and from BER
>is not cheap (compared to e.g. the XDR transmission format on a 68K machine,
>on which you more or less dump the 68K's native binary format to the line
>and let machines based on all other processors take responsibility for
>converting. Gives SUN machines some very nice benchmark results, though!).
>BER does not fit into any processor's native format; it will always require
>conversion. But done properly, it isn't *that* costly.

Not only is conversion not particularly cheap, but it contains
significant quantities of useless information which increase the
necessary size of the transmitted data. If I had to choose between
ASN.1 and XDR, I would select XDR any day.

>ASN.1, the language to specify "what" to transmit, accompanying the BER,
>which specifies "how" to transmit it, is slowly creeping into the Internet
>protocol specs. (In OSI protocols, it is The Way to do specs). Unless the
>Internet community once more runs into the Not Invented Here syndrome,
>we can hope for ASN.1/BER to become a common standard for data representation
>in all major networks over the next few years. (Maybe I am overly optimistic...)

If NIH led to the use of something better than ASN.1, it would for once
have produced a positive result ...

David Gay
dg...@di.epfl.ch

Rogers Huw

Dec 13, 1993, 4:35:52 AM
to ke...@edb.tih.no
In article <1993Dec9.10...@lumina.edb.tih.no> you write:
>In article <CHp2M...@mentor.cc.purdue.edu>, hru...@snap.stat.purdue.edu (Herman Rubin) writes:
>[...]
>>As long as there is any difference in the representation
>>in different machines, some sort of conversion will have to be made.
>>But some sort of a "binary" standard of communication for "real"
>>numbers will help the process.

>Notice that there IS such a standard; it is used by all OSI application
>protocols to which it is relevant, and is called the BER - the Basic
>Encoding Rules. (It covers "all" data types - not only numeric formats.)

>True enough: BER belongs in a family of standards that, as a whole, is
>rather complex, to say the least. Also true: Converting to and from BER
>is not cheap

[munch]

>ASN.1, the language to specify "what" to transmit, accompanying the BER,
>which specifies "how" to transmit it, is slowly creeping into the Internet
>protocol specs. (In OSI protocols, it is The Way to do specs). Unless the
>Internet community once more runs into the Not Invented Here syndrome,
>we can hope for ASN.1/BER to become a common standard for data representation
>in all major networks over the next few year. (Maybe I am overly optimistic...)

Pass the sick bag, Alice...

This has to be the worst, most unutterably repellent thought I have
seen in a long time. Anyone who has had to deal with SNMP should know
how bad the BER, and concomitant horrors like "software presentation
layers" are. BER is a triumph of bloated theory over clean
implementation.

Repeat after me:

"Good software is small.
Good software is clean.
Good software is fast.
Good software is designed with implementation considerations in mind."

None of which applies to the BER, or to any protocol which explicitly
requires it. There are other, much better, ways of encoding ASN.1 for
transmission.

Just one of the many misfeatures of BER, the design of a protocol
around bitstreams with arbitrary boundaries between data which can only
be determined at a high level, is ignoring real world architectural
considerations in a way worthy of only the most anal retentive of
theoreticians. Note: this is not to criticise variable word data
compression, which can be totally hidden from the protocol implementer,
is relatively simple to implement, and is not that expensive anyway.

As a matter of holy principle, Internet communication protocols (once
decompressed if that is applicable) should either be directly
addressable as C data structures (with use of "network byte order"), or
accessed as human read/writeable text streams (SMTP/NNTP).
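
A sketch of the "fixed structure in network byte order" style, using
Python's struct module to stand in for a C struct. The header layout
(16-bit type, 16-bit length, 32-bit sequence number) and the function
names are hypothetical, chosen only to show the idea.

```python
import struct

# Hypothetical header: 16-bit type, 16-bit length, 32-bit sequence number.
HEADER = struct.Struct("!HHI")   # "!" = network (big-endian) order, no padding

def pack_header(msg_type, length, seq):
    """Serialise the three fields into 8 bytes in network byte order."""
    return HEADER.pack(msg_type, length, seq)

def unpack_header(data):
    """Read the three fields back from the first 8 bytes of data."""
    return HEADER.unpack(data[:HEADER.size])

wire = pack_header(1, 8, 0x01020304)   # 00 01 00 08 01 02 03 04
```

Every field sits at a fixed offset, so a receiver can pick out any one of
them without parsing the rest.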

Avoid BER, and save the Internet.

Hell,
"It's the only way to be sure"

DISCLAIMER: _Nothing_ written above has anything to do with my employer.

Miriam Jaffe

Dec 16, 1993, 5:01:10 PM

In article <1993Dec9....@sa.erisoft.se>,
Kenneth Ekman <Kennet...@sa.erisoft.se> wrote:
>In article 2e2qrt...@umbc8.umbc.edu, mja...@gl.umbc.edu (Miriam Jaffe) writes:
>>Can one of you experts on endian-ness help me with a
>>question I have been asked to research which is quite
>>outside of my area of expertise?
>>
>>If graphic images are stored on a Mac, and someone on
>>a machine of opposite endian-ness uses FTP to retrieve
>>the binary data, how can the data be used on the second
>>machine?

Thank you for your many letters in response to my recent
post about file transfers and endian-ness. Just in case
any of you were curious, here's a summary of the case:

Researchers using scientific imaging equipment which
generated files in a format called .gel transferred
those files via ftp to both PCs and Macs. The equipment
itself contains a PC, so the transfers from the machine
to the PCs were PC-to-PC transfers. The researchers
noticed that the transmission rates were wildly different
if they were going to the PC or to the Mac, first of all,
and also that they could not read the images on the Mac.
The transfer rates in question were something like 172 Kb/sec
for the PC/PC transfer and .84 Kb/sec for the PC/Mac transfer.

As you all noted, the transfer mode has to be set to binary
for ftp. We knew this, and we'd told the researchers this,
but they insisted they'd been doing it all correctly. When
we finally went over there to investigate personally and to
try to reproduce the problem, we concluded (from their
sheepish expressions and their insistence that they'd 'fooled
around with it and gotten it to work') that they'd indeed
been forgetting to set the ftp to binary.

The transfer problem turned out to be twofold: an old version
of ftp on their LAN, and a mis-set parameter for the packet
size to be transmitted. Seems the Mac couldn't take as much
as it was being fed. We reduced the packet size and got the
transfer rates up to about 50 Kb/sec Mac to PC, and about 24
Kb/sec PC to Mac. That seemed right as the Mac can put things
onto the LAN faster than it can take them in, and the difference
between the PC/PC rates and the PC/Mac rates would be the result of
all the translation.

Did I get all of that right? I'm a newbie. :)

Miriam

Guy Harris

Dec 16, 1993, 5:39:31 PM
>Also true: Converting to and from BER
>is not cheap (compared to eg. XDR transmission format on a 68K machine,
>on which you more or less dump the 68K's native binary format to the line
>and let machines based on all other processors take responsibility for
>converting. Gives SUN machines some very nice benchmark results, though!).

I can't speak for "SUN" machines, as I know of no computer maker named
"SUN", but the machines from Sun Microsystems (as opposed to Stanford
University Network Microsystems) are these days based on "other
processors" - i.e., SPARC processors. Those machines do not have to
"take responsibility for converting" to a greater degree than do Sun's
old 68K machines.

XDR isn't 68K-specific. It's a big-endian, IEEE floating-point format,
which is used by at least some "other processors". It's not used by
*all* "other processors", though - e.g., x86 is little-endian, as are
most if not all Alpha systems, and DEC's M[Ii][Pp][Ss]-based systems.
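
A minimal sketch of that layout in Python, using struct directly rather
than any XDR library: a 32-bit signed integer followed by a double, both
big-endian, the double in IEEE 754 format. The function names are
hypothetical.

```python
import struct

def xdr_encode(n, x):
    """Encode a 32-bit signed int and a double the way XDR lays them out."""
    # ">i": 4-byte big-endian integer; ">d": 8-byte big-endian IEEE 754 double.
    return struct.pack(">i", n) + struct.pack(">d", x)

def xdr_decode(data):
    """Decode the 12 bytes produced by xdr_encode."""
    n = struct.unpack(">i", data[:4])[0]
    x = struct.unpack(">d", data[4:12])[0]
    return n, x

wire = xdr_encode(1, 1.0)   # 00 00 00 01 3f f0 00 00 00 00 00 00
```

On a big-endian IEEE machine these bytes match the in-memory form; a
little-endian host such as x86 has to swap them on the way in and out.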

David B. Gustavson

Dec 16, 1993, 8:54:30 PM
In article <CHtD...@mentor.cc.purdue.edu>, hru...@snap.stat.purdue.edu
(Herman Rubin) wrote:
>
> In article <1993Dec9.10...@lumina.edb.tih.no> ke...@edb.tih.no (Ketil Albertsen,TIH) writes:
> >In article <CHp2M...@mentor.cc.purdue.edu>, hru...@snap.stat.purdue.edu (Herman Rubin) writes:
>
> >[...]
>
> I have seen such standards based on decimal conversion and other such
> horrors. I have NOT seen in any proposed standard a real binary, except
> for standards based on IEEE, which are totally unacceptable as being too
> limited.
>
> What is needed is a standard for binary fixed-point numbers and floats
> which is as easy and as powerful to use as the 0x notation for hex integers,
> and which requires no use whatever of the idea of decimals. We may even
> want to allow full-byte characters in the transmission, but it should be
> as straightforward for someone who grew up using any other base than for
> the decimal overenthusiasts.
>
> If such a standard is produced, it will be relatively cheap to use.
> At this time, all machines use binary for all except a few purposes,
> so a binary protocol will be machine friendly, and at least to people
> like me, user friendly as well.
> --
> Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907-1399
> Phone: (317)494-6054
> hru...@snap.stat.purdue.edu (Internet, bitnet)
> {purdue,pur-ee}!snap.stat!hrubin(UUCP)

I presume your "IEEE" reference is to IEEE Std 754, the famous
floating-point standard. In fact, that standard does not define an
interchange format (I fought to add that part, but lost).

There are many IEEE standards.

A new one, IEEE Std 1596.5-1993, was designed specifically to make binary
interchange of data among heterogeneous machines as simple and efficient as
possible. Contact d...@apple.com for details.

Raul LopezHernandez

Dec 17, 1993, 7:29:28 PM
In article <2eqlr6...@umbc8.umbc.edu> mja...@gl.umbc.edu (Miriam Jaffe) writes:
[...]

>Researchers using scientific imaging equipment which
[...]

>to the PCs were PC-to-PC transfers. The researchers
>noticed that the transmission rates were wildly different
[...]

>for ftp. We knew this, and we'd told the researchers this,
[...]

Good case, but I wonder if Architects, Secretaries, Gardeners,
Teachers or Managers would have had the same problem... :)

My point is that there is a lot to do when it comes to interfaces
and manuals and training before anybody can take advantage of our
precious architectural work and I think that we should do something
about that.
It is not good enough that one is "an architect for a microprocessor
that would sell in the thousands" like somebody described himself to me a few
months ago. I would propose that we architects spend a little more of our
day interacting with real people and trying to get a real idea if our
architectures are going to sell and satisfy real market needs and that
our products are easy to use and upgrade.

RAUL IZAHI
--
------------------- I'm solely responsible for my postings -----------------
Raul Izahi Lopez Hernandez C-Cube Microsystems, Milpitas, CA, U.S.A.
"Real-time or Never!" GUADALAJARA - PALO ALTO - BERGEN - PALO ALTO
Favourite food while in: Pozole Prawns Whale Prawns
