Re: [julia-dev] Endianness


Stefan Karpinski

Dec 10, 2012, 12:56:50 PM
to Julia Dev
Definitely a worthwhile goal. At the moment, I'm not aware of any big-endian architectures that Julia even runs on. Are there any?


On Mon, Dec 10, 2012 at 12:42 PM, Pierre-Yves Gérardy <pyg...@gmail.com> wrote:
Hello guys,

I've seen that the sound.jl extra library only works on little-endian processors.

Given the language's scope, Julia users will often have to deal with binary data, and providing endian conversion routines would be very useful.

Here are two snippets by Rob Pike (of Plan 9 and Go fame) that load binary data of a known byte order on any kind of processor:

Let's say your data stream has a little-endian-encoded 32-bit integer. Here's how to extract it (assuming unsigned bytes):
 
    i = (data[0]<<0) | (data[1]<<8) | (data[2]<<16) | (data[3]<<24);
 
If it's big-endian, here's how to extract it:
 
    i = (data[3]<<0) | (data[2]<<8) | (data[1]<<16) | (data[0]<<24);
 
Both these snippets work on any machine, independent of the machine's byte order, independent of alignment issues, independent of just about anything. They are totally portable, given unsigned bytes and 32-bit integers.
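
One caveat if the snippets are pasted into real C: an `unsigned char` operand is promoted to `int` before shifting, so `data[3]<<24` can shift into the sign bit when the byte is 0x80 or larger. Below is a minimal sketch that widens each byte to `uint32_t` first. The names `load_le32` and `load_be32` are made up for illustration and do not come from Pike's post.

```c
#include <stdint.h>

/* Decode a 32-bit unsigned integer stored least-significant byte first. */
static uint32_t load_le32(const unsigned char *data)
{
    return ((uint32_t)data[0] << 0)  |
           ((uint32_t)data[1] << 8)  |
           ((uint32_t)data[2] << 16) |
           ((uint32_t)data[3] << 24);
}

/* Decode a 32-bit unsigned integer stored most-significant byte first. */
static uint32_t load_be32(const unsigned char *data)
{
    return ((uint32_t)data[3] << 0)  |
           ((uint32_t)data[2] << 8)  |
           ((uint32_t)data[1] << 16) |
           ((uint32_t)data[0] << 24);
}
```

Neither function inspects the host's byte order, which is the property the quoted text emphasizes.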

I suppose that they could be encoded as macros...

Cheers,
-- Pierre-Yves


Patrick O'Leary

Dec 10, 2012, 1:03:14 PM
to juli...@googlegroups.com
Any of the built-in bits types should be able to use ltoh() or ntoh() to do what Pike's code does for little-endian and big-endian input streams, respectively. That's all that StrPack does.

Tim Holy

Dec 10, 2012, 1:32:31 PM
to juli...@googlegroups.com
On Monday, December 10, 2012 09:42:17 AM Pierre-Yves Gérardy wrote:
> Because of the language scope, Julia users will often have to deal with
> binary data, and providing endian conversion routines would be very useful.

If you happen to pick the right search terms...

julia> apropos("swap")
Loading help data...
Base.fftshift(x)
Base.fftshift(x, dim)
Base.bswap(n)

julia> help("bswap")
Base.bswap(n)

Byte-swap an integer

There are also htol, ntoh, etc., but I don't think they're documented.
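
For anyone who has not met byte swapping before, here is a rough C sketch of the operation on a 32-bit value. `bswap32_sketch` is a made-up name, and this is not meant to show how Julia implements `bswap`; it only illustrates the effect.

```c
#include <stdint.h>

/* Reverse the byte order of a 32-bit value using masks and shifts. */
static uint32_t bswap32_sketch(uint32_t n)
{
    return ((n & 0x000000FFu) << 24) |  /* lowest byte moves to the top   */
           ((n & 0x0000FF00u) << 8)  |
           ((n & 0x00FF0000u) >> 8)  |
           ((n & 0xFF000000u) >> 24);   /* highest byte moves to the bottom */
}
```

Swapping twice gives back the original value, so the same routine converts in both directions.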

--Tim

Patrick O'Leary

Dec 10, 2012, 1:39:47 PM
to juli...@googlegroups.com
The point of Pike's essay, though, is that bswap() is never what you actually want, because code that has to check the host's byte order tends to be fragile. Julia's ltoh/ntoh are still a bit fragile in that sense, but at least the host check is centralized there, and it could be eliminated by using Pike's code instead.
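
To make the contrast concrete, here is a C sketch of the "check the host, then maybe swap" style of conversion. It is a guess at the general shape of such helpers, not Julia's actual ltoh implementation, and all three function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 on a little-endian host, 0 otherwise. This probe is the
   host-dependent step that the shift-based approach avoids. */
static int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Reverse the byte order of a 32-bit value. */
static uint32_t swap32(uint32_t n)
{
    return (n << 24) | ((n & 0x0000FF00u) << 8) |
           ((n >> 8) & 0x0000FF00u) | (n >> 24);
}

/* "Little-endian to host": copy the raw bytes into an integer, then
   swap only when the host is big-endian. */
static uint32_t ltoh32_sketch(const unsigned char *data)
{
    uint32_t raw;
    memcpy(&raw, data, sizeof raw);
    return host_is_little_endian() ? raw : swap32(raw);
}
```

The result matches the shift-and-or version on either kind of host; the difference is that this one needs a correct host probe to get there.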

Stefan Karpinski

Dec 10, 2012, 1:47:27 PM
to Julia Dev
Here is Rob Pike's post in case anyone wants to read it. It has been linked to on this list in the past.



Jeffrey Sarnoff

Dec 11, 2012, 7:21:25 PM
to juli...@googlegroups.com
Are serialize/deserialize above this level of concreteness, or not?

Jeff Bezanson

Dec 12, 2012, 4:24:13 AM
to juli...@googlegroups.com
Yes, ideally [de]serialize would just work in all cases.
To me the only issue is the mechanics of knowing and communicating the
endianness of a stream. We would have to set a flag on the stream
(gaaaa! more state!!) perhaps, and have a way to indicate it in the
serialized format (though it might be easier to just declare it always
little endian).

Then there is performance. In the case of a big array, I suspect the
fastest thing is to read it as one big chunk, then use the CPU's bswap
on each word in a tight loop. That is where swapping comes into play.
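
The bulk case described above might look roughly like this in C, assuming the array has already been read into memory in the wrong byte order. `bswap32_array` is a hypothetical name; many compilers recognize this shift pattern and emit a single byte-swap instruction, but that is not guaranteed.

```c
#include <stddef.h>
#include <stdint.h>

/* Byte-swap every 32-bit word of an already-loaded array, in place. */
static void bswap32_array(uint32_t *words, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        uint32_t n = words[i];
        words[i] = (n << 24) |
                   ((n & 0x0000FF00u) << 8) |
                   ((n >> 8) & 0x0000FF00u) |
                   (n >> 24);
    }
}
```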

We could also provide read_le(io, T) and read_be(io, T) functions, in
case you really do just want to decode an int or two out of a stream
of known endianness.
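
A sketch of what such readers could look like, written in C over a small in-memory stream for the sake of a self-contained example. The `byte_stream` type and the `read_le32`/`read_be32` names are assumptions for illustration, not an existing API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical minimal input stream: a byte buffer plus a read position. */
typedef struct {
    const unsigned char *data;
    size_t len;
    size_t pos;
} byte_stream;

/* Read a little-endian 32-bit unsigned integer.
   Returns 1 on success, 0 if fewer than 4 bytes remain. */
static int read_le32(byte_stream *io, uint32_t *out)
{
    if (io->len - io->pos < 4)
        return 0;
    const unsigned char *b = io->data + io->pos;
    *out = (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
    io->pos += 4;
    return 1;
}

/* Read a big-endian 32-bit unsigned integer, same return convention. */
static int read_be32(byte_stream *io, uint32_t *out)
{
    if (io->len - io->pos < 4)
        return 0;
    const unsigned char *b = io->data + io->pos;
    *out = (uint32_t)b[3] | ((uint32_t)b[2] << 8) |
           ((uint32_t)b[1] << 16) | ((uint32_t)b[0] << 24);
    io->pos += 4;
    return 1;
}
```

The caller states the stream's byte order by choosing the function, so no flag has to live on the stream itself.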

Jeffrey Sarnoff

Dec 12, 2012, 8:16:36 AM
to juli...@googlegroups.com
I very much want to avoid dealing with endianness in serialize/deserialize.
I would always serialize in little-endian format and let deserialize do a conditional check for a big-endian host (like the WORD_SIZE check) that, if true, maps the stream into big-endian land.
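
As a sketch of the always-little-endian idea in C: if both directions are written with shifts, the bytes on the wire are little-endian regardless of the host, and the reader needs no conditional at all. `store_le32` and `fetch_le32` are hypothetical names, and whether Julia's serializer should work this way is left open.

```c
#include <stdint.h>

/* Write a 32-bit value into 4 bytes, least-significant byte first. */
static void store_le32(unsigned char *out, uint32_t value)
{
    out[0] = (unsigned char)(value & 0xFFu);
    out[1] = (unsigned char)((value >> 8) & 0xFFu);
    out[2] = (unsigned char)((value >> 16) & 0xFFu);
    out[3] = (unsigned char)((value >> 24) & 0xFFu);
}

/* Read back a value written by store_le32, on any host. */
static uint32_t fetch_le32(const unsigned char *in)
{
    return (uint32_t)in[0] | ((uint32_t)in[1] << 8) |
           ((uint32_t)in[2] << 16) | ((uint32_t)in[3] << 24);
}
```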