Endian tests & byteswap?

Tim Holy

Apr 5, 2012, 5:41:05 AM
to juli...@googlegroups.com
I'm having a bit of trouble figuring out how unbox16 and friends work, and
since I don't have a big-endian computer to test on...

Will this work?
function isbigendian()
    unbox16(0x0001) == 256 ? true : false
end
or would endian tests & byte-swapping need to be implemented in C?

--Tim

Stefan Karpinski

Apr 5, 2012, 4:31:47 PM
to juli...@googlegroups.com
Ah, someone has gotten into the unbox and box features. These are very low-level intrinsics that should *never* be called by code that's not bootstrapping the julia system itself. What unboxN does is take a first-class N-bit bitstype value and "convert" it into a non-first-class, unboxed chunk of bits in memory, which is *only* something that the Julia intrinsics defined in src/intrinsics.cpp can operate on. These intrinsics are not C functions like the ones ccall calls. They are named "functions" that can only operate on unboxed values, but they don't emit a function call when Julia compiles them; instead, they construct a little piece of LLVM syntax tree. Basically, you should forget about all the box and unbox functions ;-)

You can check endianness using the array reinterpret mechanism, however:

julia> reinterpret(Uint16,uint8(1:2))[1]
0x0201

julia> reinterpret(Uint32,uint8(1:4))[1]
0x04030201

julia> reinterpret(Uint64,uint8(1:8))[1]
0x0807060504030201

I'm pushing a commit that exposes the middle value as ENDIAN_BOM. However, I'm wondering what system you're possibly running Julia on that is anything but little-endian...
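
For anyone reading this with a current Julia (1.x), here is a rough sketch of the same reinterpret check. It assumes the modern spellings `UInt8`/`UInt32` rather than the 2012-era `Uint32` and `uint8(...)` used in the session above; the variable names are just for illustration.

```julia
# Sketch for Julia 1.x; variable names are illustrative only.
bytes = UInt8[0x01, 0x02, 0x03, 0x04]

# Reinterpret the four bytes as one 32-bit integer in host byte order.
word = reinterpret(UInt32, bytes)[1]

# 0x04030201 means the first byte is the least significant (little-endian);
# 0x01020304 means the first byte is the most significant (big-endian).
host_is_little_endian = word == 0x04030201
host_is_big_endian = word == 0x01020304

# ENDIAN_BOM in Base holds the same 32-bit marker.
@assert word == ENDIAN_BOM

println(host_is_little_endian ? "little-endian host" : "big-endian host")
```

No box/unbox intrinsics are involved; the check only looks at how the host lays out an ordinary array in memory.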

Stefan Karpinski

Apr 5, 2012, 4:48:38 PM
to juli...@googlegroups.com
On a related note, I was just reading this:

http://commandcenter.blogspot.com/2012/04/byte-order-fallacy.html

Rob Pike kind of knows what he's talking about ;-)

Tim Holy

Apr 5, 2012, 6:04:48 PM
to juli...@googlegroups.com
On Thursday, April 05, 2012 03:48:38 pm Stefan Karpinski wrote:
> On a related note, was just reading this:
>
> http://commandcenter.blogspot.com/2012/04/byte-order-fallacy.html
>
> Rob Pike kind of knows what he's talking about ;-)

Here it's more an issue of properly supporting particular image file types,
like UVF
(http://www.sci.utah.edu/devbuilds/workshop11/docs/GettingDataIntoImageVis3D.pdf),
which have a field that encodes the endianness of the raw data stored on
disk. If someone were to acquire and store binary data on a big-endian
machine, it would have to be byteswapped before it would make any sense on a
little-endian machine.

--Tim
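
A minimal sketch of that byte-swapping step in current Julia (1.x), assuming a made-up four-byte payload and a hypothetical `file_is_big_endian` flag standing in for the endianness field in the file header. It is not a UVF reader, just the swap itself using `ntoh` and `bswap` from Base.

```julia
# Hypothetical raw payload: two UInt16 samples (1 and 258) stored big-endian.
raw = UInt8[0x00, 0x01, 0x01, 0x02]

# Element by element: read() interprets bytes in host order, and ntoh()
# converts from big-endian ("network") order to host order, swapping
# only when the host is little-endian.
io = IOBuffer(raw)
samples = [ntoh(read(io, UInt16)) for _ in 1:2]

# Whole array at once: swap only when file order and host order differ.
file_is_big_endian = true                      # assumed to come from the header
host_is_big_endian = ENDIAN_BOM == 0x01020304
data = collect(reinterpret(UInt16, raw))
if file_is_big_endian != host_is_big_endian
    data = bswap.(data)
end

@assert samples == UInt16[1, 258]
@assert data == samples
```

Both variants give the same values regardless of which kind of machine runs them.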


Stefan Karpinski

Apr 5, 2012, 6:28:48 PM
to juli...@googlegroups.com
Right, right. This is the distinction that Rob Pike makes: it is important to know what the endianness of a data stream is, but not what the endianness of your machine is. That's a subtle distinction, but an important one. That being said, I've certainly written a lot of code that uses ntoh and hton functions.
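
To make that distinction concrete, here is a small sketch in current Julia (1.x) of decoding a 32-bit value from a byte stream using only shifts, in the spirit of the blog post. The function names `decode_le32` and `decode_be32` are made up for this example.

```julia
# Decode a 32-bit value from the first four bytes of a stream whose byte
# order is known. Only the stream's order appears here; the host's
# endianness is never consulted.
decode_le32(b) = UInt32(b[1]) | (UInt32(b[2]) << 8) |
                 (UInt32(b[3]) << 16) | (UInt32(b[4]) << 24)

decode_be32(b) = UInt32(b[4]) | (UInt32(b[3]) << 8) |
                 (UInt32(b[2]) << 16) | (UInt32(b[1]) << 24)

stream = UInt8[0x78, 0x56, 0x34, 0x12]

@assert decode_le32(stream) == 0x12345678
@assert decode_be32(stream) == 0x78563412
```

The same code gives the same answers on a little-endian or big-endian machine, because shifts operate on values rather than on memory layout.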

Tim Holy

Apr 5, 2012, 7:27:22 PM
to juli...@googlegroups.com
On Thursday, April 05, 2012 05:28:48 pm Stefan Karpinski wrote:
> Right, right. This is the distinction that Rob Pike makes: it is important
> to know what the endianness of a data stream is, but not what the
> endianness of your machine is. That's a subtle distinction, but an
> important one. That being said, I've certainly written a lot of code that
> uses ntoh and hton functions.

OK, when I wrote that last message I was trying to catch a train. Now I've
actually looked at it :-). Interesting indeed, and not what I expected!

Should the converse apply when you write a file? E.g., always write in little
endian format and say so, rather than writing in native order? (I'm sure this
is 102% obvious to you, but if you write in native order, then you'd better
test the endian status of the machine you're on.)

Thanks for the pointer.

--Tim


Patrick O'Leary

Apr 5, 2012, 7:43:15 PM
to juli...@googlegroups.com
On Thursday, April 5, 2012 6:27:22 PM UTC-5, Tim wrote:
> On Thursday, April 05, 2012 05:28:48 pm Stefan Karpinski wrote:
> > Right, right. This is the distinction that Rob Pike makes: it is important
> > to know what the endianness of a data stream is, but not what the
> > endianness of your machine is. That's a subtle distinction, but an
> > important one. That being said, I've certainly written a lot of code that
> > uses ntoh and hton functions.
>
> OK, when I wrote that last message I was trying to catch a train. Now I've
> actually looked at it :-). Interesting indeed, and not what I expected!
>
> Should the converse apply when you write a file? E.g., always write in little
> endian format and say so, rather than writing in native order? (I'm sure this
> is 102% obvious to you, but if you write in native order, then you'd better
> test the endian status of the machine you're on.)

From personal experience, life is better if you declare your wire protocol (in general, your serialization format) with a well-defined byte ordering. You're probably I/O-bound anyway, so the cost of any byte swapping is negligible.
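
As a sketch of what a declared byte order can look like in current Julia (1.x), assuming the format is defined as little-endian: the helper names `write_u32_le` and `read_u32_le` are made up for this example, while `htol` and `ltoh` are the host-to-little-endian and little-endian-to-host conversions in Base.

```julia
# Hypothetical helpers for a format that is declared to be little-endian.
# htol()/ltoh() are no-ops on little-endian hosts and byte swaps on
# big-endian hosts, so the bytes on disk are the same either way.
write_u32_le(io::IO, x::UInt32) = write(io, htol(x))
read_u32_le(io::IO) = ltoh(read(io, UInt32))

io = IOBuffer()
write_u32_le(io, 0x12345678)
bytes = take!(io)

# Least significant byte first, whatever the host is.
@assert bytes == UInt8[0x78, 0x56, 0x34, 0x12]

# Round trip back to the original value.
@assert read_u32_le(IOBuffer(bytes)) == 0x12345678
```

The writer never has to test the host's endianness explicitly; the conversion functions take care of it.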

Stefan Karpinski

Apr 5, 2012, 10:37:44 PM
to juli...@googlegroups.com
Yes, absolutely. Wire protocols need a well-defined byte order. It's the *code* that doesn't need to be byte-order aware. That's the point of Rob Pike's blog post.