Index: docs/parrotbyte.pod
===================================================================
--- docs/parrotbyte.pod	(revision 9235)
+++ docs/parrotbyte.pod	(working copy)
@@ -7,8 +7,33 @@
 
 =head1 Format of the Parrot bytecode
 
+Parrot's bytecode format consists of a small endian neutral header region
+followed by a series of segments.  ALL words (non-bytes) following the header
+are stored in native order, unless otherwise specified.
+
+=head1 PBC Header
+
+The PBC header is a fixed 32 bytes in length.  Header values are all encoded as
+either a single byte or a string so that it can be parsed without having to
+consider the endianness of the data.
+
   0          1          2          3
  +----------+----------+----------+----------+
+ |   0xfe       0x50       0x42       0x43   |
+ +----------+----------+----------+----------+
+ |   0x0d       0x0a       0x1a       0x0a   |
+ +----------+----------+----------+----------+
+
+The header begins with an eight byte I or I.
+This is equivalent to the C strings C<\376PBC\r\n\032\n> (ASCII) and
+C<\xfe\x50\x42\x43\x0d\x0a\x1a\x0a> sans the terminating C bytes.  Bytes
+0 and 4-7 are designed to catch common types of file corruption caused by
+transport encoding mechanisms (for example, FTP ASCII transfers).  This format
+was inspired by the PNG Specification.  Please see RFC 2083 for an explanation
+of the advantages of this strategy.
+
+  8          9          10         11
+ +----------+----------+----------+----------+
  | Wordsize | Byteorder|  Major   |  Minor   |
  +----------+----------+----------+----------+
 
@@ -20,7 +45,7 @@
 
 Byteorder currently supports two values: (0-Little Endian, 1-Big Endian)
 
-  4          5
+  12         13         14
  +----------+----------+----------+----------+
  | INT size | FloatType| 10 Byte ...         |
  +----------+----------+----------+----------+
@@ -29,26 +54,40 @@
  | core.ops is here                          |
  +----------+----------+----------+----------+
 
-INT size (sizeof(INTVAL)) must be 4 or 8.  FloatType 0 is IEEE 754 8 byte
+INT size (C<sizeof(INTVAL)>) must be 4 or 8.  FloatType 0 is IEEE 754 8 byte
 double, FloatType 1 is i386 little endian 12 byte long double.
 
-  16
+
+  20         21         22         23
  +----------+----------+----------+----------+
- | Parrot Magic = 0x 13155a1                 |
+ | padding                                   |
  +----------+----------+----------+----------+
-
-Magic is stored in native byteorder. The loader uses the byteorder header to
-convert the Magic to verify. More specifically, ALL words (non-bytes) in the
-bytecode file are stored in native order, unless otherwise specified.
-
-  20*
+ |                                           |
  +----------+----------+----------+----------+
- | Opcode Type (Perl = 0x5045524c)           |
+ |                                           |
  +----------+----------+----------+----------+
 
-The asterisk for the offset states, from here we have opcodes. The given
-offsets are for 32 bit opcode types only.
+Following the core.ops fingerprint, the header I be padded with C
+bytes to be an overall 32 bytes in length.
 
+All words following the header will be interpreted as Op codes.
+
+=head2 Magic Description
+
+The following is C description of the PBC Header format.
+
+  0     string  \xfe\x50\x42\x43\x0d\x0a\x1a\x0a  Parrot Bytecode (PBC)
+  >10   byte    x
+  >11   byte    x       version %2$d.%1$d,
+  >8    byte    x       wordsize is %d bytes,
+  >9    byte    =0      byteorder is little endian,
+  >9    byte    =1      byteorder is big endian,
+  >9    byte    >1      byteorder is unknown,
+  >12   byte    x       integers are %d bytes,
+  >13   byte    =0      floats are IEEE 754
+  >13   byte    =1      floats are i387 96-bit
+  >13   byte    >1      float type is unknown
+
 =head1 PBC FORMAT 1
 
 All segments are aligned at a 16 byte boundary. All segments share a common
@@ -293,6 +332,12 @@
 Eventually there will be a more complete and useful PackFile specification,
 but this simple format works well enough for now (c. Parrot 0.0.5).
 
+=head1 REFERENCES
+
+=head2 RFC 2083
+
+L
+
 =head1 SEE ALSO
 
 F, F, F, F, F, and the
@@ -306,7 +351,9 @@
 
 Variable argument opcodes update by Jonathan Worthington C
 
+The header format was mangled by Joshua Hoblitt (JHOBLITT) C
+
 =head1 VERSION
 
-2005.09.19
+2005.09.25
 
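
Because every field in the proposed header is a single byte or a byte string, a reader needs no endianness handling to decode it. The following is a minimal Python sketch of such a reader, assuming the byte offsets shown in the diagrams of the patch (signature at 0-7, Wordsize, Byteorder, Major and Minor at 8-11, INT size and FloatType at 12-13). The names `parse_pbc_header` and `PBCHeader` are hypothetical and are not part of Parrot; bytes 14-31 (core.ops fingerprint and padding) are returned uninterpreted.

```python
import struct
from typing import NamedTuple

# 8-byte signature from the header diagram: 0xfe 'P' 'B' 'C' CR LF 0x1a LF
PBC_MAGIC = b"\xfePBC\r\n\x1a\n"
HEADER_LEN = 32  # the patch fixes the header at 32 bytes


class PBCHeader(NamedTuple):
    """Hypothetical container for the decoded header fields."""
    wordsize: int
    byteorder: str   # "little" or "big"
    major: int
    minor: int
    int_size: int
    float_type: int
    rest: bytes      # bytes 14-31: fingerprint and padding, not interpreted


def parse_pbc_header(data: bytes) -> PBCHeader:
    """Decode the fixed header; raise ValueError on anything unexpected."""
    if len(data) < HEADER_LEN:
        raise ValueError("truncated header")
    if data[:8] != PBC_MAGIC:
        # Also triggers when a transfer rewrote CR LF or stripped the high bit.
        raise ValueError("bad signature")

    # Six unsigned bytes starting at offset 8; no byte order is involved.
    wordsize, byteorder, major, minor, int_size, float_type = \
        struct.unpack_from("6B", data, 8)

    if byteorder not in (0, 1):
        raise ValueError("unknown byteorder")
    if int_size not in (4, 8):
        raise ValueError("INT size must be 4 or 8")

    return PBCHeader(
        wordsize=wordsize,
        byteorder="little" if byteorder == 0 else "big",
        major=major,
        minor=minor,
        int_size=int_size,
        float_type=float_type,
        rest=data[14:HEADER_LEN],
    )
```

Under these assumptions, a caller would read the first 32 bytes of a bytecode file, pass them to `parse_pbc_header`, and only then use the reported byteorder and wordsize to decode the words that follow the header.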