Sean Peck <sp...@www.industry.net> writes:
>Does anyone happen to have a description of these algorithms?
The encoding scheme is really quite simple. First, you produce a line of
output like "begin 640 filename" where the "640" is any valid Unix-type file
protection mask (three digits, all between 0 and 7) and the "filename" is the
name of the file you want the binary data to decode into.
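As a rough sketch of that header line in C (the function name uu_format_header
and its parameters are made up for illustration, not taken from any particular
uuencode source):

```c
#include <stdio.h>

/* Format the uuencode header line, e.g. "begin 640 filename".
 * mode is the Unix protection mask (printed as three octal digits);
 * name is the file name the decoder should create.
 * Returns what snprintf returns: the line length, or negative on error. */
int uu_format_header(char *out, size_t outsize, unsigned mode, const char *name)
{
    return snprintf(out, outsize, "begin %03o %s\n", mode & 0777u, name);
}
```

Calling uu_format_header(buf, sizeof buf, 0640, "filename") should leave the
example line from the text in buf, followed by a newline.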
Then, you encode the data. You read the input binary file in chunks of 45
bytes. Each of these chunks will produce one line of uuencoded output. If
fewer than 45 bytes remain at the end of the file, the last chunk is simply
shorter and produces a shorter line.
The first character of each encoded line gives the number of data bytes
encoded on that line (the count before encoding, not the number of characters
on the line). It is usually an "M", meaning 45 bytes, except for the last data
line, which is usually shorter. A line is actually allowed to carry anywhere
from one to 63 bytes, and the translation from the length to a printable
standard-ASCII character is done very simply by adding 32 to the length and
using the ASCII character with the resulting code. Zero has a special meaning
as a length: it marks the end of the data. To encode a zero, you can use
either the SPACE character or the back-quote character (`). The choice is
yours, but the back-quote causes fewer problems, since some mailers strip
trailing spaces.
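The add-32 mapping and its inverse can be sketched in C roughly like this.
The names uu_enc and uu_dec are invented for this example, and using the
back-quote for zero is just one of the two choices described above:

```c
/* Map a 6-bit value (0..63) to its printable uuencode character.
 * Zero is written as a back-quote rather than a SPACE. */
char uu_enc(unsigned v)
{
    v &= 0x3Fu;
    return v ? (char)(v + 32) : '`';
}

/* Map a uuencode character back to 0..63.
 * Both SPACE (32) and back-quote (96) come out as zero. */
unsigned uu_dec(char c)
{
    return ((unsigned char)c - 32u) & 0x3Fu;
}
```

With these, uu_enc(45) gives 'M', and uu_dec('M') gives 45 again.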
Then you take the data to be encoded on a line in groups of three bytes. If
there are fewer than three bytes in the last group, then you take the missing
second and/or third input-byte values as being zero. You then spread the 24
bits of the three bytes across four 6-bit values, as follows:
;obyt  11111111 22222222 33333333 44444444  -- output byte numbers
;pos   76543210 76543210 76543210 76543210  -- output-byte bit positions
;byt   xx111111 xx112222 xx222233 xx333333  -- input byte numbers
;bit     765432   107654   321076   543210  -- input data bit positions
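This is expressed in C (with extra noise) as follows, where b0, b1 and b2 are
the three input bytes held as unsigned values and uucodeChar[] is a table
indexed by the 6-bit value:
    line[linepos  ] = uucodeChar[ b0 >> 2 ];
    line[linepos+1] = uucodeChar[ (b0&0x03)<<4 | (b1&0xF0)>>4 ];
    line[linepos+2] = uucodeChar[ (b1&0x0F)<<2 | (b2&0xC0)>>6 ];
    line[linepos+3] = uucodeChar[ b2 & 0x3F ];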
Then you take the resulting group of four 6-bit values and encode each one
into an ASCII character the same way you did with the line length (add 32, or
use the back-quote for zero), and write the four resulting characters out onto
the line, in order. Repeat this for each group of three input bytes to be put
out onto one line, and then write the end-of-line character(s).
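Putting the length character and the three-into-four step together, one line
of output might be produced like this. It is only a sketch: uu_encode_line and
enc6 are names made up here, zero is written as a back-quote, and a bare
newline is assumed as the end-of-line.

```c
#include <stddef.h>

/* 6-bit value -> printable character; zero becomes a back-quote. */
static char enc6(unsigned v)
{
    v &= 0x3Fu;
    return v ? (char)(v + 32) : '`';
}

/* Encode n (1..45) bytes from in[] as one uuencoded line in out[].
 * out must hold 1 + 4*((n+2)/3) + 2 characters (63 when n is 45).
 * Returns the line length, counting the newline but not the NUL. */
size_t uu_encode_line(const unsigned char *in, size_t n, char *out)
{
    size_t i, pos = 0;

    out[pos++] = enc6((unsigned)n);              /* length character */
    for (i = 0; i < n; i += 3) {
        /* Missing bytes in the last group are taken as zero. */
        unsigned b0 = in[i];
        unsigned b1 = (i + 1 < n) ? in[i + 1] : 0;
        unsigned b2 = (i + 2 < n) ? in[i + 2] : 0;

        out[pos++] = enc6(b0 >> 2);
        out[pos++] = enc6((b0 & 0x03) << 4 | (b1 & 0xF0) >> 4);
        out[pos++] = enc6((b1 & 0x0F) << 2 | (b2 & 0xC0) >> 6);
        out[pos++] = enc6(b2 & 0x3F);
    }
    out[pos++] = '\n';
    out[pos] = '\0';
    return pos;
}
```

For example, the three bytes "Cat" come out as the line #0V%T (the "#" being
the length 3 plus 32).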
At the end of the output, you encode a zero-length line, by spitting out a
single SPACE or back-quote character on a line, and follow that with a line
that contains only "end". That's uuencoding for you.
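For the whole job from "begin" to "end", here is one possible arrangement in
C. The names uu_encode_stream and uu_char are hypothetical. As a variation,
this sketch packs each group into a single 24-bit word and shifts the four
6-bit values out of it; that is just another way of writing the same bit
shuffle, not a different encoding.

```c
#include <stdio.h>

/* 6-bit value -> printable character; zero becomes a back-quote. */
static char uu_char(unsigned long v)
{
    v &= 0x3Ful;
    return v ? (char)(v + 32) : '`';
}

/* uuencode everything readable from in onto out.
 * Returns 0 on success, -1 if either stream reports an error. */
int uu_encode_stream(FILE *in, FILE *out, unsigned mode, const char *name)
{
    unsigned char buf[45];
    size_t n, i;

    fprintf(out, "begin %03o %s\n", mode & 0777u, name);

    /* One output line per chunk of up to 45 input bytes. */
    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        fputc(uu_char(n), out);                  /* length character */
        for (i = 0; i < n; i += 3) {
            /* Pack three bytes (zero-padded) into one 24-bit word... */
            unsigned long w = (unsigned long)buf[i] << 16;
            if (i + 1 < n) w |= (unsigned long)buf[i + 1] << 8;
            if (i + 2 < n) w |= buf[i + 2];
            /* ...and peel off four 6-bit values, high bits first. */
            fputc(uu_char(w >> 18), out);
            fputc(uu_char(w >> 12), out);
            fputc(uu_char(w >> 6), out);
            fputc(uu_char(w), out);
        }
        fputc('\n', out);
    }

    fputs("`\nend\n", out);      /* zero-length line, then "end" */
    return (ferror(in) || ferror(out)) ? -1 : 0;
}
```

Something like uu_encode_stream(stdin, stdout, 0644, "data.bin") would then
turn standard input into a complete uuencoded file on standard output.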
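To decode, you reverse the process: subtract 32 from each character (treating
the back-quote as zero), reassemble each group of four 6-bit values into three
bytes, and keep only as many bytes as the line's length character says.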
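A decoder for a single data line could look roughly like the following. The
names uu_decode_line and dec6 are invented for this sketch, and it does only
minimal checking: it does not validate the characters themselves, and it
rejects a line that is shorter than its length character promises.

```c
#include <stddef.h>

/* Printable character -> 6-bit value; SPACE and back-quote both give 0. */
static unsigned dec6(char c)
{
    return ((unsigned char)c - 32u) & 0x3Fu;
}

/* Decode one uuencoded line into out[], which must hold 63 bytes.
 * Returns the number of bytes decoded (0 for the zero-length line that
 * ends the data), or -1 if the line is empty or too short. */
int uu_decode_line(const char *line, unsigned char *out)
{
    size_t avail = 0, need;
    int n, i, k, o = 0;

    /* Count the characters before the end-of-line. */
    while (line[avail] != '\0' && line[avail] != '\n' && line[avail] != '\r')
        avail++;
    if (avail == 0)
        return -1;

    n = (int)dec6(line[0]);                      /* length character */
    need = 1 + 4 * (size_t)((n + 2) / 3);
    if (avail < need)
        return -1;

    for (i = 1; o < n; i += 4) {
        unsigned c0 = dec6(line[i]),     c1 = dec6(line[i + 1]);
        unsigned c2 = dec6(line[i + 2]), c3 = dec6(line[i + 3]);
        unsigned char b[3];

        b[0] = (unsigned char)(c0 << 2 | c1 >> 4);
        b[1] = (unsigned char)((c1 & 0x0F) << 4 | c2 >> 2);
        b[2] = (unsigned char)((c2 & 0x03) << 6 | c3);
        /* Drop the padding bytes of a short final group. */
        for (k = 0; k < 3 && o < n; k++)
            out[o++] = b[k];
    }
    return n;
}
```

Feeding it the line #0V%T should give back the three bytes "Cat"; a full
decoder would also parse the "begin" line and stop at the zero-length line.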
If you want to avoid re-inventing the wheel, there are already uucode programs
and source for Unix and the C128/C64 available for free.
Keep on Hackin'!
-Craig Bruce
csb...@ccnga.uwaterloo.ca
"Eat well, sleep well, and work like hell." - The Astronomers