Compressed executables

Mark Taunton

unread,

Jan 22, 1991, 7:04:04 AM1/22/91

to

In article <39...@skye.ed.ac.uk> ric...@aiai.UUCP (Richard Tobin) writes:
>In article <118...@uunet.UU.NET> r...@uunet.UU.NET (Root Boy Jim) writes:
>>I once suggested to Chris Torek that the kernel should execute
>>compressed programs. He groaned.
>
>This has been done by Acorn in their RISC iX port of 4.3. The compression
>is done in such a way that the file can still be demand paged.
>
>Perhaps someone from Acorn could give more details?

Yes. Here they are:

The machines on which RISC iX is currently implemented have
"interesting" memory management hardware. Each bank of 4Mbytes of
physical memory is handled by a MEMC (Memory Controller) chip in such
a way that every physical page is always mapped to some logical page
(yes, it *is* that way round). This is done entirely on-chip, using a
CAM in which each entry controls one physical page: if the entry's
contents (logical page number) matches the logical page address put
out by the ARM CPU, then the associated physical page (identified by
the index of the entry in the CAM table) is used for the data
transfer, provided that the associated protection bits allow. In the
technology used in 1986 when MEMC was first made the practical limit
on CAM entries was 128. Hence the page size for a 4Mb physical memory
chunk comes out at a whopping 32Kb. If you are prepared to use more
MEMCs and have each control less memory, the page size goes down (to a
minimum of 4Kb, for 512Kb/MEMC); of course this adds to system cost
for several reasons so Acorn chose not to.

Now the consequences of a 32Kb page size are in many respects
unpleasant, but it is possible to turn some of these to advantage. In
particular since the first RISC iX machine (the Acorn R140, launched
in early 1989 for approx. 3.5k pounds) had only 50Mb of disc and we
wanted to put as much as possible of 4.3 BSD onto it, we needed to do
some shoehorning. The problem is exacerbated because ARM, in common
with most RISC architectures, has fairly poor code density (though not
as bad as some). The solution was to use a code compression technique
which is tuned to the particular typical bit patterns of ARM
instruction sequences and which achieves a reduction of around 50%.
The saving in disc space comes by applying this compression technique
on a per-page basis to demand paged executable files. Each page as
stored on disc would normally occupy 32Kb or four disc blocks on an
8Kb blocksize filesystem, but compression reduces this to around 16Kb
on average, taking up two or perhaps three 8Kb blocks. (With the
newer R260 machine we moved to 4Kb blocks in the shipped filesystem,
which further improves the average disc space saving.) The unused
blocks in a page prototype simply become "holes". In conjunction with
a simple shared library scheme, the actual disc occupancy of
executables becomes quite tolerable.

The compression operation - we call it "squeezing" - is applied to a
normal demand-load format program as a separate operation (the linker
produces normal uncompressed output, but may immediately invoke the
squeeze program if desired). An unsqueeze program performs the
inverse operation if required (e.g. for debugging purposes).

Compressed executables comprise a header, the text, the data and an
optional symbol table. The header includes a magic number to identify
the compressed format and two tables of values for the decompression
algorithm, and is zero padded to the next page boundary. Each page of
text or data is compressed "in place", i.e. the compressed data starts
at the same offset in the file that the uncompressed data did, and the
symbol table if present also starts at a page-size offset.

The kernel recognises compressed executables by the magic number at
exec time, and reads in the decompression tables in the header at this
point, via the buffer cache. The tables are typically not very big,
and are held on disc in a compacted format which is expanded by the
kernel. To save memory, since most programs use much less than 32Kb
of stack, the expanded tables are kept in user space just before the
actual process's user stack. To demand load a page, the kernel reads
in the blocks containing the compressed code or data directly into the
new page frame (not through the buffer cache) then applies the
decompression algorithm on the data. Since the data starts at the
same file address regardless of compression, there is no problem
finding it. A checksum kept with the compressed data helps protect
against problems such as the possibility, practically never seen, of a
wayward program corrupting the decompression tables in its own stack
area. The page read operation takes advantage of any adjacency of
disc blocks (which with careful tuning of the filesystem as shipped
happens most of the time) and of course holes take no time to load!
The decompression algorithm is extremely quick, and the net result is
that a page can be ready for use in less wall-clock time than it would
have taken to read in a whole 32Kb of uncompressed data directly. So
the technique saves both disc space and time, although of course if
the page size were smaller these benefits would probably be outweighed
by other considerations.

------------------------------------------------------------------------
Mark Taunton Email: ma...@acorn.co.uk
RISC iX Development Group
Acorn Computers Ltd
Cambridge, England.

Richard Tobin

unread,

Jan 21, 1991, 7:59:19 AM1/21/91

to

In article <118...@uunet.UU.NET> r...@uunet.UU.NET (Root Boy Jim) writes:
>I once suggested to Chris Torek that the kernel should execute
>compressed programs. He groaned.

This has been done by Acorn in their RISC iX port of 4.3. The compression
is done in such a way that the file can still be demand paged.

Perhaps someone from Acorn could give more details?

-- Richard
--
Richard Tobin, JANET: R.T...@uk.ac.ed
AI Applications Institute, ARPA: R.Tobin%uk.a...@nsfnet-relay.ac.uk
Edinburgh University. UUCP: ...!ukc!ed.ac.uk!R.Tobin

Victor Gavin

unread,

Jan 23, 1991, 7:38:08 AM1/23/91

to

In article <39...@skye.ed.ac.uk> ric...@aiai.UUCP (Richard Tobin) writes:

>In article <118...@uunet.UU.NET> r...@uunet.UU.NET (Root Boy Jim) writes:
>>I once suggested to Chris Torek that the kernel should execute
>>compressed programs. He groaned.
>
>This has been done by Acorn in their RISC iX port of 4.3. The compression
>is done in such a way that the file can still be demand paged.

And perhaps they could also explain why it doesn't slow the system down, coz if
the processor has enough time to decompress something coming off the disk, it
has enough time to go and run another program.

Every Acorn person I ask this question of, waffles and side-tracks.

vic

Blair P. Houghton

unread,

Jan 24, 1991, 10:51:44 AM1/24/91

to

What about _encrypted_ executables?

--Blair
"Yes, I do have a real
application in mind."

Richard Tobin

unread,

Jan 24, 1991, 1:14:13 PM1/24/91

to

In article <1991Jan23.1...@grep.co.uk> v...@grep.co.uk (Victor Gavin) writes:
>And perhaps they could also explain why it doesn't slow the system down, coz if
>the processor has enough time to decompress something coming off the disk, it
>has enough time to go and run another program.

It seems pretty clear that it's trading cpu time for disk space and
disk accesses. On a reasonably fast workstation with small slow
disks, it seems likely to be a winning tradeoff.

Henry Troup

unread,

Jan 25, 1991, 9:34:56 AM1/25/91

to

Of course, if you can implement compression in special hardware between the
disk and the CPU, you can do this with no impact on CPU performance.

Henry Troup - BNR owns but does not share my opinions | The .signature is the
P.O. Box 3511, Stn. C. Ottawa, Ontario, Canada K1Y 4H7| lowest form of humour
uunet!bnrgate!hwt%bwdlh490 H...@BNR.CA +1 613-765-2337 |

cc...@levels.sait.edu.au

unread,

Jan 27, 1991, 12:44:24 AM1/27/91

to

ric...@aiai.ed.ac.uk (Richard Tobin) writes:
> In article <1991Jan23.1...@grep.co.uk> v...@grep.co.uk (Victor Gavin) writes:
>>And perhaps they could also explain why it doesn't slow the system down, coz if
>>the processor has enough time to decompress something coming off the disk, it
>>has enough time to go and run another program.
>
> It seems pretty clear that it's trading cpu time for disk space and
> disk accesses.

But Victor's point was that while the executable is being uncompressed, the
system isn't -- and can't be -- executing some other process; hence it must
surely slow down a "busy" system.

On the other hand Acorns do seem to be "one person, one computer" computers.
So perhaps -- just perhaps -- the system usually has no other process to run,
and the user is in any case going to wait for the disk load to complete. Hence
in this model, compressed executable don't slow down the system (or so we are
told.)

David Newall, who no longer works Phone: +61 8 344 2008
for SA Institute of Technology E-mail: cc...@lux.sait.edu.au
"Life is uncertain: Eat dessert first" *Check the return address!*

Andrew Wheadon

unread,

Jan 25, 1991, 10:18:05 PM1/25/91

to

In article <1991Jan23.1...@grep.co.uk> v...@grep.co.uk (Victor Gavin) writes:

>In article <39...@skye.ed.ac.uk> ric...@aiai.UUCP (Richard Tobin) writes:
>And perhaps they could also explain why it doesn't slow the system down, coz if
>the processor has enough time to decompress something coming off the disk, it
>has enough time to go and run another program.
>

> vic
Welllllll, perhaps they have a slow disk:
The smaller the file, the less time it takes to read from disk, the more time
there is to decompress :-). (Is actually true on some PC's without HD-Caching)
--
Andrew Wheadon | and...@acwbust.incom.de | I support the troops in the Golf

Richard Tobin

unread,

Jan 25, 1991, 9:14:08 AM1/25/91

to

In article <20...@inews.intel.com> bhou...@pima.intel.com (Blair P. Houghton) writes:
>What about _encrypted_ executables?

I have a vague recollection of someone (maybe Richard O'Keefe?)
suggesting a processor that permuted its opcodes differently
for each program...

Chris Calabrese

unread,

Jan 28, 1991, 8:54:38 AM1/28/91

to

In article <40...@skye.ed.ac.uk> ric...@aiai.UUCP (Richard Tobin) writes:
>In article <1991Jan23.1...@grep.co.uk> v...@grep.co.uk (Victor Gavin) writes:
>>And perhaps they could also explain why it doesn't slow the system down, coz if
>>the processor has enough time to decompress something coming off the disk, it
>>has enough time to go and run another program.
>
>It seems pretty clear that it's trading cpu time for disk space and
>disk accesses. On a reasonably fast workstation with small slow
>disks, it seems likely to be a winning tradeoff.

Yes. One very big win for compressed executables is when the
workstation has 'disks' mounted over a very slow link (say 9600 baud).

In fact, one might even want a compressed file system type for
mounting remote disks from a home machine. You can usually get
something like 3:1 on executables and 10:1 on English text, so a
workstation with a 9.6kb link with most of the executables local
and much of the documents, etc. remote would behave like it had a 96kb
link. Remember, this is a direct pipe (not shared like ethernet), so
you could actually get reasonable performance.

Name: Christopher J. Calabrese
Brain loaned to: AT&T Bell Laboratories, Murray Hill, NJ
att!ulysses!cjc c...@ulysses.att.com
Obligatory Quote: ``pher - gr. vb. to schlep. phospher - to schlep light.philosopher - to schlep thoughts.''

Victor Gavin

unread,

Jan 25, 1991, 7:22:44 AM1/25/91

to

In article <40...@skye.ed.ac.uk> ric...@aiai.UUCP (Richard Tobin) writes:

>It seems pretty clear that it's trading cpu time for disk space and
>disk accesses. On a reasonably fast workstation with small slow
>disks, it seems likely to be a winning tradeoff.

I would agree with you if Acorn actually had a range of machines with this as a
low entry machine. When you start comparing against SPARCs (for instance) you
get the distinct impression that Acorn aren't totally commited yet. I think the
reason they've made the R140/R260 the way they have is because of their
background in building small, personal machines.

As an aside, who would like to see an ARM based X-terminal. I suggested this to
Acorn a long time ago (and again quite recently) that they do this, but they
never seemed interested. I don't know enough about X-windows, or I'd do it
myself :-)

The way X terminal manufacturers have sprung up, their is definitely a market
(and that means $$$$) and it would also give Acorn a market presence, and help
raise the Acorn name in commercial unix environments.

vic

Eiger Richard

unread,

Jan 30, 1991, 7:31:19 AM1/30/91

to

In article <54...@bwdls58.UUCP> h...@bwdlh490.BNR.CA (Henry Troup) writes:
>Of course, if you can implement compression in special hardware between the
>disk and the CPU, you can do this with no impact on CPU performance.
>

I thought of the exact same thing. While we're at it, why not put
hardware compressing onto the drives? This could be implemented using
schemes that are used successfully in image processing systems. One of
the problems I can think of however is the need to be able to access raw
data directly on the disk. Comments? Experiences?

Richard

Martin Weitzel

unread,

Jan 31, 1991, 6:59:11 AM1/31/91

to

In article <1991Jan30.1...@chvax.uucp> rhe...@chvax.UUCP (Eiger Richard) writes:

Some other problem: You can never be sure how much free space you
have on disk. Worse, even if a file remains the same size (or slightly
shrinks) it may occupy more space than before and the disk space is
no more sufficient. (Of course, for disks which contain more or less
"read-only"-stuff this may be no big problem).
--
Martin Weitzel, email: mar...@mwtech.UUCP, voice: 49-(0)6151-6 56 83

Root Boy Jim

unread,

Feb 1, 1991, 12:27:03 AM2/1/91

to

>In article <1991Jan30.1...@chvax.uucp> rhe...@chvax.UUCP (Eiger Richard) writes:
>:In article <54...@bwdls58.UUCP> h...@bwdlh490.BNR.CA (Henry Troup) writes:
>:>Of course, if you can implement compression in special hardware between the
>:>disk and the CPU, you can do this with no impact on CPU performance.
>:>
>:I thought of the exact same thing. While we're at it, why not put
>:hardware compressing onto the drives?

The problem here is that disk drivers access data by block, optimizing
sector rotation and cylinder head motion. If you're gonna send it all to
the disk anyway, you don't save any transfer time. Compressing the
data changes the boundarys where the sectors fall. Many people think
it a bad idea to move layout policy out of the kernel; others
would rather let a smart caching disk do it.
--

Root Boy Jim Cottrell <r...@uunet.uu.net>
Close the gap of the dark year in between

Brandon S. Allbery KB8JRR

unread,

Feb 2, 1991, 2:02:00 PM2/2/91

to

As quoted from <1991Jan30.1...@chvax.uucp> by rhe...@chvax.uucp (Eiger Richard):
+---------------

| In article <54...@bwdls58.UUCP> h...@bwdlh490.BNR.CA (Henry Troup) writes:
| >Of course, if you can implement compression in special hardware between the
| >disk and the CPU, you can do this with no impact on CPU performance.
|
| I thought of the exact same thing. While we're at it, why not put
| hardware compressing onto the drives? This could be implemented using

+---------------

I thought this was the theory behind RLL vs. MFM. Not, of course, that RLL is
necessarily the best way to do it....

++Brandon
--
Me: Brandon S. Allbery VHF/UHF: KB8JRR on 220, 2m, 440
Internet: all...@NCoast.ORG Packet: KB8JRR @ WA8BXN
America OnLine: KB8JRR AMPR: KB8JRR.AmPR.ORG [44.70.4.88]
uunet!usenet.ins.cwru.edu!ncoast!allbery Delphi: ALLBERY