
compressing Xilinx bitstreams


John Larkin

Jun 17, 2004, 5:41:01 PM

Forgive me if this has been asked before, but does anybody have
comments or links to simple methods of compressing/decompressing
Xilinx configuration bitstreams? I've been perusing a few of my .rbt
files, and they have long bunches of 1s and 0s (interestingly,
different designs seem to have more 1s, others mostly 0s.) I'd think
that something very simple might achieve pretty serious (as, maybe
2:1-ish) compression without a lot of runtime complexity. We generally
run a uP from EPROM, with the uP code and the packed Xilinx config
stuff in the same eprom, with the uP bit-banging the Xilinx FPGA at
powerup time. So a simple decompressor would be nice.

I did google for this... haven't found much.

Thanks,

John

Clark Pope

Jun 17, 2004, 6:09:31 PM
The bit generation tool has an option to compress the .bit file. I use this
when I'm loading over JTAG to save time. I assume Xilinx has info on
in-system programming with a compressed .bit file.

However, I've observed the same phenomenon as you: when I zip a .bit file it
is usually less than 50% of the original size. My guess is even a trivial
run-length encoding compression would be helpful. There are plenty of
resources for Lempel-Ziv compression on the web:

see http://www.dogma.net/markn/articles/lzw/lzw.htm

If you get it working please post/send the result.


"John Larkin" <jjla...@highSNIPlandTHIStechPLEASEnology.com> wrote in
message news:hh34d0tud78se5vqe...@4ax.com...

John_H

Jun 17, 2004, 6:14:15 PM
First, please be aware that the ASCII .rbt file is 8x the simple .bin file
size. Check the bitgen options and you'll find the ability to generate the
straight binary file - 1s and 0s at the bit level, not the ASCII character
level. Compression beyond that may be what you're looking for, but please -
start with the binary file.
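
The 8x factor comes from the .rbt file storing each configuration bit as one ASCII character. If only the .rbt is at hand, the bits can be packed on a PC. A minimal Python sketch, assuming that every line made up purely of '0' and '1' characters is data, that all other lines are header text, and that bits are packed MSB first (the function name is made up for illustration):

```python
def pack_rbt_bits(text: str) -> bytes:
    """Pack lines made purely of '0'/'1' characters into bytes, MSB first.

    Assumption: any line containing other characters is header text and is
    skipped. The final byte is zero-padded if the bit count is not a
    multiple of 8.
    """
    bits = []
    for line in text.splitlines():
        line = line.strip()
        if line and set(line) <= {"0", "1"}:
            bits.extend(line)
    out = bytearray()
    for i in range(0, len(bits), 8):
        chunk = "".join(bits[i:i + 8]).ljust(8, "0")
        out.append(int(chunk, 2))
    return bytes(out)
```

Whether MSB-first is the right order depends on how the loader shifts bits into the FPGA, so treat that choice as a placeholder.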

"John Larkin" <jjla...@highSNIPlandTHIStechPLEASEnology.com> wrote in
message news:hh34d0tud78se5vqe...@4ax.com...
>

Austin Lesea

Jun 17, 2004, 6:14:07 PM
John,

I think I had heard that zipping bit files (classic Unix or Windows
zip/unzip) gives the most compression (2:1 or better).

I think that a zip/unzip routine would be a great example of something a
uP could do without an unreasonable amount of memory (ROM+RAM) support.

Austin

Steve Casselman

Jun 17, 2004, 6:42:32 PM
"John Larkin" <jjla...@highSNIPlandTHIStechPLEASEnology.com> wrote in
message news:hh34d0tud78se5vqe...@4ax.com...
>

VCC did a package called HOTMan that does compression. It takes the bit file
and turns it into a compressed file that looks like...

int testArray[2669]=\
{
0xddedda78,0xe55c8c5f,0xefe1c079.... }

We get at least 4 to 1, and small designs in a big chip can get 50 to 1. The
above format allows you to compile the design into a C/C++ program.
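
For anyone who only wants the "compile it into the program" part, here is a rough Python sketch that turns any byte string into a C array of 32-bit words laid out like the one above. It does not reproduce HOTMan's compression. The big-endian word order, the zero padding of the last word, and the function name are all assumptions made for illustration.

```python
def to_c_array(data: bytes, name: str = "testArray") -> str:
    """Format data as a C array of 32-bit hex words (big-endian, zero-padded)."""
    padded = data + b"\x00" * (-len(data) % 4)
    words = [int.from_bytes(padded[i:i + 4], "big")
             for i in range(0, len(padded), 4)]
    body = ",".join(f"0x{w:08x}" for w in words)
    return f"int {name}[{len(words)}]=\n{{\n{body} }};\n"
```

The decompressor on the target would then read the array by name instead of parsing a file.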

Steve

John Larkin

Jun 17, 2004, 7:12:19 PM
On Thu, 17 Jun 2004 22:14:15 GMT, "John_H" <johnha...@mail.com>
wrote:

>First, please be aware that the ASCII .rbt file is 8x the simple .bin file
>size. Check the bitgen options and you'll find the ability to generate the
>straight binary file - 1s and 0s at the bit level, not the ASCII character
>level. Compression beyond that may be what you're looking for, but please -
>start with the binary file.
>

Of course. We have a little utility, vaguely like a linker, that
gobbles up Motorola .s28 files and Xilinx .rbt files and builds a rom
image, all properly squashed into bits. It's cute... it even saves the
beginning of the rbt ASCII header in the rom image for FPGA version
verification. My observation was that the bits themselves include long
runs of 1s or 0s.

I'd like to design a board using a 28-pin eprom (space is at a premium
here) but plan hooks for using a bigger Xilinx chip some day, and then
I'd run out of rom space to store the config bits. So having a
compression scheme would give us the margin to use the small eprom.

Suppose the compressed data were an array of bytes. If the MS bit of a
byte is 0, the remaining 7 bits are loaded verbatim; if the MS bit is 1,
the other 7 bits specify a run of up to 63 1's or 0's (say, one bit to
pick 1s or 0s and six bits of count).

Something like that; the exact numbers may need tuning. Very easy to
unpack, not hard to encode. I'd have to test some actual config files
to see how good something like this could compress.
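
A minimal Python sketch of that byte format, for trying it on a PC against real config files. The bit assignments are assumptions (bit 6 selects the run value, bits 5..0 hold the count), as is the rule that only runs of 8 or more bits become run tokens. Bits are handled as strings of '0' and '1' for clarity, not speed.

```python
RUN_FLAG = 0x80    # MS bit set: this byte is a run token
RUN_VALUE = 0x40   # assumed: bit 6 set means a run of 1s, clear means 0s
MAX_RUN = 63       # six-bit count
MIN_RUN = 8        # shorter runs are cheaper as 7-bit literals

def encode(bits: str) -> bytes:
    out = bytearray()
    i = 0
    while i < len(bits):
        run = 1
        while i + run < len(bits) and run < MAX_RUN and bits[i + run] == bits[i]:
            run += 1
        if run >= MIN_RUN:
            out.append(RUN_FLAG | (RUN_VALUE if bits[i] == "1" else 0) | run)
            i += run
        else:
            # Literal: next 7 bits verbatim (zero-padded at the very end)
            out.append(int(bits[i:i + 7].ljust(7, "0"), 2))
            i += 7
    return bytes(out)

def decode(data: bytes, nbits: int) -> str:
    parts = []
    for b in data:
        if b & RUN_FLAG:
            parts.append(("1" if b & RUN_VALUE else "0") * (b & MAX_RUN))
        else:
            parts.append(format(b, "07b"))
    return "".join(parts)[:nbits]   # trim padding from the last literal
```

With no long runs at all, every byte carries only 7 payload bits, so the output of this sketch can be up to 8/7 the size of the input. The decoder needs the total bit count from somewhere, for example a length field stored ahead of the data.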

John

Tim Wescott

Jun 17, 2004, 7:11:53 PM
John Larkin wrote:

No links, but have you considered simple run-length encoding? I can
think of at least one scheme that would be guaranteed sub-optimal from a
compression standpoint but that wouldn't take much code -- just encode
any string of 0xff or 0x00 bytes as that byte followed by a count -- so
that 0x00 0x00 0x00 0x00 becomes 0x00 0x04, for instance. You have the
overhead that 0x00 becomes 0x00 0x01, and you also can't encode anything
that spans bytes -- but you may be happy with it nonetheless.
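
A small Python sketch of that byte-level scheme, useful for checking the idea on a PC before writing the uP version. The one-byte count (so runs are capped at 255 and longer runs are split) is an assumption; the function names are made up.

```python
def rle_encode(data: bytes) -> bytes:
    out = bytearray()
    i = 0
    while i < len(data):
        b = data[i]
        if b in (0x00, 0xFF):
            # Count the run of this byte, capped at what one count byte holds
            run = 1
            while i + run < len(data) and run < 255 and data[i + run] == b:
                run += 1
            out += bytes((b, run))
            i += run
        else:
            out.append(b)       # every other byte value is stored as-is
            i += 1
    return bytes(out)

def rle_decode(data: bytes) -> bytes:
    out = bytearray()
    i = 0
    while i < len(data):
        b = data[i]
        if b in (0x00, 0xFF):
            out += bytes([b]) * data[i + 1]   # byte followed by its count
            i += 2
        else:
            out.append(b)
            i += 1
    return bytes(out)
```

The decoder is a single loop with one comparison per byte, which is about as cheap as it gets on a small uP.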

--

Tim Wescott
Wescott Design Services
http://www.wescottdesign.com

Greg Neff

Jun 17, 2004, 7:26:58 PM

See:

www.ee.washington.edu/people/faculty/hauck/publications/runlength.PDF
www.ee.washington.edu/people/faculty/hauck/publications/runlengthTR.PDF
www.ee.washington.edu/people/faculty/hauck/publications/runlengthJ.pdf

It should be straightforward to generate some run-length compression and
decompression code. You might want to test the algorithms on a PC to
make sure that the decompressed output ends up the same as the
uncompressed input. A garbled bitstream can have the same effect as
the MC6800 HCF opcode...
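
A minimal sketch of such a PC-side check in Python. It uses zlib only as a stand-in codec; swap in your own compress/decompress pair. The helper name and the sample data are made up for illustration.

```python
import zlib

def check_roundtrip(compress, decompress, samples):
    """Return the samples (if any) that do not survive compress -> decompress."""
    return [s for s in samples if decompress(compress(s)) != s]

# Edge cases worth including: empty input, long runs, and data with no runs
samples = [b"", b"\x00" * 1000, b"\xff" * 1000, bytes(range(256)) * 4]
failures = check_roundtrip(zlib.compress, zlib.decompress, samples)
```

An empty `failures` list means every sample came back bit for bit. Running the same check over real config images before burning an EPROM costs almost nothing.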

================================

Greg Neff
VP Engineering
*Microsym* Computers Inc.
gr...@guesswhichwordgoeshere.com

Paul Leventis (at home)

Jun 17, 2004, 9:59:50 PM
Hi John,

> Forgive me if this has been asked before, but does anybody have
> comments or links to simple methods of compressing/decompressing
> Xilinx configuration bitstreams?

Can't help you on the Xilinx front, but many of Altera's newest chips
(Cyclone, Stratix II) support on-the-fly decompression of the bitstream.
The Quartus software compresses the bitstream which is then programmed into
the device using pretty much any of the many methods of programming
available, and the chip's configuration controller will decompress the
bitstream that it sees. This typically achieves a 1.9-2.3:1 compression
ratio, depending on the device utilization, RAM contents and such.

Some of our programming devices also can decompress bitstreams on-the-fly,
allowing bitstream compression for other chip families that do not support
decompression internally.

See the Configuration Handbook Volume 2
(http://www.altera.com/literature/hb/cfg/cfg_volume2.pdf) for a detailed
description of device programming and compression options.

Regards,

Paul Leventis
Altera Corp.


Allan Herriman

Jun 17, 2004, 11:06:09 PM
On Thu, 17 Jun 2004 22:09:31 GMT, "Clark Pope" <cep...@mindspring.com>
wrote:

>The bit generation tool has an option to compress the .bit file. I use this
>when I'm loading over JTAG to save time. I assume Xilinx has info on
>in-system programming with a compressed .bit file.

This 'compression' merely merges identical frames. The probability of
getting identical frames in a well utilised FPGA isn't very high, so
this doesn't result in much reduction in file size.

Some experiments I did a few years ago (on Virtex-E and Virtex-2
files) indicated that this compression made subsequent compression
by tools such as gzip *worse*.
It is, however, the only way to speed up JTAG loading.
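
Whether frame merging can pay off for a given file can be estimated by counting how many fixed-size frames are duplicates. A rough Python sketch, assuming the frame data has already been separated from the header and command words and that the frame length in bytes is known for the device (both the function name and `frame_bytes` are made up for illustration):

```python
def frame_stats(data: bytes, frame_bytes: int):
    """Return (total frames, distinct frames) for fixed-size frames."""
    frames = [data[i:i + frame_bytes]
              for i in range(0, len(data), frame_bytes)]
    return len(frames), len(set(frames))
```

If the two numbers are nearly equal, there is little for frame merging to remove.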

Regards,
Allan.

Neil Glenn Jacobson

Jun 18, 2004, 6:25:39 PM
While this doesn't exactly answer your question, the new Xilinx XCFP
serial PROMs support storage of compressed bitstream data. The data is
compressed when you translate to the PROM format and the PROM does the
decompression before delivery to the FPGA.

http://www.xilinx.com/bvdocs/publications/ds123.pdf

roller

Jun 19, 2004, 9:55:47 AM

"John Larkin" <jjla...@highSNIPlandTHIStechPLEASEnology.com> wrote in
message news:hh34d0tud78se5vqe...@4ax.com...

Try searching for RLE (run-length encoding); that's the encoding used for
.PCX graphics files.

> Thanks,
>
> John
>


Nico Coesel

Jun 19, 2004, 11:27:40 AM
John Larkin <jjla...@highSNIPlandTHIStechPLEASEnology.com> wrote:

Tried it but found the files aren't reduced in size much and more
important, the software required to decompress the file eats away all
the savings for a 400k device. In other words: Unless you have more
than around half a million gates of configuration data, it's not worth
it.

--
Reply to nico@nctdevpuntnl (punt=.)
Find businesses and shops at www.adresboekje.nl

John Larkin

Jun 19, 2004, 12:08:16 PM
On Sat, 19 Jun 2004 15:27:40 GMT, ni...@puntnl.niks (Nico Coesel)
wrote:


OK, bear with me on this. Here's a piece of a .rbt for a Spartan XL...

01111111111111111111111111111111111111111111111111111111111011111111111111111111111111110111111110111111011111111110111111110101011101111110111111011111111111111111110011111111111111111111111111111111111111111111111111111110101
01111111111111111111111111111111111111111111111111111111111111111111111111111101111111111111111111111111110111111101111111111111111110111111111111110111111111111011111101111111111111111111111111111111111111111111111111111110011
01111111111111111111111111111111111111111111111111111100011111111111111111101111111100111111110011111111111111011101111111111100111011110011111011111111111111111111111110110111001111111111111111111111110111111011111111111111011
01111111111111111111111111111111111111111111111111111111011111111111111111101111111101011111111110011111111111111100111111111111011111111101111111111111111111110111101111111111110111111111111111111111111111111111111111111111110
01111111111111111111111111111111111111111111110111111111111111111111111111111111111111111111111011111111111111111011010111111110011111111011111111111011111011111011110101111111000111111111011111111111101111111111111110101101111
00111111111111111111111111111111111111111111000111111111111111111111111111111111111111111111111111111111110111111111110110111111011111111111111111111101111111111111111101111111110111111100011111111111111111111111111101101100000
01111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111110111111111101110111101011111111111111111111111111111111111111101111111101111111111111111111111111111111111111110111111100
00011111111111111111111111111111111111111111111111111111111100111111011111111111001111110110101111001111111101111111111111001111111100111111111001111101101011110110011111101010111101111111111111111111111010111100111111111111000
01101011111111111111111111111111111111111111111111111111111110101111111111111111101011111110011110111111110101001110111111101011011100111111111010010111001111110110101101111111111111111111111111111111110011111101111111010100111
01111011111111111111111111111111111111111111111111111111111111111111011111111111111111110111111111111111110110111111111111101011011111111111111111111101111111111111111101111111111101111111111111111111111111111111111111011111010
01101111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111110011111111111111111111111111111111111111111001111111111111111110111111111111111111111111111111111111111111111111100111110011000
01111111111111111111111111111111111111111111111111111111111011111111101111111110111111111011111111101111011110111111111110111101111001101111101111111110101011111011010111101111111110111111101111111111111111111110111111110100011
01101111111111111111111111111111111111111111111011111111111111111111111111111111111111111111111111111111111101011010111111111111110111101111111111111101011011111111111111011110111111111111111111111111111111111111111111111110101
00111100111111111111111111111111111111111111111011111111110100111111100011111101001111111000111111111111111110101011111101101011110010011111011011111111101011110110101111010001111110111111111111111111111111111101111100111110111


Where there are lots of 1's. Other hunks of this file are almost all
1's. So what we need is a not-very-general compression scheme, with
the only "dictionary" entry being "the following is a hunk of 1's". So
the decompressor could be very simple.

Interestingly, this is for a Spartan 2:

00000000000001001000000000000000
00000000000000000000000000000000
00000000000100100000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000001001000000000000000
00000000000000000000000000000000
00000000000100100100000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000001001100000000000000
00000000000000000000000000000000
11111111000100110000000100000100
00000000010001000000000000010000
00000000000001110100100000000000
11010100000000000011010000000000
00000001000000000000000000001000
00111111110001000000000000000000

Which has long runs of zeroes!

Just eyeballing these files, it looks like something very simple could
get at least a 2:1 squash factor.

John

Nico Coesel

Jun 19, 2004, 1:51:15 PM
John Larkin <jjla...@highlandSNIPtechTHISnologyPLEASE.com> wrote:

>On Sat, 19 Jun 2004 15:27:40 GMT, ni...@puntnl.niks (Nico Coesel)
>wrote:
>
>>John Larkin <jjla...@highSNIPlandTHIStechPLEASEnology.com> wrote:
>>
>>>
>>>Forgive me if this has been asked before, but does anybody have
>>>comments or links to simple methods of compressing/decompressing
>>>Xilinx configuration bitstreams? I've been perusing a few of my .rbt
>>>files, and they have long bunches of 1s and 0s (interestingly,
>>>different designs seem to have more 1s, others mostly 0s.) I'd think
>>>that something very simple might achieve pretty serious (as, maybe
>>>2:1-ish) compression without a lot of runtime complexity. We generally
>>>run a uP from EPROM, with the uP code and the packed Xilinx config
>>>stuff in the same eprom, with the uP bit-banging the Xilinx FPGA at
>>>powerup time. So a simple decompressor would be nice.
>>>
>>>I did google for this... haven't found much.
>>
>>Tried it but found the files aren't reduced in size much and more
>>important, the software required to decompress the file eats away all
>>the savings for a 400k device. In other words: Unless you have more
>>than around half a million gates of configuration data, it's not worth
>>it.
>
>
>OK, bear with me on this. Here's a piece of a .rbt for a Spartan XL...
>

>00111111110001000000000000000000
>
>Which has long runs of zeroes!
>
>Just eyeballing these files, it looks like something very simple could
>get at least a 2:1 squash factor.

Did you ever try to compress these files? I totally agree with you
that these files _look_ easy to compress, but they aren't. I tried
RLE, but that will only save 5% to 10%. ZIP does a little better. I
just tried to compress a .bit file for a 400k gate Xilinx device and
it reduces the size by 26% but you'll need to have room for the ZIP
decompression code...
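
Numbers like these are easy to reproduce on a PC. A minimal Python sketch using zlib, which implements the same Deflate method that ZIP normally uses; the function name and the file name in the usage note are hypothetical.

```python
import zlib

def size_ratio(data: bytes, level: int = 9) -> float:
    """Compressed size divided by original size (smaller is better)."""
    return len(zlib.compress(data, level)) / len(data)
```

For example, `size_ratio(open("design.bit", "rb").read())` would report 0.74 for a file that shrinks by 26%. This measures only the data; the code space needed for the decompressor has to be added separately.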

Tim

Jun 19, 2004, 3:33:04 PM
Nico Coesel wrote:

> Did you ever try to compress these files? I totally agree with you
> that these files _look_ easy to compress, but they aren't.

But with a little knowledge of the structure maybe we can do
better than blind RLE or whatever. Surely any structure
which the eye can see can be efficiently encoded?

e.g. "There will be lots of repeats for unused LUTs.
These are coded as abc and should be decoded as xyz"


Nico Coesel

Jun 19, 2004, 7:23:57 PM
"Tim" <t...@rockylogic.com.nooospam.com> wrote:

>Nico Coesel wrote:
>
>> Did you ever try to compress these files? I totally agree with you
>> that these files _look_ easy to compress, but they aren't.
>
>But with a little knowledge of the structure maybe we can do
>better than blind RLE or whatever. Surely any structure
>which the eye can see can be efficiently encoded?

Another poster claims huge space savings by using a special tool. I
haven't looked into it.

>e.g. "There will be lots of repeats for unused LUTs.
> These are coded as abc and should be decoded as xyz"

That's the problem: the routing software smears the entire design over
the entire FPGA if it can. You can specify to leave unused space from
the bit-file, but you'll see the length varies with every routing run.
Perhaps the best space saver is to constrain the router to use only a
part of the FPGA which just is big enough to contain your design. Next
specify to leave out the unused stuff.

Zak

Jun 20, 2004, 7:52:58 AM
Nico Coesel wrote:

> Did you ever try to compress these files? I totally agree with you
> that these files _look_ easy to compress, but they aren't. I tried
> RLE, but that will only save 5% to 10%.

Probably because the RLE looks for repeating bytes, while here we have only
repeating stretches of 0's. What might work is to re-code the file into
numbers giving the number of 0 bits between 1's as a first step:

00100000101000000000010000011000000000001 would turn into
2 - 5 - 1 - 10 - 5 - 0 - 11.

Stretches of 0 more than 254 long could be encoded as 255, meaning 255
zeroes and no 1, with the next number giving more 0's. 1-[255 0s]-1
would code to 255 0 in that case.

The resulting bytes are probably easier to Huffman compress. Or it may
pay to do this for 0 runs up to 16 long, and coding these as bytes with
values 0-15 (not as nibble pairs, subsequent nibbles probably do not
have any relationship).
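
A minimal Python sketch of the first step (zero-run lengths as bytes, with 255 as the "255 zeroes and no 1" escape). How trailing zeroes are handled is an assumption of this sketch: they are written as an ordinary count and the decoder trims the implied final 1 using the known bit count. Bits are strings of '0' and '1' for clarity.

```python
def gaps_encode(bits: str) -> bytes:
    out = bytearray()
    zeros = 0
    for bit in bits:
        if bit == "0":
            zeros += 1
            if zeros == 255:
                out.append(255)   # escape: 255 zeroes, no terminating 1
                zeros = 0
        else:
            out.append(zeros)     # this many zeroes, then a 1
            zeros = 0
    if zeros:
        out.append(zeros)         # trailing zeroes; decoder trims the extra 1
    return bytes(out)

def gaps_decode(data: bytes, nbits: int) -> str:
    parts = ["0" * 255 if n == 255 else "0" * n + "1" for n in data]
    return "".join(parts)[:nbits]
```

For files that are mostly 1s, such as the Spartan XL sample earlier in the thread, the bits would have to be inverted first so that the long runs are zeroes.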


Thomas

John Larkin

Jun 20, 2004, 1:54:40 PM
On Sat, 19 Jun 2004 17:51:15 GMT, ni...@puntnl.niks (Nico Coesel)
wrote:

>


>Did you ever try to compress these files? I totally agree with you
>that these files _look_ easy to compress, but they aren't. I tried
>RLE, but that will only save 5% to 10%. ZIP does a little better. I
>just tried to compress a .bit file for a 400k gate Xilinx device and
>it reduces the size by 26% but you'll need to have room for the ZIP
>decompression code...

I tried my simple run-encoder. On various designs I have around, it
achieved compression ratios of 0.56 (best) and 1.04 (worst, i.e.
compressed was bigger than uncompressed!). The worst was on a fairly
dense XC2S400 bga part, whose rbt file had hardly any long runs of
anything. Even pkzip only managed to crunch the binary config image to
0.74 on this one. It looks to me that the newer Xilinx chip files tend
to be less compressible... seem to have fewer runs. So maybe there's
no very-simple-to-unpack thing that's generally useful.

Needs more thought someday, I guess.

John

Nico Coesel

Jun 20, 2004, 7:09:41 PM
Zak <ju...@zak.invalid> wrote:

This makes sense. Haven't tried it though. I presume(d) ZIP looks at
the bits instead of the bytes. Still, don't feel lucky because you
see a lot of contiguous '1's and '0's.

Here is a wild idea:
Another way of compressing the file may be by stripping the frame
headers (which are repeated at the start of each frame, these can
easily be added during decompression) and sorting the resulting data.
Next step is compressing it, but not by going from left to right, but
going from top to bottom and compress column after column. Because of
the sorting, least changes from 0 to 1 are to be expected in a column.
Decompressing however would require a fair amount of memory, so the
data also has to be divided in blocks so only a block at a time needs
to be decompressed. IIRC it doesn't matter in which order the data
frames are loaded as long as the command frames are at the right
place.

Xilinx has some thorough information on their programming datastream
on their website.

Kolja Sulimma

Jun 21, 2004, 8:31:05 AM
> >OK, bear with me on this. Here's a piece of a .rbt for a Spartan XL...
> >
> >00111111110001000000000000000000
> >
> >Which has long runs of zeroes!
> >
> >Just eyeballing these files, it looks like something very simple could
> >get at least a 2:1 squash factor.
>
> Did you ever try to compress these files? I totally agree with you
> that these files _look_ easy to compress, but they aren't. I tried
> RLE, but that will only save 5% to 10%. ZIP does a little better. I
> just tried to compress a .bit file for a 400k gate Xilinx device and
> it reduces the size by 26% but you'll need to have room for the ZIP
> decompression code...

As noted before, Ralph Kuhnert, a student of mine, did.
http://www.sulimma.de/prak/ws0001/projekte/ralph/Projekt/index.htm
http://www.sulimma.de/prak/ws0001/projekte/ralph/Projekt/Projekt.PPT

He achieved 30% to 70% compression just using RLE on XC4K data.
You probably applied the RLE on bytes as a previous poster suggested.
That does not help because the Xilinx data is not byte aligned.
(In the histograms you can see for example that for all designs runs
of 19 consecutive 1s are quite common. This probably represents some
CLB data, an unused LUT or something like that.)
You need to encode the individual bits.
What worked very well for XC4K is to use 4 Bits per codeword to encode
either a zero followed by 0 to 13 ones or 14 ones.
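
One way to read that 4-bit code, sketched in Python. This is not taken from the project pages linked above, so the details are assumptions: codewords 0 to 13 mean "a zero followed by that many ones", codeword 14 means "14 ones", codeword 15 is unused, and a '0' is prepended to the stream (and dropped again by the decoder) so that it always starts with a zero.

```python
import re

def nibble_encode(bits):
    codes = []
    # Prepending '0' guarantees the stream starts with a zero; the decoder
    # drops it again. This is an assumption of this sketch.
    for ones in re.findall(r"0(1*)", "0" + bits):
        q, r = divmod(len(ones), 14)
        codes.append(r)          # '0' followed by r ones (r = 0..13)
        codes.extend([14] * q)   # then q groups of 14 ones
    return codes

def nibble_decode(codes):
    parts = ["1" * 14 if c == 14 else "0" + "1" * c for c in codes]
    return "".join(parts)[1:]    # drop the prepended '0'
```

Each code fits in a nibble, so two can be packed per byte. A zero followed by 19 ones costs two nibbles (5, then 14), that is 8 bits for 20 bits of input.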

Kolja Sulimma

rickman

Jun 21, 2004, 5:27:06 PM

I see that no one has addressed the basic issue of just how compressible
these files are and when you can expect to achieve good compression and
bad compression.

The runs of 1's and 0's are typically located in areas of the bitstream
that represent unused portions of the chip. So for designs that are
sparse, you can get high levels of compression, not unlike the basic
form of compression that Xilinx provides in the Virtex chips (frame
compression). But as the utilization of the chip goes up, the bitstream
becomes more random and the compressibility of the bitstream goes down.
It largely does not matter how you compress the data, once the chip is
largely used, you won't be able to get much compression.

So in the end, compression will help you reduce the size of your
bitstream when the design is much smaller than the chip (where you could use
a smaller chip), but if your design grows the compression will be
reduced and you will end up needing nearly as large a memory as an
uncompressed bit stream. So reducing the size of the chip may be a
better solution if your memory will ultimately limit the size of your
design. A smaller FPGA reduces the size of the bit stream and also
costs less.
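
The effect can be illustrated in Python with synthetic data, which is only a stand-in for real bitstreams: one buffer that is mostly zeroes (like a sparse design) and one that is random throughout (like a very full one). The 5% fill figure, the seed, and the names are arbitrary choices for this sketch.

```python
import random
import zlib

def size_ratio(data: bytes) -> float:
    """Compressed size divided by original size."""
    return len(zlib.compress(data, 9)) / len(data)

rng = random.Random(2004)   # fixed seed so the run is repeatable
n = 1 << 16

# "Sparse design": about 5% of bytes carry random data, the rest are zero
sparse = bytes(rng.getrandbits(8) if rng.random() < 0.05 else 0x00
               for _ in range(n))
# "Full design": every byte is random
dense = bytes(rng.getrandbits(8) for _ in range(n))

print(f"sparse: {size_ratio(sparse):.2f}  dense: {size_ratio(dense):.2f}")
```

The sparse buffer shrinks to a small fraction of its size while the random one does not shrink at all, whatever the compressor. Real bitstreams fall somewhere in between.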

--

Rick "rickman" Collins

rick.c...@XYarius.com
Ignore the reply address. To email me use the above address with the XY
removed.

Arius - A Signal Processing Solutions Company
Specializing in DSP and FPGA design URL http://www.arius.com
4 King Ave 301-682-7772 Voice
Frederick, MD 21701-3110 301-682-7666 FAX
