FPGARelated.com
Forums

efinix bit stream question

Started by John Larkin November 27, 2022
We use the efinix T20 trion FPGA.

Questions about the config bit streams:

Are they always the same size, or does it depend on how much logic is
compiled? Would a simple application use less?

Are the streams very compressible? We have done some simple run-length
coding to greatly reduce the storage requirement for other FPGAs.
Configs tend to have long runs of 0's.

The T20/256 claims to need 5.4 megabits. I'd like to store the fpga
config and application code in a Raspberry Pi Pico, which has 2 MB of
onboard flash. Storing the full config would use about a third of
that, so reducing that would be useful.
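
As a quick check of the "about a third" estimate, the arithmetic can be sketched as follows. The constant and function names are illustrative, and the flash size assumes that "2 MB" means 2 MiB (2,097,152 bytes).

```python
# Rough flash-budget arithmetic for the figures quoted in the post.
CONFIG_BITS = 5_400_000          # "5.4 megabits" from the post
FLASH_BYTES = 2 * 1024 * 1024    # assumption: "2 MB" means 2 MiB


def flash_fraction(config_bits: int, flash_bytes: int) -> float:
    """Fraction of the flash taken by an uncompressed config image."""
    return (config_bits / 8) / flash_bytes


print(f"config bytes  : {CONFIG_BITS // 8}")
print(f"share of flash: {flash_fraction(CONFIG_BITS, FLASH_BYTES):.1%}")
```

With these numbers the uncompressed config is 675,000 bytes, a little under a third of the flash.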

On 27.11.22 at 05:34, John Larkin wrote:
> We use the efinix T20 trion FPGA.
>
> Questions about the config bit streams:
>
> Are they always the same size, or does it depend on how much logic is
> compiled? Would a simple application use less?

With Xilinx it would for sure. Never used efinix, but I would consider it broken if it didn't.

> Are the streams very compressible?

I would simply test example files with zip, zcat and similar. IIRC, there is even a flow-through decompressor.

> We have done some simple run-length
> coding to greatly reduce the storage requirement for other FPGAs.
> Configs tend to have long runs of 0's.
>
> The T20/256 claims to need 5.4 megabits. I'd like to store the fpga
> config and application code in a Raspberry Pi Pico, which has 2 MB of
> onboard flash. Storing the full config would use about a third of
> that, so reducing that would be useful.

cheers, Gerhard

On Sun, 27 Nov 2022 10:46:47 +0100, Gerhard Hoffmann <dk4xp@arcor.de>
wrote:

>> Are the streams very compressible?
>
>I would simply test example files with zip, zcat and similar.
>IIRC, there is even a flow-through decompressor.

I'm at home and don't have access to a compiled bitstream, and this is a discussion group.

I'll get a T20 bit stream Monday or Tuesday and see what it looks like. If there are many runs of 0's, compression and decompression are very simple. Or maybe a typical stream is just shorter than the max.

I recall a Xilinx or maybe Altera stream that compressed about 3:1 with a very simple algorithm. I think I compressed runs of 0's and 1's on that one, with a PowerBasic program.

We considered fancier dictionary-based schemes, sort of like Zip, but they weren't worth the hassle.

On Sun, 27 Nov 2022 08:16:57 -0800, John Larkin
<jlarkin@highlandSNIPMEtechnology.com> wrote:

>We considered fancier dictionary-based schemes, sort of like Zip, but
>they weren't worth the hassle.

I recall the conclusion that the best dictionary entry for a random data block is itself. Zip doesn't compress random binary data files very well. FPGA bit streams are nonrandom in having long runs of 0's.
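
The contrast between random data and zero-heavy data is easy to reproduce with a general-purpose compressor. The sketch below uses Python's zlib (DEFLATE, the same family Zip uses) on two synthetic 64 KiB blocks; neither block is a real bitstream, and the name `deflate_ratio` is only illustrative.

```python
import os
import zlib


def deflate_ratio(data: bytes) -> float:
    """Compressed size divided by original size (smaller is better)."""
    return len(zlib.compress(data, 9)) / len(data)


# Synthetic stand-ins, not real FPGA bitstreams.
random_block = os.urandom(64 * 1024)

sparse = bytearray(64 * 1024)            # all zero bytes to start with
for i in range(0, len(sparse), 512):
    sparse[i] = 0x80                     # one set bit every 512 bytes
sparse_block = bytes(sparse)

print(f"random data : {deflate_ratio(random_block):.3f}")
print(f"sparse data : {deflate_ratio(sparse_block):.3f}")
```

The random block should come out at roughly its original size, while the sparse block shrinks to a small fraction of it.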
On Sat, 26 Nov 2022 20:34:23 -0800, John Larkin
<jlarkin@highlandSNIPMEtechnology.com> wrote:

>Are the streams very compressible? We have done some simple run-length
>coding to greatly reduce the storage requirement for other FPGAs.
>Configs tend to have long runs of 0's.

Here's a T20 bit stream. The length seems to be constant vs functions coded, but there are enough runs of all 0's that it's probably worth compressing.

https://www.dropbox.com/s/vm247lntp78jm20/Efinix_T20_bitstream.hex?dl=0

The actual config file will be binary, not hex, of course.

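
Turning the hex text into the binary image is a one-liner in most languages. The sketch below assumes the .hex file is plain ASCII with two hex digits per byte separated by newlines or other whitespace; that layout is an assumption about the file, and `hex_text_to_bytes` is a hypothetical helper name.

```python
def hex_text_to_bytes(text: str) -> bytes:
    """Convert whitespace-separated two-digit hex values to raw bytes.

    Assumes every byte is written as exactly two hex digits.
    """
    return bytes.fromhex("".join(text.split()))


sample = "00\n80\n0A\nff\n"
print(hex_text_to_bytes(sample))
```

Reading the real file would be `hex_text_to_bytes(open(path).read())` for whatever local path holds the download.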
On 2/12/22 08:12, John Larkin wrote:
> Here's a T20 bit stream. The length seems to be constant vs functions
> coded, but there are enough runs of all 0's that it's probably worth
> compressing.
>
> https://www.dropbox.com/s/vm247lntp78jm20/Efinix_T20_bitstream.hex?dl=0
>
> The actual config file will be binary, not hex of course.

Gzip compresses your 2.0MB down to 105kB. The decompressor isn't tiny, but it's fairly small. The lz4 decompressor is tiny and still gets to 221kB, possibly less if you RLE first. bz2 gets it to 76kB, and xz or lzma to 72kB.

Compression is one area where it's best to rely on work done by people who understand the theory. Some of these algorithms have a tiny decompressor; the magic is in the compressor.

CH

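
A comparison like this can be repeated on any bitstream with Python's standard library. This is a minimal sketch: `compressed_sizes` is an illustrative name, zlib stands in for gzip (both are DEFLATE), and lz4 is left out because it is not in the standard library. The sample data is synthetic, so the sizes it prints say nothing about the figures quoted above.

```python
import bz2
import lzma
import zlib


def compressed_sizes(data: bytes) -> dict[str, int]:
    """Size in bytes of `data` under several stdlib compressors."""
    return {
        "raw": len(data),
        "zlib": len(zlib.compress(data, 9)),
        "bz2": len(bz2.compress(data, 9)),
        "lzma": len(lzma.compress(data, preset=9)),
    }


# Synthetic stand-in: mostly zero bytes with a repeating 9-byte pattern.
sample = bytes(50_000) + bytes.fromhex("804020100804020100") * 1000

for name, size in compressed_sizes(sample).items():
    print(f"{name:5s} {size:8d}")
```

Only the decompressor has to fit on the target, so the compressor can run at its slowest, most thorough setting on the build machine.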
On 01/12/2022 21:12, John Larkin wrote:
> Here's a T20 bit stream. The length seems to be constant vs functions
> coded, but there are enough runs of all 0's that it's probably worth
> compressing.
>
> https://www.dropbox.com/s/vm247lntp78jm20/Efinix_T20_bitstream.hex?dl=0
>
> The actual config file will be binary, not hex of course.

Quick scan with one of my utilities gives:

Filename : \users\martin\downloads\Efinix~1.hex
File size = 4071902
Entropy = 1.225 ( max. 5.545 )
States used = 3.40 ( max. 256 )
Zero frequency : 0-9 11-47 58-64 71-255

Most frequent bytes:

 48  30  "0"  2198086
 10   A  ...  1357302
 49  31  "1"    98740
 52  34  "4"    97072
 56  38  "8"    96870
 50  32  "2"    94906
 54  36  "6"    26994
 51  33  "3"    26880
 67  43  "C"    26478
 57  39  "9"    25500
 65  41  "A"     6820
 53  35  "5"     5944

The hex file consists mostly of character "0" bytes and linefeeds. Simple run length encoding would compact it a lot. It seems "7","B","D","E","F" are quite rare in these files.

The raw binary file obviously won't have the linefeeds and will be only one byte for every three in the ASCII .hex file so about 1.3M. Back of the envelope RLE might get you a ~20x decrease in size. The right compressor and it could be made a lot smaller.

If you put up the binary I'll scan that for byte entropy too.

--
Regards,
Martin Brown

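
The utility itself is not shown, but its figures look consistent with Shannon entropy in natural-log units: ln 256 is about 5.545, and exp(1.225) is about 3.40, which matches the "States used" line. Under that reading (an inference from the numbers, not something stated in the post), a similar scan could be sketched as:

```python
import math
from collections import Counter


def byte_stats(data: bytes):
    """Return (entropy in nats, exp(entropy), per-byte counts).

    exp(entropy) is the effective number of equally likely byte
    values, which is how "States used" is interpreted here.
    """
    counts = Counter(data)
    total = len(data)
    entropy = -sum((n / total) * math.log(n / total)
                   for n in counts.values())
    return entropy, math.exp(entropy), counts


# Tiny stand-in for hex text: mostly "0" characters and linefeeds.
sample = b"00\n" * 90 + b"80\n" * 10
entropy, states, counts = byte_stats(sample)
print(f"Entropy = {entropy:.3f} ( max. {math.log(256):.3f} )")
print(f"States used = {states:.2f} ( max. 256 )")
for value, n in counts.most_common(5):
    print(f"{value:3d} {value:02X} {n}")
```

Byte values missing from `counts` correspond to the "Zero frequency" ranges in the listing above.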
On 27/11/2022 16:16, John Larkin wrote:
> I'll get a T20 bit stream Monday or Tuesday and see what it looks
> like. If there are many runs of 0's, compression and decompression are
> very simple. Or maybe a typical stream is just shorter than the max.

Binary looks to have incredibly high redundancy and compressibility. One of the lowest byte entropy scores I have seen in a long time.

There appear to be strong correlations of identical blocks at strides of 9, 12, 24, 36 as well as huge runs of nul bytes. The odd one of 0a.

Also a quick eyeball reveals walking ones 80,40,20,10,08,04,02,01,00 at around 107227 (stride 9).

There is an incredibly long run of 15372 nul bytes at offset 143811.

RLE on the nul bytes should get you most of the way there, plus maybe some code to RLE the most obvious repeated sequences if you need a bit more.

--
Regards,
Martin Brown

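
A zero-only run-length scheme of the kind suggested here can be very small. The sketch below uses a made-up format, not anything from Efinix: a 0x00 byte is always followed by a count byte (1 to 255), and every non-zero byte is stored literally. Longer runs are split into several tokens.

```python
def rle0_encode(data: bytes) -> bytes:
    """Encode runs of zero bytes as (0x00, count); copy other bytes."""
    out = bytearray()
    i, n = 0, len(data)
    while i < n:
        if data[i] == 0:
            run = 1
            while i + run < n and data[i + run] == 0 and run < 255:
                run += 1
            out += bytes((0, run))       # token: marker byte, run length
            i += run
        else:
            out.append(data[i])          # literal non-zero byte
            i += 1
    return bytes(out)


def rle0_decode(blob: bytes) -> bytes:
    """Inverse of rle0_encode."""
    out = bytearray()
    i = 0
    while i < len(blob):
        if blob[i] == 0:
            out += bytes(blob[i + 1])    # bytes(n) is n zero bytes
            i += 2
        else:
            out.append(blob[i])
            i += 1
    return bytes(out)


sample = b"\x80" + bytes(15372) + b"\x01\x00\x02"
packed = rle0_encode(sample)
print(len(sample), "->", len(packed))
```

The decoder is a single loop with no tables, which is the main attraction on a small microcontroller. The cost is that an isolated zero byte grows to two bytes.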
On Fri, 2 Dec 2022 12:15:56 +0000, Martin Brown
<'''newspam'''@nonad.co.uk> wrote:

>Binary looks to have incredibly high redundancy and compressibility.
>One of the lowest byte entropy scores I have seen in a long time.

My comment was about really random data. An FPGA bit stream certainly has repeated patterns. One might build an N-bit structure, a multiplier or accumulator or filter or DDS, and bit-slice blocks are very likely repeated N times.

Maybe I can find some college kid who'd like to do a project or thesis to find or code a minimal decomp algorithm for efinix+Raspberry Pi, in exchange for some pittance.

I can imagine some dictionary-based thing where a dictionary entry is its own first occurrence in the bit file. The decompressor is basically scissors and a pot of glue.

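
A scheme where each dictionary entry is an earlier stretch of the output is essentially what LZ77-style coders do. The sketch below shows only the decoder, with a made-up token format (tuples of literals and back-references) chosen for readability rather than compactness.

```python
def lz_decode(tokens) -> bytes:
    """Rebuild data from ("lit", bytes) and ("copy", distance, length).

    A copy token re-reads bytes already written to the output,
    starting `distance` bytes back from the current end.
    """
    out = bytearray()
    for tok in tokens:
        if tok[0] == "lit":
            out += tok[1]
        else:
            _, distance, length = tok
            for _ in range(length):
                # Byte-at-a-time so a copy may overlap its own output.
                out.append(out[-distance])
    return bytes(out)


tokens = [
    ("lit", b"\x80\x40\x20"),
    ("copy", 3, 6),        # repeat the last three bytes twice more
    ("lit", b"\x00"),
    ("copy", 1, 4),        # extend a run of zero bytes
]
print(lz_decode(tokens).hex())
```

All the effort of finding good matches sits in the encoder; the decoder only copies.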
>There appear to be strong correlations of identical blocks at strides of
>9, 12, 24, 36 as well as huge runs of nul bytes. The odd one of 0a.
>
>Also a quick eyeball reveals walking ones 80,40,20,10,08,04,02,01,00
>at around 107227 (stride 9).
>
>There is an incredibly long run of 15372 nul bytes at offset 143811
>
>RLE the nul bytes should get you most of the way there and maybe some
>code to RLE the most obvious repeated sequences if you need a bit more.

I was thinking of just compressing runs of 0's, but there could be a few other smallish patterns that might not be horrible to stash in the decompressor dictionary. That presents the question, are there patterns that are common to *all* T20 bit streams?

I need a low-paid lackey.

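
One existing mechanism for stashing known patterns in the decompressor is zlib's preset dictionary. Whether any patterns are really common to all T20 bit streams is the open question above, so the dictionary contents below are placeholders and the function names are illustrative.

```python
import zlib


def compress_with_dict(data: bytes, zdict: bytes) -> bytes:
    """DEFLATE-compress `data`, primed with a preset dictionary."""
    comp = zlib.compressobj(level=9, zdict=zdict)
    return comp.compress(data) + comp.flush()


def decompress_with_dict(blob: bytes, zdict: bytes) -> bytes:
    """Inverse of compress_with_dict; needs the same dictionary."""
    decomp = zlib.decompressobj(zdict=zdict)
    return decomp.decompress(blob) + decomp.flush()


# Placeholder dictionary: a walking-ones pattern, purely illustrative.
preset = bytes.fromhex("804020100804020100")
payload = preset * 3 + b"\x11\x22"

blob = compress_with_dict(payload, preset)
print(len(payload), "->", len(blob))
```

The dictionary has to be stored alongside the decompressor, so it only pays off if the shared patterns are long or the same dictionary serves several bitstreams.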
On 02/12/2022 15:22, John Larkin wrote:
> My comment was about really random data. An FPGA bit stream certainly
> has repeated patterns. One might build a N-bit structure, a multiplier
> or accumulator or filter or DDS, and bit-slice blocks are very likely
> repeated N times.

I don't think an FPGA bitstream is anything remotely like random data. The vast majority of the bytes are zeroes (70%), then bytes with 1 bit set ~2% each, 2 bits set <0.7%.

It depends how hard you are prepared to work. Bytes with more than 3 bits set are comparatively rare. In your example the bytes 8A, A7, BF, DB, ED all appeared just once and the token BE did not occur at all.

In principle for this application you can afford to use insane amounts of CPU power to encode if it makes the decoder simpler and faster. My instinct is that it is only worth compressing enough to make room for whatever code has to fit into the same space.

I recall way back jumping through endless hoops to fit slightly more firmware code into 8k ROMs back in the days when 64k was a lot of RAM.

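
A breakdown by number of set bits is straightforward to compute. This sketch groups bytes by popcount, so the "1 bit set" share is the total over all eight single-bit values rather than a per-value figure. The function names are illustrative and the sample is synthetic.

```python
from collections import Counter


def popcount_share(data: bytes) -> dict[int, float]:
    """Share of bytes having 0, 1, ..., 8 bits set."""
    counts = Counter(bin(b).count("1") for b in data)
    return {k: counts.get(k, 0) / len(data) for k in range(9)}


def unused_values(data: bytes) -> list[int]:
    """Byte values that never occur in `data`."""
    seen = set(data)
    return [v for v in range(256) if v not in seen]


sample = bytes([0] * 7 + [0x80, 0x01, 0x03])
for bits, share in popcount_share(sample).items():
    print(f"{bits} bits set: {share:.1%}")
print("unused byte values:", len(unused_values(sample)))
```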
> Maybe I can find some college kid who'd like to do a project or thesis
> to find or code a minimal decomp algorithm for efinix+Raspberry Pi, in
> exchange for some pittance.

I used to have a university sandwich student for a year, and sometimes a student over the long vacation, and gave them projects that were interesting and otherwise wouldn't get done. The occasional one turned out to be exceptionally good. The rest did an OK job.

It is only worth doing if they can finish a project that you don't have the time to do. Usually something that involves taking a lot of raw data and looking to see if there is anything interesting going on.

> I can imagine some dictionary-based thing where a dictionary entry is
> its own first occurrence in the bit file. The decompressor is
> basically scissors and a pot of glue.

Judging by the way it looks to my correlator I would expect LHA type algorithms to do rather well on it. There is an inordinate amount of block duplication. A few simple subs will easily get you under 250k.
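
The correlator mentioned here is not shown. As a rough stand-in, one simple way to look for repeats at a fixed stride is to count how often a byte equals the byte `stride` positions later, ignoring pairs that are both zero so that long nul runs do not swamp the result. That weighting is a choice made for this sketch, not a description of the original tool.

```python
def stride_match_fraction(data: bytes, stride: int) -> float:
    """Fraction of non-zero byte pairs `stride` apart that are equal."""
    pairs = [(a, b) for a, b in zip(data, data[stride:]) if a or b]
    if not pairs:
        return 0.0
    return sum(a == b for a, b in pairs) / len(pairs)


# Synthetic data: the walking-ones pattern repeated with period 9.
sample = bytes.fromhex("804020100804020100") * 50

for stride in (4, 9, 12, 18):
    print(stride, f"{stride_match_fraction(sample, stride):.2f}")
```

Strides that score well are natural candidates for back-reference distances in an LZ-style coder.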
>> There appear to be strong correlations of identical blocks at strides of
>> 9, 12, 24, 36 as well as huge runs of nul bytes. The odd one of 0a.
>>
>> Also a quick eyeball reveals walking ones 80,40,20,10,08,04,02,01,00
>> at around 107227 (stride 9).
>>
>> There is an incredibly long run of 15372 nul bytes at offset 143811
>>
>> RLE the nul bytes should get you most of the way there and maybe some
>> code to RLE the most obvious repeated sequences if you need a bit more.
>
> I was thinking of just compressing runs of 0's, but there could be a
> few other smallish patterns that might not be horrible to stash in the
> decompressor dictionary. That presents the question, are there
> patterns that are common to *all* T20 bit streams?
>
> I need a low-paid lackey.

What stops you from having one? But you will get more use out of one that is paid the going rate.

--
Regards,
Martin Brown