FPGARelated.com
Forums

FIFO timing, the right way

Started by Piotr Wyderski April 22, 2019
Hi all,

I am working on a block that needs to accumulate (at least) K data items 
and then consume them in a burst, while the next group of items might be
flowing in. As the items are not consumed sequentially, a very efficient 
approach is to have a FIFO interface on the write side and a limited 
lookahead random access interface one on the read side. The read side 
works OK. The hard part turned out to be the FIFO full flag setting.
I would like to use the full capacity of the FIFO and stall the data 
writer at the correct moment. Say I want to copy the content of a very 
big ROM into a 4-elements deep FIFO, the basic logic would be as follows:

     writer: process(clock) is begin

         if (rising_edge(clock)) then

             fifo_write_request <= '0';

             if (fifo_full = '0') then

                 fifo_write_request <= '1';
                 rom_address <= rom_address + 1;

             end if;

         end if;

     end process;

In the FIFO interface part I have a 3-bit counter SIZE that calculates 
the number of items in the FIFO. I update it under the same "if 
(rising_edge(clock))" condition to follow the instantaneous 
fifo_write_request & fifo_read_request constellation ("00" => +0, "10" 
=> +1, "01" => -1, "11" => +0).

And here comes the problem: simply setting fifo_full <= (SIZE = 4)
adds a cycle of latency and the writer thinks it is allowed to write one 
more item, corrupting the data. To add insult to injury, SIZE becomes 5
and fifo_full is de-aserted, so the writer pumps in even more data.
Changing it to fifo_full <= (SIZE = 3) behaves equally wrong, just moves
the problem one cycle to the left.

If I move the following into the concurrent part:

   fifo_full <= (SIZE = 4) or ((SIZE = 3) and (fifo_write_request = '1'))

(to account for a possible ongoing write), everything works great, 
because the condition is very logical (no pun intended). But I think
this solution is lame due to its purely combinatorial nature. So, could
you please tell me either:

a) that it is not lame, if it works, it works and and my concerns are 
based on a superstition.

b) or the correct way to setup the flag synchronously.

I don't care if the read side starts as soon as possible (including the 
ongoing write) or one cycle later, so it can be excluded from the 
write-side logic. This is to drive a string of WS2812 LEDs.

	Best regards, Piotr
Hello

On 22/04/2019 08:42, Piotr Wyderski wrote:
[...]
> In the FIFO interface part I have a 3-bit counter SIZE that calculates > the number of items in the FIFO. I update it under the same "if > (rising_edge(clock))" condition to follow the instantaneous > fifo_write_request & fifo_read_request constellation ("00" => +0, "10" > => +1, "01" => -1, "11" => +0). > > And here comes the problem: simply setting fifo_full <= (SIZE = 4) > adds a cycle of latency and the writer thinks it is allowed to write one > more item, corrupting the data. To add insult to injury, SIZE becomes 5 > and fifo_full is de-aserted, so the writer pumps in even more data. > Changing it to fifo_full <= (SIZE = 3) behaves equally wrong, just moves > the problem one cycle to the left. > > If I move the following into the concurrent part: > > &nbsp; fifo_full <= (SIZE = 4) or ((SIZE = 3) and (fifo_write_request = '1'))
Either you set the flag at the same time that the counter reaches its max value in a sequential process : fifo_full <= ((SIZE = 3) and (fifo_write_request = '1')); which has a one cycle latency on the flag test side Or you do it combinatorially like you did, because sometimes there is no other way (and anyway, however you write your logic, it always ends up with a bunch of combinatorial stuff followed by a flip-flop) Instead of comparing SIZE = 4, you could use SIZE > 3 which takes care of the case where SIZE goes beyond 4 and is very simple in terms of logic because it only needs the most significant bit of SIZE. BTW, note that in the expression above, fifo_full is of boolean type, not std_logic) Nicolas
Hi Nicholas,

Nicolas Matringe wrote:

> Either you set the flag at the same time that the counter reaches its > max value in a sequential process : > > fifo_full <= ((SIZE = 3) and (fifo_write_request = '1')); > > which has a one cycle latency on the flag test side
It is deceptively simple, but unfortunately it doesn't work. This is a write-only (no dequeue ops) simulation for a 4-entry FIFO: https://i.postimg.cc/MHvqvmmt/sim1.png All values are signals, i.e. the delay is included. It shows two bugs: 5 items have been accepted and no permanent stall is produced.
> Or you do it combinatorially like you did, because sometimes there is no > other way (and anyway, however you write your logic, it always ends up > with a bunch of combinatorial stuff followed by a flip-flop)
The combinatorial solution works, but for some unknown reason I am not particularly fond of it and wanted to consult it with experts here.
> Instead of comparing SIZE = 4, you could use SIZE > 3 which takes care > of the case where SIZE goes beyond 4 and is very simple in terms of > logic because it only needs the most significant bit of SIZE.
SIZE > 4 means the disaster has already happened.
> BTW, note that in the expression above, fifo_full is of boolean type, > not std_logic)
Sure, I just wanted to show the general idea, not to introduce the otherwise required strong typing as an extra layer of obfuscation. Best regards, Piotr
On 22/04/2019 12:29, Piotr Wyderski wrote:
> Hi Nicholas, > > Nicolas Matringe wrote: > >> Either you set the flag at the same time that the counter reaches its >> max value in a sequential process : >> >> fifo_full <= ((SIZE = 3) and (fifo_write_request = '1')); >> >> which has a one cycle latency on the flag test side > > It is deceptively simple, but unfortunately it doesn't work. > This is a write-only (no dequeue ops) simulation for a 4-entry FIFO: > > https://i.postimg.cc/MHvqvmmt/sim1.png
You could generate an additional fifo_almost_full signal that is set when SIZE reaches 3 and use it like this, along with fifo_full and fifo_write_request : ... if (fifo_full = 0) and ((fifo_almost_full and fifo_write_request) = 0) then fifo_write_request <= '1'; rom_address <= rom_address + 1; end if; ...
>> Instead of comparing SIZE = 4, you could use SIZE > 3 which takes care >> of the case where SIZE goes beyond 4 and is very simple in terms of >> logic because it only needs the most significant bit of SIZE. > > SIZE > 4 means the disaster has already happened.
Inded but what about SIZE > 3, as I stated ? It gives you simple comparison logic and disaster mitigating (stalling whatever the value is once above the limit) In general when designing these sorts of things I find it helps a huge lot to draw the timing waveforms by hand before starting to code. Nicolas
On Monday, April 22, 2019 at 2:42:29 AM UTC-4, Piotr Wyderski wrote:
> Hi all, > > I am working on a block that needs to accumulate (at least) K data items > and then consume them in a burst, while the next group of items might be > flowing in. As the items are not consumed sequentially, a very efficient > approach is to have a FIFO interface on the write side and a limited > lookahead random access interface one on the read side. The read side > works OK. The hard part turned out to be the FIFO full flag setting. > I would like to use the full capacity of the FIFO and stall the data > writer at the correct moment. Say I want to copy the content of a very > big ROM into a 4-elements deep FIFO, the basic logic would be as follows: > > writer: process(clock) is begin > > if (rising_edge(clock)) then > > fifo_write_request <= '0'; > > if (fifo_full = '0') then > > fifo_write_request <= '1'; > rom_address <= rom_address + 1; > > end if; > > end if; > > end process; > > In the FIFO interface part I have a 3-bit counter SIZE that calculates > the number of items in the FIFO. I update it under the same "if > (rising_edge(clock))" condition to follow the instantaneous > fifo_write_request & fifo_read_request constellation ("00" => +0, "10" > => +1, "01" => -1, "11" => +0). > > And here comes the problem: simply setting fifo_full <= (SIZE = 4) > adds a cycle of latency and the writer thinks it is allowed to write one > more item, corrupting the data. To add insult to injury, SIZE becomes 5 > and fifo_full is de-aserted, so the writer pumps in even more data. > Changing it to fifo_full <= (SIZE = 3) behaves equally wrong, just moves > the problem one cycle to the left. > > If I move the following into the concurrent part: > > fifo_full <= (SIZE = 4) or ((SIZE = 3) and (fifo_write_request = '1')) > > (to account for a possible ongoing write), everything works great, > because the condition is very logical (no pun intended). But I think > this solution is lame due to its purely combinatorial nature. So, could > you please tell me either: > > a) that it is not lame, if it works, it works and and my concerns are > based on a superstition. > > b) or the correct way to setup the flag synchronously. > > I don't care if the read side starts as soon as possible (including the > ongoing write) or one cycle later, so it can be excluded from the > write-side logic. This is to drive a string of WS2812 LEDs. > > Best regards, Piotr
The usual procedure is to set full and empty inside a clocked process. I'm assuming that both read and write are clocked by a single clock. If read and write are in different clock domains then you need a dual clock fifo which would go beyond the scope of what you're asking for here. Normally you want FIFO flag outputs to be clocked since that will maximize the amount of time available for the external logic that will use those flags. If you use 'just logic' to generate the flags then you run the risk of that FIFO flag needlessly becoming part of the critical timing path. In order to set the full flag synchronously, you simply set it when there are 'n-1' things in the FIFO and there is a write request and no read request. VHDL sample code would be: if (Reset = '1') then WrUsedw <= 0; Full <= '0'; elsif (Write = '1') then if (Read = '0') then if (WrUsedw >= FIFO_DEPTH - 1) then Full <= '1'; end if; WrUsedw <= WrUsedw + 1; end if; elsif (Read = '1') then Full <= '0'; WrUsedw <= WrUsedw - 1; end if; end if; Kevin Jennings