Hi,

I'm trying to implement a heavily pipelined design on a Xilinx FPGA: let's say 50 stages, each with a data width of around 30 bits. I want to halt, or "freeze", the pipeline at certain times. One way would be to tie together the clock-enable inputs of all those registers and control the pipeline with that single clock_enable signal. However, I'm worried about that control signal's fan-out (probably around 1500, i.e. 50 stages x 30 bits).

Do you think this is good design practice? Any suggestions or similar experience?

BTW, will using the clock-enable input of the FFs affect the amount of consumed resources? Or is it simply a built-in feature that we can choose to use or not?
Heavily pipelined design
Started by ●February 1, 2009
Reply by ●February 1, 2009
> Hi,
>
> I'm trying to implement a design on the Xilinx FPGA which is heavily
> pipelined, let's say 50 stages, and each stage has a data width of
> around 30 bits. Now, I want to halt the pipeline or "freeze" it at
> some instances. One way would be to tie up all the clock enable
> signals of those registers and then control the pipeline using this
> clock_enable signal. However, I'm afraid about that control signal's
> fan-out (probably something around 1500).

Can you anticipate this freeze?

> Do you think it's a good design practice? Any suggestions or similar
> experience?
>
> BTW, Will using the clock-enable signal of FFs affect the amount of
> consumed resources? Or they are simply built-in features that we can
> choose either to use or not to use?

It will probably consume routing resources.

---Matthew Hicks
Reply by ●February 1, 2009
On Jan 31, 8:34 pm, Ehsan <ehsan.hosse...@gmail.com> wrote:
> Hi,
>
> I'm trying to implement a design on the Xilinx FPGA which is heavily
> pipelined, let's say 50 stages, and each stage has a data width of
> around 30 bits. Now, I want to halt the pipeline or "freeze" it at
> some instances. One way would be to tie up all the clock enable
> signals of those registers and then control the pipeline using this
> clock_enable signal. However, I'm afraid about that control signal's
> fan-out (probably something around 1500).
>
> Do you think it's a good design practice? Any suggestions or similar
> experience?
>
> BTW, Will using the clock-enable signal of FFs affect the amount of
> consumed resources? Or they are simply built-in features that we can
> choose either to use or not to use?

Ehsan, you did not mention the clock frequency, only the high fan-out of 1500.
You can gate the clock instead, using the enable input of the global clock buffer/multiplexer.
That seems like a better solution to me.

Peter Alfke
Reply by ●February 1, 2009
"Peter Alfke":
> On Jan 31, 8:34 pm, Ehsan <ehsan.hosse...@gmail.com> wrote:

>> I'm trying to implement a design on the Xilinx FPGA which is heavily
>> pipelined, let's say 50 stages, and each stage has a data width of
>> around 30 bits. Now, I want to halt the pipeline or "freeze" it at
>> some instances. One way would be to tie up all the clock enable
>> signals of those registers and then control the pipeline using this
>> clock_enable signal. However, I'm afraid about that control signal's
>> fan-out (probably something around 1500).

Can the ce signal be pipelined, too? If not: does it actually have a high combinatorial delay compared to the expected clock period?

>> Do you think it's a good design practice? Any suggestions or similar
>> experience?

Hm, I'm not fully sure about this. At least it might be the way to go if you don't want the tool chain to bore you with warnings about things you intended.

>> BTW, Will using the clock-enable signal of FFs affect the amount of
>> consumed resources? Or they are simply built-in features that we can
>> choose either to use or not to use?

You'll likely already have a lot of ce signals, and many of them might actually be non-combinatorial, so combining them with a global ce would require some combinatorial logic per local ce.

> Ehsan, you did not mention the clock frequency, only the high fan-out
> of 1500.
> You can gate the clock, with the enable input to the global clock
> buffer/multiplexer.
> Seems like a better solution to me.

But if the pipeline requires quarter-clocks, this would end up needing two additional global clock buffers. That is probably a little too much compared to the percentage of area usage that 1.5K FFs might mean.

Regards,

Jan Bruns
Reply by ●February 1, 2009
Ehsan <ehsan.hosseini@gmail.com> wrote:
>Hi,
>
>I'm trying to implement a design on the Xilinx FPGA which is heavily
>pipelined, let's say 50 stages, and each stage has a data width of
>around 30 bits. Now, I want to halt the pipeline or "freeze" it at
>some instances. One way would be to tie up all the clock enable
>signals of those registers and then control the pipeline using this
>clock_enable signal. However, I'm afraid about that control signal's
>fan-out (probably something around 1500).
>
>Do you think it's a good design practice? Any suggestions or similar
>experience?

Just try it first. If the signal gets too slow or too heavily loaded, create multiple clock-enable signals that all do the same thing, so the load is divided between them. But my guess is that the routing software already takes care of this issue.

>BTW, Will using the clock-enable signal of FFs affect the amount of
>consumed resources? Or they are simply built-in features that we can
>choose either to use or not to use?

The CE is usually folded into the combinatorial logic, so expect some additional combinatorial logic to be used.

--
Failure does not prove something is impossible, failure simply indicates you are not using the right tools...
"If it doesn't fit, use a bigger hammer!"
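As a rough illustration of the "multiple clock enable signals" suggestion above, here is a back-of-the-envelope sketch in Python. It only does the arithmetic of dividing the roughly 1500 loads among duplicated CE registers. The per-copy fan-out limit of 100 is an arbitrary assumption for the example, not a Xilinx figure, and the function names are hypothetical.

```python
import math


def ce_copies_needed(total_loads: int, max_fanout: int) -> int:
    """Smallest number of duplicated CE registers such that no single
    copy drives more than max_fanout flip-flop enable inputs."""
    return math.ceil(total_loads / max_fanout)


def split_loads(total_loads: int, copies: int) -> list:
    """Distribute the loads as evenly as possible over the CE copies."""
    base, extra = divmod(total_loads, copies)
    return [base + 1 if i < extra else base for i in range(copies)]


stages, width = 50, 30          # figures from the original question
total = stages * width          # 1500 clock-enable loads
assumed_limit = 100             # ASSUMPTION: illustrative per-copy limit
copies = ce_copies_needed(total, assumed_limit)
print(copies, split_loads(total, copies))
```

In practice the synthesis tool may do this register duplication on its own, as the post suggests, so this is only meant to show the scale of the problem.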
Reply by ●February 1, 2009
On 2009-02-01, Ehsan <ehsan.hosseini@gmail.com> wrote:
> Matthew,
> No, the freeze cannot be anticipated. Actually, the output of the
> pipeline would be connected to a FIFO and once the FIFO is full the
> pipe should be halted. The data within the stages should not be lost.
> However, we can change the pipeline design so that the number of
> working cycles are known. I mean, we can wait until the FIFO becomes
> empty and then start writing to it. In the mean time, there would be
> no FIFO reads until it gets full. Therefore, the number of working
> cycles are known, i.e. the size of the FIFO.

Could you create a high-watermark signal in the FIFO that is asserted when there is room for only around 4 more words? If so, you will have plenty of time to propagate a pipelined "clock enable" signal to your datapath, with a moderate amount of fan-out at every level of pipelining in your CE signal.

/Andreas
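The high-watermark idea above can be checked with a small cycle-level model. This is only a sketch in Python under simplifying assumptions: the datapath writes one word into the FIFO on every enabled cycle, the "not almost full" decision passes through `latency` registers before it reaches the datapath, and all names (`peak_fifo_occupancy`, `margin`, `read_at`) are made up for the illustration. In this model the FIFO never overflows as long as the watermark margin is at least the number of registers in the CE path.

```python
from collections import deque


def peak_fifo_occupancy(depth, margin, latency, cycles,
                        read_at=lambda cycle: False):
    """Return the highest FIFO occupancy seen during the run.

    depth   -- FIFO capacity in words
    margin  -- almost-full asserts when free space <= margin
    latency -- number of registers in the clock-enable path
    read_at -- callable telling whether the consumer reads in a cycle
    A result greater than depth means words would have been lost.
    """
    occupancy = 0
    peak = 0
    ce_pipe = deque([True] * latency)  # pipelined clock-enable registers
    for cycle in range(cycles):
        almost_full = (depth - occupancy) <= margin
        ce_pipe.appendleft(not almost_full)  # newest decision enters
        enable = ce_pipe.pop()               # decision made `latency` cycles ago
        if enable:
            occupancy += 1                   # datapath writes one word
        peak = max(peak, occupancy)
        if occupancy > 0 and read_at(cycle):
            occupancy -= 1                   # consumer removes one word
    return peak


# Worst case: the consumer never reads.
print(peak_fifo_occupancy(depth=16, margin=4, latency=4, cycles=200))
print(peak_fifo_occupancy(depth=16, margin=3, latency=4, cycles=200))
```

The second call uses a margin one word too small for four CE registers, so the reported peak exceeds the FIFO depth.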
Reply by ●February 1, 2009
On Feb 1, 1:02 pm, Matthew Hicks <mdhic...@uiuc.edu> wrote:
> > Hi,
> >
> > I'm trying to implement a design on the Xilinx FPGA which is heavily
> > pipelined, let's say 50 stages, and each stage has a data width of
> > around 30 bits. Now, I want to halt the pipeline or "freeze" it at
> > some instances. One way would be to tie up all the clock enable
> > signals of those registers and then control the pipeline using this
> > clock_enable signal. However, I'm afraid about that control signal's
> > fan-out (probably something around 1500).
>
> Can you anticipate this freeze?
>
> > Do you think it's a good design practice? Any suggestions or similar
> > experience?
> >
> > BTW, Will using the clock-enable signal of FFs affect the amount of
> > consumed resources? Or they are simply built-in features that we can
> > choose either to use or not to use?
>
> It will probably consume routing resources.
>
> ---Matthew Hicks

Matthew,

No, the freeze cannot be anticipated. Actually, the output of the pipeline is connected to a FIFO, and once the FIFO is full the pipe should be halted. The data within the stages must not be lost. However, we can change the pipeline design so that the number of working cycles is known. I mean, we can wait until the FIFO becomes empty and only then start writing to it. In the meantime, there would be no FIFO reads until it gets full. Therefore, the number of working cycles is known, i.e. the size of the FIFO.
Reply by ●February 1, 2009
On Feb 1, 2:26 pm, Peter Alfke <al...@sbcglobal.net> wrote:
> On Jan 31, 8:34 pm, Ehsan <ehsan.hosse...@gmail.com> wrote:
>
> > Hi,
> >
> > I'm trying to implement a design on the Xilinx FPGA which is heavily
> > pipelined, let's say 50 stages, and each stage has a data width of
> > around 30 bits. Now, I want to halt the pipeline or "freeze" it at
> > some instances. One way would be to tie up all the clock enable
> > signals of those registers and then control the pipeline using this
> > clock_enable signal. However, I'm afraid about that control signal's
> > fan-out (probably something around 1500).
> >
> > Do you think it's a good design practice? Any suggestions or similar
> > experience?
> >
> > BTW, Will using the clock-enable signal of FFs affect the amount of
> > consumed resources? Or they are simply built-in features that we can
> > choose either to use or not to use?
>
> Ehsan, you did not mention the clock frequency, only the high fan-out
> of 1500.
> You can gate the clock, with the enable input to the global clock
> buffer/multiplexer.
> Seems like a better solution to me.
> Peter Alfke

Peter,

The clock would be at, say, 100 MHz, and the FPGA is a Virtex-4 FX100, speed grade -10. I have not tried clock gating so far, but I was thinking of something similar: using the clock resources for the global ce. Since they are routed all over the chip and have low skew, they might help.
Reply by ●February 1, 2009
On Sat, 31 Jan 2009 20:34:56 -0800 (PST), Ehsan wrote:
>I'm trying to implement a design on the Xilinx FPGA which is heavily
>pipelined, let's say 50 stages, and each stage has a data width of
>around 30 bits. Now, I want to halt the pipeline or "freeze" it at
>some instances. One way would be to tie up all the clock enable
>signals of those registers and then control the pipeline using this
>clock_enable signal. However, I'm afraid about that control signal's
>fan-out (probably something around 1500).
>
>Do you think it's a good design practice? Any suggestions or similar
>experience?

From a later post it seems you're stalling the pipe because of back-pressure from its output (sink) FIFO. I agree that it seems pretty ugly to do this by propagating a _combinational_ signal all the way back up the pipe, when you've taken so much trouble to make each stage _registered_ (by pipelining) in the forward direction.

One possible solution is to provide a 1-place holding buffer (effectively a 1-deep FIFO) every few stages. Having the holding buffer there means that you can now register the back-pressure "freeze" signal, introducing a cycle of latency in the freeze operation; the holding buffer catches the word that is still in flight during that cycle. This costs an additional data register, but you don't need one at every stage, just often enough to alleviate the fanout issues you describe. I suspect roughly one buffer per 5 pipe stages is about the right compromise, but that is very much a guess.

This arrangement also introduces a chain of registers into the freeze signal's path, which is likely to make it far easier for the tools to deal with any large fanout issues that may arise.

One final point: introducing these holding buffers along the pipe effectively gives the whole pipe some FIFO behaviour. That may mean that you don't need the pipe's output FIFO at all, which could be another useful resource trade-off.
Obviously a full-dress FIFO localised at the end of the pipe will be implemented in blockRAM, whereas the holding buffers will burn up logic cells, so this probably isn't a real win in practice.
--
Jonathan Bromley, Consultant
DOULOS - Developing Design Know-how
VHDL * Verilog * SystemC * e * Perl * Tcl/Tk * Project Services
Doulos Ltd., 22 Market Place, Ringwood, BH24 1AW, UK
jonathan.bromley@MYCOMPANY.com
http://www.MYCOMPANY.com
The contents of this message may contain personal views which are not the views of Doulos Ltd., unless specifically stated.
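To make the holding-buffer idea in the post above more concrete, here is a minimal behavioural sketch in Python (not HDL, and not code from the thread). It models a single pipeline register with a 1-place holding buffer behind it. The class and function names are hypothetical, and the upstream segment is reduced to a simple counter that advances whenever it is not frozen. The point is that `ready` depends only on stored state, so the upstream sees the freeze one cycle late, and the extra word that arrives in that cycle lands in the holding register instead of being lost.

```python
import random


class HoldingBuffer:
    """One pipeline register plus a 1-place holding register."""

    def __init__(self):
        self.main = None   # normal pipeline register (None = empty)
        self.hold = None   # holding register for the in-flight word

    @property
    def ready(self):
        # Registered back-pressure: a function of stored state only.
        return self.hold is None

    def step(self, in_data, down_ready):
        """Advance one clock; return (accepted, word_sent_downstream)."""
        accepted = in_data is not None and self.ready
        out_data = self.main if down_ready else None
        main_free = self.main is None or down_ready
        if main_free:
            if self.hold is not None:
                # Drain the held word first so ordering is preserved.
                self.main, self.hold = self.hold, None
            else:
                self.main = in_data if accepted else None
        elif accepted:
            # Downstream stalled this cycle: park the in-flight word.
            self.hold = in_data
        return accepted, out_data


def run(num_words, stall_probability, seed=0):
    """Push 0..num_words-1 through the buffer with random sink stalls."""
    rng = random.Random(seed)
    buf = HoldingBuffer()
    next_word, received = 0, []
    while len(received) < num_words:
        offer = next_word if next_word < num_words else None
        down_ready = rng.random() >= stall_probability
        accepted, out = buf.step(offer, down_ready)
        if accepted:
            next_word += 1
        if out is not None:
            received.append(out)
    return received


print(run(10, 0.5) == list(range(10)))
```

In a real design there would be one such buffer every few stages, each freezing only its own upstream segment, which is what keeps the fan-out of each freeze signal small.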
Reply by ●February 1, 2009
>Could you create a high watermark signal in the FIFO when there is
>room for around 4 more words in the FIFO? If so you will have plenty
>of time to propagate a pipelined "clock enable" signal to your
>datapath with a moderate amount of fan-out for every level of
>pipelining in your CE signal.

Most FIFOs have an almost-full signal. It's needed for things like this.

--
These are my opinions, not necessarily my employer's. I hate spam.





