FPGARelated.com

Auto pipeline logic??

Started by Davy June 14, 2005
Hi all,

Pipelining by hand in HDL is hard work. I found that some tools, such
as Synplify, have pipelining features, but the pipelining they provide
only inserts registers between RAM and logic.

My question is: is there a tool that pipelines logic automatically? For
example, I want to pipeline a block of logic by inserting N registers.
If such a tool exists, what does it modify: the HDL or the netlist?

Best regards,
Davy

I think Xilinx, or maybe it's Mentor Graphics, has a tool called
Precision that is supposed to "push or pull" registers in order to meet
timing criteria.  I don't know what it costs.

b r a d @ a i v i s i o n . c o m



Hi Davy,

> My question is: Is there a tool to auto pipeline the logic? For
> example, I want to pipeline the logic by insert N regs. And if there
> exists such a tool, what does it modify, HDL or netlist level?

Both Synplify Pro and XST (and probably other synthesis tools) can do
this to some extent. It's also often called register re-timing. It works
like this: you can write your big block of combinatorial logic in HDL,
then add N registers after it (also in HDL), then get the tools to push
the registers around to optimize the timing of the circuit.

That's the theory! In practice you'd be somewhat foolish to rely on this
today (try it!). However, it's likely to become more and more important
in future.

An area in which XST does this fairly well is multipliers. If you use
the right settings and/or attributes, it's possible to write a
combinatorial multiply followed by a register-based delay line, and have
the registers pushed back into the adder tree automatically.

The original HDL source is untouched; it's the resulting netlist which
is optimized.

Cheers,

-Ben-

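As a rough sketch of the multiplier case described above, the source could look something like the following: a combinatorial multiply followed by a plain register delay line that the synthesis tool is then allowed to move. The entity name `mult_retime`, the generics `WIDTH` and `STAGES`, and the signal names are all made up for illustration. Re-timing itself still has to be switched on through whatever option or attribute the particular tool provides, which is not shown here.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical example entity: names, widths and STAGES are illustrative only.
entity mult_retime is
  generic (
    WIDTH  : positive := 18;  -- operand width (assumed)
    STAGES : positive := 3    -- number of registers offered to the re-timer
  );
  port (
    clk : in  std_logic;
    a   : in  unsigned(WIDTH-1 downto 0);
    b   : in  unsigned(WIDTH-1 downto 0);
    p   : out unsigned(2*WIDTH-1 downto 0)
  );
end entity mult_retime;

architecture rtl of mult_retime is
  type delay_line_t is array (0 to STAGES-1) of unsigned(2*WIDTH-1 downto 0);
  signal dly : delay_line_t;
begin
  process (clk)
  begin
    if rising_edge(clk) then
      -- Combinatorial multiply feeding the first register.
      dly(0) <= a * b;
      -- Plain delay line. No reset on purpose: resets can limit how far
      -- a tool is willing to move registers.
      for i in 1 to STAGES-1 loop
        dly(i) <= dly(i-1);
      end loop;
    end if;
  end process;

  -- The product appears STAGES clocks after the operands, wherever the
  -- tool ends up placing the registers.
  p <= dly(STAGES-1);
end architecture rtl;
```

The latency is fixed by the source (STAGES clocks); only the position of the registers inside the multiplier is left to the tool.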
Hi,

Yes, I know about re-timing. It just pushes and pulls existing registers
(relying on the original netlist); it does not insert registers.

Is there any tool that inserts registers?

Thanks!
Davy

> Yes I know re-timing, it just push pull the register(rely on the
> original netlist), but not insert register.
> Is there any tool to insert registers?

Well, inserting registers changes your design in a fundamental way.
Most circuits I can think of would just stop working if you added
registers to them at random. Only you, the designer, know exactly
how much pipelining it is legal to apply to a given part of your
circuit. So I don't believe such a tool exists - certainly not in the
general case.

Cheers,

-Ben-

Davy wrote:
> Hi,
>
> Yes I know re-timing, it just push pull the register(rely on the
> original netlist), but not insert register.
>
> Is there any tool to insert registers?
>

I would think that would be a very bad idea to try to do automatically - it would completely change your timing. It's one thing to automatically do re-timing to improve your margins or your maximum clock rate, but adding registers will change the function of your logic. You might just as well ask for a tool to insert extra logic to improve your design.

> Thanks!
> Davy
>

You have to insert your own registers to give the pipeline the desired
latency.
You can then let the tool move the logic across those boundaries.

How would you specify to the tool what you want pipelined, what you don't,
and what the expected final latency in clocks is?
You insert registers in the paths you want piped.

The tool I use to insert registers:  vi.


"Davy" <zhushenli@gmail.com> wrote in message
news:1118807485.054216.114960@g14g2000cwa.googlegroups.com...
> Hi,
>
> Yes I know re-timing, it just push pull the register(rely on the
> original netlist), but not insert register.
>
> Is there any tool to insert registers?
>
> Thanks!
> Davy

Ben Jones (ben.jones@xilinx.com) wrote:
: > Yes I know re-timing, it just push pull the register(rely on the
: > original netlist), but not insert register.
: > Is there any tool to insert registers?

: Well, inserting registers changes your design in a fundamental way.
: Most circuits I can think of would just stop working if you added
: registers to them at random. Only you, the designer, know exactly
: how much pipelining it is legal to apply to a given part of your
: circuit. So I don't believe such a tool exists - certainly not in the
: general case.

This is very true, but there's no reason a designer couldn't specify a
bunch of signals (e.g. the data signal from a combinatorial multiply and
associated control signals) and some tool would add arbitrary (up to a
user-specified limit) stages of pipelining to all signals to meet timing,
with logic/register shuffling.  This would only work if the control and
data flows can be arbitrarily pipelined, but many ops can be described
this way.

A halfway house to achieve this is to use the existing register shuffling
on the data signals, experimentally add registers until timing is met, and
then pipeline the associated control signals.  If done using a VHDL
generate etc., it's a two-second text editor job to do the latter.

If someone writes the tools then the whole operation could be scripted,
with the logic to be messed with and its associated signals isolated in a
source file.

All in all, constructive use of a text editor on the source is much less
hassle :-)

---
cds
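A minimal sketch of the "pipeline the associated control signals with a generate" step might look like this. The entity name `ctrl_delay`, the generic `LATENCY` and the `valid_in`/`valid_out` ports are hypothetical; the assumption is that a single control bit has to stay aligned with a data path that has been given LATENCY registers.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical helper: delays a control bit by LATENCY clocks so that it
-- stays aligned with a data path that was given LATENCY registers.
entity ctrl_delay is
  generic (
    LATENCY : natural := 3  -- must match the register count on the data path
  );
  port (
    clk       : in  std_logic;
    valid_in  : in  std_logic;
    valid_out : out std_logic
  );
end entity ctrl_delay;

architecture rtl of ctrl_delay is
  -- tap(0) is the undelayed input, tap(LATENCY) the fully delayed output.
  signal tap : std_logic_vector(0 to LATENCY);
begin
  tap(0) <= valid_in;

  -- One register per pipeline stage; with LATENCY = 0 nothing is generated
  -- and the input passes straight through.
  stages : for i in 1 to LATENCY generate
    process (clk)
    begin
      if rising_edge(clk) then
        tap(i) <= tap(i-1);
      end if;
    end process;
  end generate stages;

  valid_out <= tap(LATENCY);
end architecture rtl;
```

Changing the depth after a timing experiment is then a one-number edit to LATENCY, as long as the data path uses the same value.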
> : Well, inserting registers changes your design in a fundamental way.
> : Most circuits I can think of would just stop working if you added
> : registers to them at random. Only you, the designer, know exactly
> : how much pipelining it is legal to apply to a given part of your
> : circuit. So I don't believe such a tool exists - certainly not in the
> : general case.

> This is very true, but there's no reason a designer couldn't specify a
> bunch of signals (e.g. the data signal from a combinatorial multiply and
> associated control signals) and some tool would add arbitrary (to a user
> specified limit) stages of pipelining to all signals to meet timing, with
> logic/register shuffling. This would only work if the control and data
> flows can be arbitrarily pipelined, but many ops can be described this way.

True, although I don't see much merit in doing it that way. In FPGAs,
the pipeline registers are essentially free (because they're there after
every LUT, even if you don't use them). So you don't get much advantage
from "just" meeting timing - if you have four clock cycles to do
something in, then you might as well take all four - who cares? You'll
get better results out of the tools that way, too.

Of course, if you *do* care about the latency of your operations and you
want to minimize it, then you're already thinking in enough depth about
your design that an automated tool would be unnecessary.

Cheers,

-Ben-

P.S. Whenever someone says "automated tool" I immediately envisage a
smug paperclip: "I see you're trying to close timing - would you like
some help with that?" This of course means that all my subsequent
utterances on the subject can be safely disregarded. :-)

Hi Davy,

No tool that I know of, but you can write code in such a way that the
pipelining is configurable. I've written a few blocks where I could
adjust the pipelining of the block by changing a "pipelining schedule",
which was just an array variable marking the pipeline-able points in the
design. By changing this variable, I could change the number of pipeline
stages, and therefore the number of registers used and the fmax of
the design. This has worked pretty well for me for designs like binary
trees of arbitrary depth. At the top of the code I would have a
variable like:

  type schedule_t is array (natural range <>) of natural;
  constant pipeline_schedule : schedule_t(TREE_DEPTH-1 downto 0) := ( 0, 1, 1, 0, 1, 1 );

and further in the code where I would have, say a tree I would do
something like (this is not the actual code, just the idea here):

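  levels : for i in 1 to TREE_DEPTH-1 generate
    -- level i has LEAVES/2**i nodes; node j combines children 2*j and 2*j+1
    nodes : for j in 0 to LEAVES/(2**i)-1 generate
      comb : if pipeline_schedule(i) = 0 generate
        -- just a level of logic
        a(i)(j) <= max( a(i-1)(2*j), a(i-1)(2*j+1) );
      end generate comb;
      piped : if pipeline_schedule(i) = 1 generate
        -- create a pipeline stage
        process (clk)
        begin
          if rising_edge(clk) then
            a(i)(j) <= max( a(i-1)(2*j), a(i-1)(2*j+1) );
          end if;
        end process;
      end generate piped;
    end generate nodes;
  end generate levels;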

So wherever the variable pipeline_schedule has a 1, that level in the
tree would be pipelined. This works for a regular structure like a
tree. It would be more difficult to code something non-regular like a
complex control circuit with configurable pipelining. Anyway, this
method allowed me to balance an IP block between resource usage and
speed. You could extend the idea by automatically creating a schedule
if you knew how many levels of logic could be tolerated for a certain
fmax in a certain device (I didn't go this far; I created the schedule
empirically).

-- Pete
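For readers who want something they can compile, here is one possible self-contained version of the schedule idea, written as a small max-reduction tree. It is only a sketch under assumptions, not the original code: the package `max_tree_pkg`, the entity `max_tree`, the constants `WIDTH` and `LEVELS`, the port names and the `max` helper are all invented for the example, and the schedule is indexed per reduction level (1 to LEVELS) rather than exactly as in the post.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

package max_tree_pkg is
  constant WIDTH  : positive := 8;          -- sample width (assumed)
  constant LEVELS : positive := 3;          -- reduction levels
  constant LEAVES : positive := 2**LEVELS;  -- number of inputs
  subtype sample_t is unsigned(WIDTH-1 downto 0);
  type sample_vec_t is array (0 to LEAVES-1) of sample_t;
  type schedule_t is array (1 to LEVELS) of natural range 0 to 1;
  -- 1 = register the outputs of that level, 0 = leave it combinatorial.
  constant PIPELINE_SCHEDULE : schedule_t := (0, 1, 1);
  function max(l, r : sample_t) return sample_t;
end package max_tree_pkg;

package body max_tree_pkg is
  function max(l, r : sample_t) return sample_t is
  begin
    if l > r then return l; else return r; end if;
  end function max;
end package body max_tree_pkg;

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
use work.max_tree_pkg.all;

entity max_tree is
  port (
    clk    : in  std_logic;
    din    : in  sample_vec_t;
    result : out sample_t
  );
end entity max_tree;

architecture rtl of max_tree is
  -- a(0) holds the inputs; a(i) holds the outputs of reduction level i.
  -- Only the first LEAVES/2**i entries of a(i) are used.
  type level_vec_t is array (0 to LEVELS) of sample_vec_t;
  signal a : level_vec_t;
begin
  a(0) <= din;

  levels_g : for i in 1 to LEVELS generate
    nodes_g : for j in 0 to LEAVES/(2**i)-1 generate
      comb_g : if PIPELINE_SCHEDULE(i) = 0 generate
        -- Just a level of logic.
        a(i)(j) <= max(a(i-1)(2*j), a(i-1)(2*j+1));
      end generate comb_g;
      reg_g : if PIPELINE_SCHEDULE(i) = 1 generate
        -- Same logic, followed by a pipeline register.
        process (clk)
        begin
          if rising_edge(clk) then
            a(i)(j) <= max(a(i-1)(2*j), a(i-1)(2*j+1));
          end if;
        end process;
      end generate reg_g;
    end generate nodes_g;
  end generate levels_g;

  result <= a(LEVELS)(0);
end architecture rtl;
```

With the schedule (0, 1, 1) two of the three levels are registered, so the result lags the inputs by two clocks; any control signals travelling alongside the data need the same delay.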