Synthesis optimization people seem to like registers at I/O. In particular, the Xilinx manual says:

"The synthesis tools will not optimize across the Partition interface. If an asynchronous timing critical path crosses Partition boundaries, logic optimizations will not occur across the Partition boundary. To mitigate this issue, add a register to the asynchronous signal at the Partition boundary."

I like registers all over the design. Though they speak as if it were a game to inject a register at an arbitrary place.

On Sep 17, 10:44 am, valtih1978 <d...@not.email.me> wrote:
> Synthesis optimization people seem to like registers at I/O.
> Particularly, in Xilinx manual:
>
>    "The synthesis tools will not optimize across the Partition
> interface. If an asynchronous timing critical path crosses Partition
> boundaries, logic optimizations will not occur across the Partition
> boundary. To mitigate this issue, add a register to the asynchronous
> signal at the Partition boundary."
>
> I like the registers all over design. Though, they speak like it is game
> inject a register in arbitrary place.

In order to have reliable (deterministic and short) timing at the IO boundaries, you need to have registers in the IO.

The other comment that you referred to is also a very good practice. Registering an asynchronous input at the boundary will resolve the asynchronous event to a single clock edge within the module (metastability concerns aside), so that timing analysis can be done correctly and so that all parts of the module will "see" the same value. Since this is an asynchronous signal, and by its definition can happen at any time, adding a register has no practical impact.

These are not absolute rules. You are free to create your design in any way that you see fit, but when the design isn't stable and reliable you should remember these design tips.

You seem to be talking about primary I/O. Yet partitions are blocks of the same FPGA design, and they are under full control of the tools. Do you mean that partitions are treated as designs that are absolutely external to each other? Thank you.

On 9/17/2011 10:44 AM, valtih1978 wrote:
> Synthesis optimization people seem to like registers at I/O.
> Particularly, in Xilinx manual:
>
> "The synthesis tools will not optimize across the Partition interface.
> If an asynchronous timing critical path crosses Partition boundaries,
> logic optimizations will not occur across the Partition boundary. To
> mitigate this issue, add a register to the asynchronous signal at the
> Partition boundary."
>
> I like the registers all over design. Though, they speak like it is game
> inject a register in arbitrary place.

One of the games is register retiming to improve Fmax. If I describe a 128-input OR gate without registers, my Fmax will be very bad, and the tools can do nothing about it. If I pipeline the design with registers, synthesis can move the LUTs around to optimize Fmax.

-- Mike Treseler

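To put rough numbers on the 128-input OR example, here is a minimal sketch that counts LUT levels. It assumes the OR is built as a balanced tree of k-input LUTs and ignores routing delay entirely; the LUT sizes 4 and 6 and the function names `lut_levels` and `worst_stage_depth` are illustrative assumptions, not anything stated in the thread or taken from a vendor tool.

```python
def lut_levels(n_inputs: int, lut_size: int) -> int:
    """Levels of lut_size-input LUTs needed to reduce n_inputs to one output.

    Assumes a balanced reduction tree; lut_size is a hypothetical parameter.
    """
    if n_inputs < 1 or lut_size < 2:
        raise ValueError("need n_inputs >= 1 and lut_size >= 2")
    levels = 0
    remaining = n_inputs
    while remaining > 1:
        # Each level replaces every group of lut_size signals with one signal.
        remaining = -(-remaining // lut_size)  # ceiling division
        levels += 1
    return levels


def worst_stage_depth(levels: int, registers: int) -> int:
    """LUT levels in the deepest pipeline stage if the registers are spread
    as evenly as possible through the tree (routing delay is ignored)."""
    stages = registers + 1
    return -(-levels // stages)  # ceiling division


if __name__ == "__main__":
    for k in (4, 6):
        depth = lut_levels(128, k)
        print(f"{k}-input LUTs: {depth} levels unregistered, "
              f"{worst_stage_depth(depth, 1)} per stage with one register")
```

Under these assumptions, each register that retiming can push into the tree shortens the longest register-to-register path, which is the effect Mike describes.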
Mike Treseler <m...@gmail.com> wrote:

(snip)
>> "The synthesis tools will not optimize across the Partition interface.
>> If an asynchronous timing critical path crosses Partition boundaries,
>> logic optimizations will not occur across the Partition boundary. To
>> mitigate this issue, add a register to the asynchronous signal at the
>> Partition boundary."
(snip)

> One of the games is register retiming to improve Fmax.
> If I describe a 128 input OR gate without registers,
> my fmax will be very bad, and the tools can do nothing about it.
> If I pipeline the design with registers, syntheses can move
> the luts around to optimize Fmax.

It would be nice if the tools could help with the placing of such registers.

For the 128-input OR, I might know that it can have one pipeline register inside, but don't know where it should be. (My pipelines are usually more complicated than an OR gate, but the same idea applies.)

One possibility would be to have a set of registers that are optional, such that the tools should leave them in if it improves Fmax, but omit them if it doesn't. It will take a while to converge, but still better than trial and error.

-- glen

> Mike Treseler<m...@gmail.com> wrote:
>> One of the games is register retiming to improve Fmax.
>> If I describe a 128 input OR gate without registers,
>> my fmax will be very bad, and the tools can do nothing about it.
>> If I pipeline the design with registers, syntheses can move
>> the luts around to optimize Fmax.

On 9/24/2011 10:42 AM, glen herrmannsfeldt wrote:
> It would be nice if the tools could help placing of such registers.
>
> For the 128 input OR, I might know that it can have one pipeline
> register inside, but don't know where it should be.

It doesn't matter, at least with Quartus. I can put a shift(n) register on all the outputs, turn on register duplication and register retiming, and let synthesis have a go at it.

> (My pipelines
> are usually more complicated than an OR gate, but the same idea
> applies.)

Mine too. The gate example was just for clarity.

> One possibility would be to have a set of registers that are optional,
> such that the tools should leave them in if it improves Fmax, but
> omit them if it doesn't.

Would be nice, but that's not how it is with Quartus today. Trial and error works OK with a reasonable starting guess for n.

-- Mike Treseler

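The reason it "doesn't matter" where the designer puts the registers is that retiming preserves cycle-level behaviour. The following behavioural sketch (not Quartus's algorithm, and not HDL) compares a wide OR followed by a 2-deep shift register with a version where one register has been moved into the middle of the OR tree. The class names, the group size of 16, and the choice of n = 2 are all assumptions made for illustration.

```python
import random
from math import ceil


class OrThenShift:
    """What the designer writes: a wide OR feeding a 2-stage shift register."""

    def __init__(self):
        self.r1 = 0
        self.r2 = 0

    def clock(self, bits):
        # Both registers update on the same edge from their pre-edge values.
        self.r1, self.r2 = int(any(bits)), self.r1
        return self.r2


class RetimedOr:
    """A retimed equivalent: partial ORs are registered, then combined and
    registered again, so each stage contains a much smaller OR."""

    def __init__(self, n_inputs, group):
        self.group = group
        self.partial = [0] * ceil(n_inputs / group)
        self.out = 0

    def clock(self, bits):
        new_partial = [int(any(bits[i:i + self.group]))
                       for i in range(0, len(bits), self.group)]
        # Output register samples the OR of the *previous* partial results.
        self.partial, self.out = new_partial, int(any(self.partial))
        return self.out


def outputs_match(cycles=200, n_inputs=128, group=16, seed=1):
    """Drive both models with the same sparse random inputs and compare."""
    rng = random.Random(seed)
    a, b = OrThenShift(), RetimedOr(n_inputs, group)
    for _ in range(cycles):
        bits = [int(rng.random() < 0.005) for _ in range(n_inputs)]
        if a.clock(bits) != b.clock(bits):
            return False
    return True
```

Both models produce the OR of the inputs two clock edges later, so a synthesis tool is free to pick whichever register placement gives the best Fmax. Whether a particular tool actually finds that placement depends on its retiming settings.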
Mike Treseler <m...@gmail.com> wrote:

(snip, I wrote)
>> It would be nice if the tools could help placing of such registers.
>> For the 128 input OR, I might know that it can have one pipeline
>> register inside, but don't know where it should be.

> It doesn't matter, at least with quartus.
> I can put a shift(n) register on all the outputs,
> turn on reg dupe and reg retime, and let synthesis have a go at it.

>> (My pipelines are usually more complicated than an OR gate,
>> but the same idea applies.)

> Mine too. The gate example was just for clarity.

>> One possibility would be to have a set of registers that are optional,
>> such that the tools should leave them in if it improves Fmax, but
>> omit them if it doesn't.

> Would be nice, but that's not how it is with quartus today.
> Trial and error works OK with a reasonable starting guess for n.

OK, one specific problem. The designs I am interested in are linear arrays of pipelined elements. I can optimize the individual element, but I also need to optimize their placement in the arrays. It might be that the tools can optimize a linear (on-chip) array of unit cells well, but at some point the array will reach the chip boundary and need to turn around. The paths needed at that point are likely longer, so I might want to add additional registers there. I don't know very well where those points are.

In addition, there is the path from the I/O buffers to the first element, and from the last element to the I/O buffers, which again have complications in timing and placement.

Even more, the last time I tried this with the Xilinx tools, adding extra registers where I thought they might help, the tools implemented the double register as SRL16 cells (shift registers) instead of spacing them out as I had hoped.

-- glen

On 9/24/2011 3:28 PM, glen herrmannsfeldt wrote:
> OK, one specific problem. The designs I am interested in are
> linear arrays of pipelined elements. I can optimize the individual
> element, but I also need to optimize their placement in the arrays.
> It might be that the tools can well optimize a linear (on chip)
> array of unit cells, but sometime it will get to the chip boundary
> and need to turn around.

If the design does not fit in one FPGA, you are on your own.

> Even more, last time I tried this with the Xilinx tools, adding
> extra registers where I thought they might help, the tools implemented
> the double register as SRL16 cells, (shift registers), instead of
> spacing them out as I had hoped.

That means register retiming was not switched on. Try a simpler example to see if ISE can do it at all. To banish all SRL16s (or the Altera equivalents), add a reset input to the process.

Good luck.

-- Mike Treseler