Reply by glen herrmannsfeldt July 10, 2015
kaz <37480@fpgarelated> wrote:
>>kaz <37480@fpgarelated> wrote:
>>(snip, and previously snipped)
(snip)
>>> It is Virtex 6 and I have achieved better than 368 MHz at module level.
>>> At integration (into a very large project) it fails marginally. As I
>>> said
(snip, I wrote)
>>Do you mean that it fails, even when timing satisfies the
>>post-route timing data?
(snip)
> No, by module level I mean when compiled on its own. The actual
> project is not based on any incremental approach or logic lock
> but all the lower modules are just added to the project to
> be fitted freely anywhere the tool chooses.
I use Spartan, so it might be different.

Some years ago, I was working on a project where the pre-route timing was so good, maybe twice as fast as it needed to be, and I didn't even bother to look at the post-route timing until I tried it, and it didn't work.

At least for Spartan, the routing is optimized over the whole chip. Well, there are things that you can do to give hints and such, but it is easily possible that changing one part changes the timing of some unrelated part.

It might be that you can route some modules, keep them fixed while you route others. That probably makes more sense in big designs.

But whatever it does, it should meet post-route timing, and you should run that to be sure that is fast enough.

-- glen
Reply by kaz July 10, 2015
>kaz <37480@fpgarelated> wrote:
>
>(snip, and previously snipped)
>
>> It is Virtex 6 and I have achieved better than 368 MHz at module level.
>> At integration (into a very large project) it fails marginally. As I said,
>> it is the path from the read register to the FIFO data output (single
>> clock, 16 words deep, distributed RAM). I don't expect any logic apart
>> from FIFO stages implemented in LUTs. I am just asking if there is any
>> way to improve such paths. I tried block RAM and it failed very badly.
>
>Do you mean that it fails, even when timing satisfies the
>post-route timing data?
>
>That isn't good.
>
>-- glen
No, by module level I mean when compiled on its own. The actual project is not based on any incremental approach or logic lock but all the lower modules are just added to the project to be fitted freely anywhere the tool chooses.

Kaz
---------------------------------------
Posted through http://www.FPGARelated.com
Reply by glen herrmannsfeldt July 10, 2015
kaz <37480@fpgarelated> wrote:

(snip, and previously snipped)

> It is Virtex 6 and I have achieved better than 368 MHz at module level.
> At integration (into a very large project) it fails marginally. As I said,
> it is the path from the read register to the FIFO data output (single
> clock, 16 words deep, distributed RAM). I don't expect any logic apart
> from FIFO stages implemented in LUTs. I am just asking if there is any
> way to improve such paths. I tried block RAM and it failed very badly.

Do you mean that it fails, even when timing satisfies the post-route timing data?

That isn't good.

-- glen
Reply by kaz July 10, 2015
>
>What you're asking is device-related. What FPGA family are you using?
>Generally speaking, Xilinx devices can implement distributed memory
>FIFO pretty well, but you need to limit the depth of the FIFO to
>get it to run at high clock rates. Also there's a big difference
>between common-clock FIFO and independent-clock FIFO. The first
>kind can be implemented in SRL, the second cannot.
>
>Also it would be good to see a full failing path from your timing
>report (.twr file) to see if this is a logic level issue or
>a routing length issue.
>
>--
>Gabor
It is Virtex 6 and I have achieved better than 368 MHz at module level. At integration (into a very large project) it fails marginally. As I said, it is the path from the read register to the FIFO data output (single clock, 16 words deep, distributed RAM). I don't expect any logic apart from FIFO stages implemented in LUTs. I am just asking if there is any way to improve such paths. I tried block RAM and it failed very badly.

Kaz
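
For readers following the numbers, here is a rough sketch of the setup-time budget for a single register-to-register path at 368 MHz, which may help show why a path can pass standalone and fail marginally once the module is placed inside a crowded design. Only the 368 MHz target comes from the post above; the function names and every delay value are hypothetical placeholders, not Virtex-6 datasheet figures or numbers from an actual timing report.

```python
def period_ns(freq_mhz: float) -> float:
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / freq_mhz


def slack_ns(freq_mhz: float, clk_to_q: float, logic: float,
             routing: float, setup: float) -> float:
    """Setup slack of one register-to-register path (all delays in ns).

    Positive slack means the path meets timing, negative means it fails.
    Clock skew and jitter are ignored to keep the sketch short.
    """
    return period_ns(freq_mhz) - (clk_to_q + logic + routing + setup)


# Hypothetical delays, chosen only for illustration.
# Same logic in both cases; only the routing delay differs.
standalone = slack_ns(368.0, clk_to_q=0.4, logic=0.9, routing=1.0, setup=0.1)
integrated = slack_ns(368.0, clk_to_q=0.4, logic=0.9, routing=1.5, setup=0.1)

print(f"period     = {period_ns(368.0):.3f} ns")
print(f"standalone = {standalone:+.3f} ns slack")
print(f"integrated = {integrated:+.3f} ns slack")
```

With a period of roughly 2.7 ns, a few hundred picoseconds of extra routing is enough to turn a passing path into a marginal failure, even though the logic on the path is unchanged.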
Reply by GaborSzakacs July 10, 2015
kaz wrote:
> I have several FIFOs that are small and implemented as distributed
> RAM. There are some timing violations reported on paths between the source
> register of FIFO control signals (e.g. the read signal) and the FIFO data
> output.
>
> This raised some questions in my head as to how timing is assessed for such
> FIFOs (or SRLs for that matter). A FIFO or SRL chain uses LUTs plus an output
> register. Wouldn't that mean there is inherently a long path to the output
> register, or should we say it is long but not combinatorial? Is there
> any way to improve timing in designs like FIFOs or SRL chains?
>
> Regards
>
> Kaz
What you're asking is device-related. What FPGA family are you using? Generally speaking, Xilinx devices can implement distributed memory FIFO pretty well, but you need to limit the depth of the FIFO to get it to run at high clock rates. Also there's a big difference between common-clock FIFO and independent-clock FIFO. The first kind can be implemented in SRL, the second cannot.

Also it would be good to see a full failing path from your timing report (.twr file) to see if this is a logic level issue or a routing length issue.

--
Gabor
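
To illustrate why a common-clock FIFO maps onto an SRL, here is a small behavioral model in Python. It is only a sketch of the general technique (an addressable shift register plus an occupancy counter), not the FIFO that any vendor core actually generates; the class name `SrlFifo` and its interface are made up for this example. Because both ports share one clock, a single counter can serve as the read address, which is what an independent-clock FIFO cannot do.

```python
class SrlFifo:
    """Behavioral model of a common-clock FIFO on an addressable shift register."""

    def __init__(self, depth: int = 16):
        self.depth = depth
        self.srl = [0] * depth  # srl[0] always holds the newest word
        self.count = 0          # number of valid words currently stored

    def clock(self, wr_en: bool = False, din: int = 0, rd_en: bool = False):
        """Advance one clock edge. Returns the word read, or None if no read."""
        do_rd = rd_en and self.count > 0
        # A write to a full FIFO is allowed only if a read frees a slot.
        do_wr = wr_en and (self.count < self.depth or do_rd)

        # The oldest word sits at address count - 1. In hardware this
        # address comes from a register and the lookup is a LUT read.
        dout = self.srl[self.count - 1] if do_rd else None

        if do_wr:
            # Every write shifts the whole register by one position.
            self.srl = [din] + self.srl[:-1]
        self.count += int(do_wr) - int(do_rd)
        return dout


fifo = SrlFifo(depth=4)
for word in (1, 2, 3):
    fifo.clock(wr_en=True, din=word)
first = fifo.clock(rd_en=True)                      # oldest word
second = fifo.clock(wr_en=True, din=4, rd_en=True)  # read and write together
rest = [fifo.clock(rd_en=True) for _ in range(3)]   # drain, then read empty
print(first, second, rest)
```

In this model the read path is address register, then shift-register lookup, then whatever register captures the output, which corresponds to the read-register-to-data-output path discussed in this thread.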
Reply by kaz July 10, 2015
I have several FIFOs that are small and implemented as distributed
RAM. There are some timing violations reported on paths between the source
register of FIFO control signals (e.g. the read signal) and the FIFO data output.

This raised some questions in my head as to how timing is assessed for such
FIFOs (or SRLs for that matter). A FIFO or SRL chain uses LUTs plus an output
register. Wouldn't that mean there is inherently a long path to the output
register, or should we say it is long but not combinatorial? Is there
any way to improve timing in designs like FIFOs or SRL chains?

Regards

Kaz
