FPGARelated.com
Forums

PHB FPGA question

Started by Unknown December 27, 2020
OK, pointy-haired boss question.

Given a ZYNQ 7020, speed grade 1. A 3.3 volt i/o bank gets a clock
from an LVDS input. We have a resync flop in an i/o cell, clocked by
this, with a D input from somewhere. Output is the strongest/fastest
3.3 volt option.

About what would be the typical prop delay from the clock to the
output pin?

Online search yields a lot of words and no numbers. Experts say useful
things like "it depends." The tools apparently give a range of timings
over worst-case supply voltage, process, and temperature that vary by
about 4:1 with no typical.

Second question: has anyone ever pushed an FPGA core voltage up to get
more speed? In one little test I did, on an Artix 7, a simple case
changed chip prop delay by 1 ns, from about 8.5 to about 7.5 ns, with
a 70 mV core supply increase. That delay was essentially all
combinational.
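
For scale, the arithmetic behind that measurement can be worked through in a few lines. This is only a sketch: the function name `supply_sensitivity` is made up for illustration, and the result is a single-point slope from one test, not a characterized figure for any part.

```python
def supply_sensitivity(delay_before_ns, delay_after_ns, delta_v_mv):
    """Summarize how far a measured delay moved for a given supply change.

    Hypothetical helper; inputs are the single Artix 7 measurement quoted
    above (8.5 ns -> 7.5 ns for +70 mV on the core supply).
    """
    delta_ns = delay_before_ns - delay_after_ns
    return {
        "delta_ns": delta_ns,
        "speedup_fraction": delta_ns / delay_before_ns,
        "ps_per_mv": delta_ns * 1000.0 / delta_v_mv,
    }


result = supply_sensitivity(8.5, 7.5, 70.0)
print(
    f"{result['delta_ns']:.1f} ns faster, "
    f"{100 * result['speedup_fraction']:.1f} % of the original delay, "
    f"about {result['ps_per_mv']:.1f} ps per mV"
)
```

That works out to roughly a 12 % reduction, or about 14 ps per mV, assuming the delay is linear in voltage over that small step.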

-- 

John Larkin      Highland Technology, Inc

The best designs are necessarily accidental.

On 27.12.20 at 21:50, jlarkin@highlandsniptechnology.com wrote:
> OK, pointy-haired boss question.
>
> Given a ZYNQ 7020, speed grade 1. A 3.3 volt i/o bank gets a clock
> from an LVDS input. We have a resync flop in an i/o cell, clocked by
> this, with a D input from somewhere. Output is the strongest/fastest
> 3.3 volt option.

Maybe LVDS is not the way to go. The LV means they want it somewhat fast, but power still does matter. The game is different with LVPECL or CML.

> About what would be the typical prop delay from the clock to the
> output pin?
>
> Online search yields a lot of words and no numbers. Experts say useful
> things like "it depends." The tools apparently give a range of timings
> over worst-case supply voltage, process, and temperature that vary by
> about 4:1 with no typical.

Because it really depends. The clock could be routed in a thousand different ways on the chip. It is usually best to use one of the typically 4 global clock nets, but there may be local ones that are placed nicely to your outputs. And avoid tri-state buffers. They are sloooow, just from the logic they contain.

I have not used Zynqs, but mostly Virtexes.

The canonical way to get short clock-to-out delays would be to use a global clock net without any logic in front of it and then write a constraints file with the specs you need. Leave the work to the router. Don't be too greedy in the first round; you can put on the thumbscrews later. First make sure that what you spec is what you want. Sometimes it is not intuitive to specify that.

You can see in the static timing verifier where the ps are lost and where work on improvements is futile.

> Second question: has anyone ever pushed an FPGA core voltage up to get
> more speed? In one little test I did, on an Artix 7, a simple case
> changed chip prop delay by 1 ns, from about 8.5 to about 7.5 ns, with
> a 70 mV core supply increase. That delay was essentially all
> combinational.

Never tried that. But I have built pipelines 24 stages deep. Do less combinatorial stuff in one stage and start early/parallel enough.

I think that if that were possible in a reliable way, Xilinx would spec it that way and ask for more money.

Gerhard

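
The pipelining trade-off can be put into rough numbers. The sketch below assumes the combinational delay splits evenly across stages and that each register adds a fixed overhead; the 0.5 ns overhead and both function names are placeholders for illustration, not datasheet values.

```python
def stage_period_ns(total_comb_ns, stages, reg_overhead_ns):
    """Minimum clock period when the logic is split evenly into `stages`.

    Assumes an ideal even split, which real designs rarely achieve.
    """
    if stages < 1:
        raise ValueError("stages must be at least 1")
    return total_comb_ns / stages + reg_overhead_ns


def total_latency_ns(total_comb_ns, stages, reg_overhead_ns):
    """Time for one input to travel through every stage of the pipeline."""
    return stages * stage_period_ns(total_comb_ns, stages, reg_overhead_ns)


# 8.5 ns is the combinational delay mentioned in the thread;
# 0.5 ns of register overhead per stage is an assumed placeholder.
for n in (1, 2, 4):
    period = stage_period_ns(8.5, n, 0.5)
    latency = total_latency_ns(8.5, n, 0.5)
    print(f"{n} stage(s): period {period:.3f} ns, latency {latency:.3f} ns")
```

More stages shorten the clock period but lengthen the total latency, so pipelining raises throughput without making the first output edge arrive any sooner.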
On Sun, 27 Dec 2020 23:12:15 +0100, Gerhard Hoffmann <dk4xp@arcor.de>
wrote:

>On 27.12.20 at 21:50, jlarkin@highlandsniptechnology.com wrote:
>> OK, pointy-haired boss question.
>>
>> Given a ZYNQ 7020, speed grade 1. A 3.3 volt i/o bank gets a clock
>> from an LVDS input. We have a resync flop in an i/o cell, clocked by
>> this, with a D input from somewhere. Output is the strongest/fastest
>> 3.3 volt option.
>
>Maybe LVDS is not the way to go. The LV means they want it somewhat
>fast, but power still does matter. The game is different with LVPECL
>or CML.

The clock into the FPGA is from a differential PECL comparator, so using an LVDS input makes sense. The output will be 3.3 V CMOS. Rumor has it that, in general, LVDS I/O is about a ns faster than CMOS.

>> About what would be the typical prop delay from the clock to the
>> output pin?
>>
>> Online search yields a lot of words and no numbers. Experts say useful
>> things like "it depends." The tools apparently give a range of timings
>> over worst-case supply voltage, process, and temperature that vary by
>> about 4:1 with no typical.
>
>Because it really depends.

That's the standard answer: it depends. There are a zillion appnotes and class notes, none of which include the word "nanosecond."

>The clock could be routed in a thousand
>different ways on the chip. It is usually best to use one of the
>typically 4 global clock nets, but there may be local ones that are
>placed nicely to your outputs.
>And avoid tri-state buffers. They are sloooow, just from the logic
>they contain.
>
>I have not used Zynqs, but mostly Virtexes.
>
>The canonical way to get short clock-to-out delays would be to use
>a global clock net without any logic in front of it and then write
>a constraints file with the specs you need. Leave the work to the
>router.
>Don't be too greedy in the first round; you can put on the
>thumbscrews later. First make sure that what you spec is
>what you want. Sometimes it is not intuitive to specify that.
>
>You can see in the static timing verifier where the ps are lost
>and where work on improvements is futile.
>
>> Second question: has anyone ever pushed an FPGA core voltage up to get
>> more speed? In one little test I did, on an Artix 7, a simple case
>> changed chip prop delay by 1 ns, from about 8.5 to about 7.5 ns, with
>> a 70 mV core supply increase. That delay was essentially all
>> combinational.
>
>Never tried that. But I have built pipelines 24 stages deep.
>Do less combinatorial stuff in one stage and start early/parallel
>enough.

I want my output to transition immediately after the first clock edge. Or maybe before.

On Sunday, December 27, 2020 at 1:51:05 PM UTC-7, jla...@highlandsniptechnology.com wrote:
> OK, pointy-haired boss question.
>
> Given a ZYNQ 7020, speed grade 1. A 3.3 volt i/o bank gets a clock
> from an LVDS input. We have a resync flop in an i/o cell, clocked by
> this, with a D input from somewhere. Output is the strongest/fastest
> 3.3 volt option.
>
> About what would be the typical prop delay from the clock to the
> output pin?
>
> Online search yields a lot of words and no numbers. Experts say useful
> things like "it depends." The tools apparently give a range of timings
> over worst-case supply voltage, process, and temperature that vary by
> about 4:1 with no typical.
>
> Second question: has anyone ever pushed an FPGA core voltage up to get
> more speed? In one little test I did, on an Artix 7, a simple case
> changed chip prop delay by 1 ns, from about 8.5 to about 7.5 ns, with
> a 70 mV core supply increase. That delay was essentially all
> combinational.

I'm not exactly sure what your goal is, but if you want to subtract out the clock routing delay, use an MMCM so that the clock to the flipflop will have nearly the same phase as the clock at the input. You can also make sure that the flipflop is an output flop packed in the IOB so that the flop output -> pin delay will be short and more deterministic. You can use a directive in the HDL to ensure the flop is in an IOB. You can also set a constraint for the max delay out; otherwise the tools assume they have an entire clock period to get the signal from the flop to the pin. If the input is really asynchronous, you really ought to use a 2-flop synchronizer.

Some of the Xilinx parts have a separate column in the datasheet for a lower core voltage, which saves power but degrades timing. I definitely wouldn't try increasing the voltage to beyond the spec. It might work, for a while... The best way to get better speed would probably be to ensure that the junction temperature stays low.
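
To make the MMCM point concrete, here is a rough clock-to-out budget as a sketch. Every delay value below is a made-up placeholder, not a Zynq 7020 number, and the class and field names are hypothetical; real figures have to come from the timing report for the actual placed design.

```python
from dataclasses import dataclass


@dataclass
class ClockToOutBudget:
    """Pin-to-pin clock-to-out budget with placeholder delays in ns."""

    input_buffer_ns: float       # clock pin through the input buffer
    clock_network_ns: float      # clock buffer and routing to the IOB flop
    flop_to_pin_ns: float        # IOB flop clock-to-Q plus output buffer
    mmcm_residual_ns: float = 0.0  # phase error left after deskew

    def without_mmcm(self) -> float:
        # The clock insertion delay adds directly to the output delay.
        return self.input_buffer_ns + self.clock_network_ns + self.flop_to_pin_ns

    def with_mmcm(self) -> float:
        # Assumes the MMCM aligns the flop clock with the clock at the pin,
        # leaving only a small residual plus the flop-to-pin path.
        return self.mmcm_residual_ns + self.flop_to_pin_ns


# Placeholder numbers chosen only to show the shape of the calculation.
budget = ClockToOutBudget(
    input_buffer_ns=1.0,
    clock_network_ns=2.5,
    flop_to_pin_ns=3.0,
    mmcm_residual_ns=0.25,
)
print(f"without MMCM: {budget.without_mmcm():.2f} ns")
print(f"with MMCM:    {budget.with_mmcm():.2f} ns")
```

The flop-to-pin term is what packing the flop into the IOB keeps short and repeatable; the MMCM only addresses the clock insertion terms.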