FPGA as heater

Started by John Larkin April 10, 2017
On 4/12/2017 4:20 PM, John Larkin wrote:
> On Wed, 12 Apr 2017 12:37:59 -0700 (PDT), Kevin Neilson > <kevin.neilson@xilinx.com> wrote: > >>>> If you really need to control output delay you can use the IODELAY block, possibly along with a copper trace feedback line. >>> >>> >>> Our output data-valid window is predicted by the tools to be very >>> narrow relative to the clock period. We figure that controlling the >>> temperature (and maybe controlling Vcc-core vs temperature) will open >>> up the timing window. The final analysis will have to be experimental. >>> >>> We can't crank in a constant delay to fix anything; the problem is the >>> predicted variation in delay. >>> >> >> I still think the IODELAY could help you. The output goes through an adjustable IODELAY, then you route the output back in through a pin, adjust the input IODELAY to figure out where the incoming edge is, and then use a feedback loop to keep the output delay constant. It's a technique used for deskewing DRAM data. I think the main clock would also have to be deskewed with a BUFG so you have a good reference for the input. Or, if you characterized the delay-vs-temp in the lab, you could run in open-loop mode by adjusting the IODELAY tap based on the temperature you read. >> >> Yes, the tools are definitely pessimistic. They're only useful for worst-case. I'm pretty sure you can put in the max temperature when doing PAR, so you could isolate the effects of just that, but it will still probably be worse variation than in reality. > > My FPGA guy says that the ZYNQ does not have adjustable delay after > the i/o block flops. We can vary drive strength in four steps, and we > may be able to do something with that.
That's also not adjustable in real time though. I believe what the others are talking about is a real time adjustable delay that is built into the clocking module. I don't know about the Zynq, but Xilinx has what they call a delay locked loop which sounds exactly like what you need. I believe it works by syncing the output signal to the clock signal. There will be some signal path in the feedback loop which will still cause timing variation with temperature and I suppose voltage, but the variation in process can be compensated. -- Rick C
On Wednesday, 4/12/2017 4:27 PM, rickman wrote:
> On 4/12/2017 4:20 PM, John Larkin wrote: >> On Wed, 12 Apr 2017 12:37:59 -0700 (PDT), Kevin Neilson >> <kevin.neilson@xilinx.com> wrote: >> >>>>> If you really need to control output delay you can use the IODELAY >>>>> block, possibly along with a copper trace feedback line. >>>> >>>> >>>> Our output data-valid window is predicted by the tools to be very >>>> narrow relative to the clock period. We figure that controlling the >>>> temperature (and maybe controlling Vcc-core vs temperature) will open >>>> up the timing window. The final analysis will have to be experimental. >>>> >>>> We can't crank in a constant delay to fix anything; the problem is the >>>> predicted variation in delay. >>>> >>> >>> I still think the IODELAY could help you. The output goes through an >>> adjustable IODELAY, then you route the output back in through a pin, >>> adjust the input IODELAY to figure out where the incoming edge is, >>> and then use a feedback loop to keep the output delay constant. It's >>> a technique used for deskewing DRAM data. I think the main clock >>> would also have to be deskewed with a BUFG so you have a good >>> reference for the input. Or, if you characterized the delay-vs-temp >>> in the lab, you could run in open-loop mode by adjusting the IODELAY >>> tap based on the temperature you read. >>> >>> Yes, the tools are definitely pessimistic. They're only useful for >>> worst-case. I'm pretty sure you can put in the max temperature when >>> doing PAR, so you could isolate the effects of just that, but it will >>> still probably be worse variation than in reality. >> >> My FPGA guy says that the ZYNQ does not have adjustable delay after >> the i/o block flops. We can vary drive strength in four steps, and we >> may be able to do something with that. > > That's also not adjustable in real time though. > > I believe what the others are talking about is a real time adjustable > delay that is built into the clocking module. 
I don't know about the > Zynq, but Xilinx has what they call a delay locked loop which sounds > exactly like what you need. I believe it works by syncing the output > signal to the clock signal. There will be some signal path in the > feedback loop which will still cause timing variation with temperature > and I suppose voltage, but the variation in process can be compensated. >
In the 7-series what you want is the MMCM, which has the ability to adjust the output phase in steps of 1/56 of the VCO period. This adjustment can be applied to a subset of the MMCM outputs, so you can for example vary the outgoing clock phase while keeping the data phase constant with respect to the clock driving the MMCM.

On the other hand, the whole point of a source synchronous interface is to just need low skew between outputs - not low skew between the input clock and the outputs. Typically just placing the outputs in the IOB and using the same clock resource is good enough. Skew between outputs is much lower than the variance in output delay.

-- Gabor
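To put a number on the "1/56 of the VCO period" step, here is a small arithmetic sketch. The VCO frequencies are example values chosen for illustration only, not settings from this design, and the function names are made up for this sketch.

```python
def mmcm_phase_step_ps(vco_mhz: float) -> float:
    """Fine phase-shift step in picoseconds, taken as 1/56 of the VCO period."""
    vco_period_ps = 1e6 / vco_mhz  # 1 MHz corresponds to a 1e6 ps period
    return vco_period_ps / 56.0


def steps_for_shift(shift_ps: float, vco_mhz: float) -> int:
    """Nearest whole number of fine steps for a requested phase shift."""
    return round(shift_ps / mmcm_phase_step_ps(vco_mhz))


if __name__ == "__main__":
    # Example VCO frequencies only (assumptions, not taken from the thread).
    for vco in (600.0, 1000.0, 1200.0):
        step = mmcm_phase_step_ps(vco)
        print(f"VCO {vco:6.0f} MHz: step = {step:5.2f} ps")
```

At a 1000 MHz VCO the step works out to roughly 18 ps, which is fine compared with the nanosecond-scale windows being discussed in this thread.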
On 4/12/2017 5:16 PM, Gabor wrote:
> On Wednesday, 4/12/2017 4:27 PM, rickman wrote: >> On 4/12/2017 4:20 PM, John Larkin wrote: >>> On Wed, 12 Apr 2017 12:37:59 -0700 (PDT), Kevin Neilson >>> <kevin.neilson@xilinx.com> wrote: >>> >>>>>> If you really need to control output delay you can use the IODELAY >>>>>> block, possibly along with a copper trace feedback line. >>>>> >>>>> >>>>> Our output data-valid window is predicted by the tools to be very >>>>> narrow relative to the clock period. We figure that controlling the >>>>> temperature (and maybe controlling Vcc-core vs temperature) will open >>>>> up the timing window. The final analysis will have to be experimental. >>>>> >>>>> We can't crank in a constant delay to fix anything; the problem is the >>>>> predicted variation in delay. >>>>> >>>> >>>> I still think the IODELAY could help you. The output goes through >>>> an adjustable IODELAY, then you route the output back in through a >>>> pin, adjust the input IODELAY to figure out where the incoming edge >>>> is, and then use a feedback loop to keep the output delay constant. >>>> It's a technique used for deskewing DRAM data. I think the main >>>> clock would also have to be deskewed with a BUFG so you have a good >>>> reference for the input. Or, if you characterized the delay-vs-temp >>>> in the lab, you could run in open-loop mode by adjusting the IODELAY >>>> tap based on the temperature you read. >>>> >>>> Yes, the tools are definitely pessimistic. They're only useful for >>>> worst-case. I'm pretty sure you can put in the max temperature when >>>> doing PAR, so you could isolate the effects of just that, but it >>>> will still probably be worse variation than in reality. >>> >>> My FPGA guy says that the ZYNQ does not have adjustable delay after >>> the i/o block flops. We can vary drive strength in four steps, and we >>> may be able to do something with that. >> >> That's also not adjustable in real time though. 
>> >> I believe what the others are talking about is a real time adjustable >> delay that is built into the clocking module. I don't know about the >> Zynq, but Xilinx has what they call a delay locked loop which sounds >> exactly like what you need. I believe it works by syncing the output >> signal to the clock signal. There will be some signal path in the >> feedback loop which will still cause timing variation with temperature >> and I suppose voltage, but the variation in process can be compensated. >> > > In the 7-series what you want is the MMCM, which has the ability to > adjust the output phase in steps of 1/56 of the VCO period. This > adjustment can be applied to a subset of the MMCM outputs, so you > can for example vary the outgoing clock phase while keeping the > data phase constant with respect to the clock driving the MMCM. > > On the other hand, the whole point of a source synchronous interface > is to just need low skew between outputs - not low skew between the > input clock and the outputs. Typically just placing the outputs in > the IOB and using the same clock resource is good enough. Skew > between outputs is much lower than the variance in output delay.
Yeah, well, it's not like we really know the true and full problem. We just know he doesn't like the timing range reported by the tools. -- Rick C
Den onsdag den 12. april 2017 kl. 22.20.19 UTC+2 skrev John Larkin:
> On Wed, 12 Apr 2017 12:37:59 -0700 (PDT), Kevin Neilson > <kevin.neilson@xilinx.com> wrote: > > >> > If you really need to control output delay you can use the IODELAY block, possibly along with a copper trace feedback line. > >> > >> > >> Our output data-valid window is predicted by the tools to be very > >> narrow relative to the clock period. We figure that controlling the > >> temperature (and maybe controlling Vcc-core vs temperature) will open > >> up the timing window. The final analysis will have to be experimental. > >> > >> We can't crank in a constant delay to fix anything; the problem is the > >> predicted variation in delay. > >> > > > >I still think the IODELAY could help you. The output goes through an adjustable IODELAY, then you route the output back in through a pin, adjust the input IODELAY to figure out where the incoming edge is, and then use a feedback loop to keep the output delay constant. It's a technique used for deskewing DRAM data. I think the main clock would also have to be deskewed with a BUFG so you have a good reference for the input. Or, if you characterized the delay-vs-temp in the lab, you could run in open-loop mode by adjusting the IODELAY tap based on the temperature you read. > > > >Yes, the tools are definitely pessimistic. They're only useful for worst-case. I'm pretty sure you can put in the max temperature when doing PAR, so you could isolate the effects of just that, but it will still probably be worse variation than in reality. > > My FPGA guy says that the ZYNQ does not have adjustable delay after > the i/o block flops. We can vary drive strength in four steps, and we > may be able to do something with that.
You are right, the 7010 and 7020 only have high-range I/O, so no ODELAY.

Are you just trying to keep a fixed alignment between clock and data output?

You can do tricks with DDR output flops: data out through a DDR flop with both inputs tied to the data, clock out through a DDR flop with 0,1 as inputs.

-Lasse
> My FPGA guy says that the ZYNQ does not have adjustable delay after > the i/o block flops. We can vary drive strength in four steps, and we > may be able to do something with that. >
Hmm. I've used a real-time-adjustable ODELAY block, but that wasn't in a Zynq. If you can add more hardware to the board, you could re-register the data in some external 74LS flops. You could use unregistered outputs and make your own delay line with a carry chain, which you can create with behavioral code.
On Wed, 12 Apr 2017 15:22:25 -0700 (PDT), Kevin Neilson
<kevin.neilson@xilinx.com> wrote:

>> My FPGA guy says that the ZYNQ does not have adjustable delay after >> the i/o block flops. We can vary drive strength in four steps, and we >> may be able to do something with that. >> > >Hmm. I've used a real-time-adjustable ODELAY block, but that wasn't in a Zynq. > >If you can add more hardware to the board, you could re-register the data in some external 74LS flops.
We are exactly trying to drive external flops, some 1 ns CMOS parts. They are clocked by the same clock that is going into the ZYNQ, and the FPGA needs to set up their D inputs reliably. We can't use a PLL or DLL inside the FPGA. So the problem is that the Xilinx tools are reporting a huge (almost 3:1) spread in possible prop delay from our applied clock to the iob outputs. The tools apparently assume the max process+temperature+power supply limits, without letting us constrain these, and without assigning any specific blame.
> >You could use unregistered outputs and make your own delay line with a carry chain, which you can create with behavioral code.
I think that has even higher uncertainty, probably more than a full clock period, so we couldn't reliably load those external flops. -- John Larkin Highland Technology, Inc picosecond timing precision measurement jlarkin att highlandtechnology dott com http://www.highlandtechnology.com
On 4/12/2017 7:21 PM, John Larkin wrote:
> On Wed, 12 Apr 2017 15:22:25 -0700 (PDT), Kevin Neilson > <kevin.neilson@xilinx.com> wrote: > >>> My FPGA guy says that the ZYNQ does not have adjustable delay after >>> the i/o block flops. We can vary drive strength in four steps, and we >>> may be able to do something with that. >>> >> >> Hmm. I've used a real-time-adjustable ODELAY block, but that wasn't in a Zynq. >> >> If you can add more hardware to the board, you could re-register the data in some external 74LS flops. > > We are exactly trying to drive external flops, some 1 ns CMOS parts. > They are clocked by the same clock that is going into the ZYNQ, and > the FPGA needs to set up their D inputs reliably. We can't use a PLL > or DLL inside the FPGA. > > So the problem is that the Xilinx tools are reporting a huge (almost > 3:1) spread in possible prop delay from our applied clock to the iob > outputs. The tools apparently assume the max process+temperature+power > supply limits, without letting us constrain these, and without > assigning any specific blame. > > >> >> You could use unregistered outputs and make your own delay line with a carry chain, which you can create with behavioral code. > > I think that has even higher uncertainty, probably more than a full > clock period, so we couldn't reliably load those external flops.
The way you have constrained the design I think you will need to design your own chip. I would say you need to find a way to relax one of your many constraints. Not using the PLL/DLL is a real killer. That would be a good one to fix.

I haven't used the Xilinx tools in a long time, but I seem to recall there was a way to work with a single temperature. It may have been the hot number or the cold number, but not an arbitrary value in between. But that may have been the post layout simulation timing. Simulation is not a great way to verify timing in general, but it could be made to work for your case.

I'd say get a Xilinx FAE involved.

-- Rick C
> We are exactly trying to drive external flops, some 1 ns CMOS parts. > They are clocked by the same clock that is going into the ZYNQ, and > the FPGA needs to set up their D inputs reliably. We can't use a PLL > or DLL inside the FPGA. > > So the problem is that the Xilinx tools are reporting a huge (almost > 3:1) spread in possible prop delay from our applied clock to the iob > outputs. The tools apparently assume the max process+temperature+power > supply limits, without letting us constrain these, and without > assigning any specific blame.
Like Lasse said above, you can adjust the output delay with a half-cycle resolution using ODDRs. This sounds good enough for your application. I used that exact method once for a DRAM (single-data-rate) interface. (I think the training method was to write data to an unused location in DRAM with various phase relationships, read it back, and see which writes were successful.) Your issue sounds a lot like the same issues people have with DRAM. I don't think you'll see a 3:1 variation in reality.
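The training idea described here (try each phase setting, note which ones read back correctly, then pick one) can be sketched as below. This is only an illustration: `best_setting` is a made-up name, the pass/fail list stands in for real write/read-back results, and wrap-around between the last and first setting is ignored.

```python
def best_setting(results):
    """Return the index in the middle of the longest run of passing settings.

    results: sequence of booleans, one per phase setting, True if the
    write/read-back test passed at that setting. Returns None if nothing passed.
    """
    best_start, best_len = None, 0
    run_start = None
    # A trailing False closes any run that reaches the end of the list.
    for i, ok in enumerate(list(results) + [False]):
        if ok and run_start is None:
            run_start = i
        elif not ok and run_start is not None:
            run_len = i - run_start
            if run_len > best_len:
                best_start, best_len = run_start, run_len
            run_start = None
    if best_start is None:
        return None
    # Centre of the run (lower of the two middle settings for even lengths).
    return best_start + (best_len - 1) // 2


if __name__ == "__main__":
    # Hypothetical sweep result: settings 1..3 pass, 5 passes in isolation.
    sweep = [False, True, True, True, False, True]
    print("chosen setting:", best_setting(sweep))
```

With only half-cycle resolution there are just two settings to try, so the "sweep" degenerates to picking whichever of the two passes.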
On Thu, 13 Apr 2017 15:22:52 -0700 (PDT), Kevin Neilson
<kevin.neilson@xilinx.com> wrote:

> >> We are exactly trying to drive external flops, some 1 ns CMOS parts. >> They are clocked by the same clock that is going into the ZYNQ, and >> the FPGA needs to set up their D inputs reliably. We can't use a PLL >> or DLL inside the FPGA. >> >> So the problem is that the Xilinx tools are reporting a huge (almost >> 3:1) spread in possible prop delay from our applied clock to the iob >> outputs. The tools apparently assume the max process+temperature+power >> supply limits, without letting us constrain these, and without >> assigning any specific blame. > >Like Lasse said above, you can adjust the output delay with a half-cycle resolution using ODDRs.
I can declare the differential-input clock polarity either way, which would shift things 3.5 ns (out of a 7 ns clock.) But the guaranteed data-valid window is less than 2 ns.
> This sounds good enough for your application. I used that exact method once for a DRAM (single-data-rate) interface. (I think the training method was to write data to an unused location in DRAM with various phase relationships, read it back, and see which writes were successful.) Your issue sounds a lot like the same issues people have with DRAM. I don't think you'll see a 3:1 variation in reality.
I sure hope so. -- John Larkin Highland Technology, Inc lunatic fringe electronics
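The figures in this exchange can be put into a rough budget. The sketch below uses the 7 ns clock quoted above and purely hypothetical clock-to-out limits of 2.6 ns and 7.7 ns, chosen only to mimic an "almost 3:1" spread; they are not from the Xilinx tools or any data sheet. Board delays are ignored, and the external flop's setup and hold times default to zero.

```python
import math


def data_valid_window_ns(period, tco_min, tco_max):
    """Width of the interval in each cycle where the output is guaranteed stable."""
    return period - (tco_max - tco_min)


def capture_margins_ns(period, tco_min, tco_max,
                       launch_offset=0.0, t_su=0.0, t_h=0.0):
    """Setup and hold margin at an external flop clocked at t = n * period.

    launch_offset models launching the data later, e.g. half a period when
    the FPGA clock input is declared with inverted polarity.
    Both returned values must be >= 0 for reliable capture.
    """
    stable_from = launch_offset + tco_max            # latest arrival of new data
    stable_until = launch_offset + tco_min + period  # earliest arrival of next data
    # First external clock edge at or after the start of the stable interval.
    capture_edge = math.ceil(stable_from / period) * period
    setup_margin = capture_edge - stable_from - t_su
    hold_margin = stable_until - capture_edge - t_h
    return setup_margin, hold_margin


if __name__ == "__main__":
    PERIOD = 7.0                 # ns, from the thread
    TCO_MIN, TCO_MAX = 2.6, 7.7  # ns, hypothetical values
    print("window:", round(data_valid_window_ns(PERIOD, TCO_MIN, TCO_MAX), 2), "ns")
    for offset in (0.0, PERIOD / 2):
        su, h = capture_margins_ns(PERIOD, TCO_MIN, TCO_MAX, launch_offset=offset)
        print(f"offset {offset} ns: setup {su:+.2f} ns, hold {h:+.2f} ns")
```

With these made-up limits the guaranteed window is 1.9 ns, and neither clock polarity puts an external clock edge inside it: a 3.5 ns adjustment step is coarser than the window it has to hit. If the real spread is narrower than the tools predict, the same arithmetic shows the margins opening up.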
Den fredag den 14. april 2017 kl. 06.12.52 UTC+2 skrev John Larkin:
> On Thu, 13 Apr 2017 15:22:52 -0700 (PDT), Kevin Neilson > <kevin.neilson@xilinx.com> wrote: > > > > >> We are exactly trying to drive external flops, some 1 ns CMOS parts. > >> They are clocked by the same clock that is going into the ZYNQ, and > >> the FPGA needs to set up their D inputs reliably. We can't use a PLL > >> or DLL inside the FPGA. > >> > >> So the problem is that the Xilinx tools are reporting a huge (almost > >> 3:1) spread in possible prop delay from our applied clock to the iob > >> outputs. The tools apparently assume the max process+temperature+power > >> supply limits, without letting us constrain these, and without > >> assigning any specific blame. > > > >Like Lasse said above, you can adjust the output delay with a half-cycle resolution using ODDRs. > > I can declare the differential-input clock polarity either way, which > would shift things 3.5 ns (out of a 7 ns clock.) But the guaranteed > data-valid window is less than 2 ns. >
The point of using DDR was not to shift the clock but to keep the clock and data aligned. "Regenerating" the clock with a DDR flop means the clock and data get treated the same, and both have the same DDR-to-IOB path, so they should track.

Getting the output clock aligned with the input clock (if needed) might be possible using the "zero-delay-buffer" mode of the MMCM.
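A toy model of the DDR-flop trick may make the tracking argument concrete. It is an idealised sketch, not a description of the actual ODDR primitive: each clock period is split into two half-period slots, and the model assumes the first listed input drives the first slot and the second input drives the second slot. Swapping the two constants would simply invert the forwarded clock.

```python
def oddr_waveform(d1_seq, d2_seq):
    """Idealised DDR output flop: one (d1, d2) pair per clock period,
    d1 shown in the first half-period slot and d2 in the second."""
    out = []
    for d1, d2 in zip(d1_seq, d2_seq):
        out.extend([d1, d2])
    return out


def rising_edges(wave):
    """Slot indices at which the waveform goes from 0 to 1."""
    return [i for i in range(1, len(wave)) if wave[i - 1] == 0 and wave[i] == 1]


data_bits = [1, 0, 0, 1]
n = len(data_bits)

# Data: both DDR inputs carry the same bit, so it is held for the whole period.
data_out = oddr_waveform(data_bits, data_bits)
# Clock: constants 0 and 1 on the two inputs regenerate a clock in the same kind of flop.
clk_out = oddr_waveform([0] * n, [1] * n)

edges = rising_edges(clk_out)
sampled = [data_out[i] for i in edges]

if __name__ == "__main__":
    print("data_out:", data_out)
    print("clk_out: ", clk_out)
    print("sampled on forwarded clock rising edges:", sampled)
```

Because both signals leave through the same kind of flop and output buffer, any common delay moves the data transitions and the forwarded clock edges together, so their relative position stays put. Note that this only helps a receiver clocked by the forwarded clock; it does nothing by itself for flops clocked directly from the original source clock.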