comp.arch.fpga | FPGA as heater | page 4

Reply by rickman ●April 12, 2017

On 4/12/2017 4:20 PM, John Larkin wrote:
> On Wed, 12 Apr 2017 12:37:59 -0700 (PDT), Kevin Neilson
> <kevin.neilson@xilinx.com> wrote:
>
>>>> If you really need to control output delay you can use the IODELAY block, possibly along with a copper trace feedback line.
>>>
>>>
>>> Our output data-valid window is predicted by the tools to be very
>>> narrow relative to the clock period. We figure that controlling the
>>> temperature (and maybe controlling Vcc-core vs temperature) will open
>>> up the timing window. The final analysis will have to be experimental.
>>>
>>> We can't crank in a constant delay to fix anything; the problem is the
>>> predicted variation in delay.
>>>
>>
>> I still think the IODELAY could help you.  The output goes through an adjustable IODELAY, then you route the output back in through a pin, adjust the input IODELAY to figure out where the incoming edge is, and then use a feedback loop to keep the output delay constant.  It's a technique used for deskewing DRAM data.  I think the main clock would also have to be deskewed with a BUFG so you have a good reference for the input.  Or, if you characterized the delay-vs-temp in the lab, you could run in open-loop mode by adjusting the IODELAY tap based on the temperature you read.
>>
>> Yes, the tools are definitely pessimistic.  They're only useful for worst-case.  I'm pretty sure you can put in the max temperature when doing PAR, so you could isolate the effects of just that, but it will still probably be worse variation than in reality.
>
> My FPGA guy says that the ZYNQ does not have adjustable delay after
> the i/o block flops. We can vary drive strength in four steps, and we
> may be able to do something with that.

That's also not adjustable in real time though.

I believe what the others are talking about is a real time adjustable 
delay that is built into the clocking module.  I don't know about the 
Zynq, but Xilinx has what they call a delay locked loop which sounds 
exactly like what you need.  I believe it works by syncing the output 
signal to the clock signal.  There will be some signal path in the 
feedback loop which will still cause timing variation with temperature 
and I suppose voltage, but the variation in process can be compensated.

-- 

Rick C

Reply by Gabor ●April 12, 2017

On Wednesday, 4/12/2017 4:27 PM, rickman wrote:
> On 4/12/2017 4:20 PM, John Larkin wrote:
>> On Wed, 12 Apr 2017 12:37:59 -0700 (PDT), Kevin Neilson
>> <kevin.neilson@xilinx.com> wrote:
>>
>>>>> If you really need to control output delay you can use the IODELAY 
>>>>> block, possibly along with a copper trace feedback line.
>>>>
>>>>
>>>> Our output data-valid window is predicted by the tools to be very
>>>> narrow relative to the clock period. We figure that controlling the
>>>> temperature (and maybe controlling Vcc-core vs temperature) will open
>>>> up the timing window. The final analysis will have to be experimental.
>>>>
>>>> We can't crank in a constant delay to fix anything; the problem is the
>>>> predicted variation in delay.
>>>>
>>>
>>> I still think the IODELAY could help you.  The output goes through an 
>>> adjustable IODELAY, then you route the output back in through a pin, 
>>> adjust the input IODELAY to figure out where the incoming edge is, 
>>> and then use a feedback loop to keep the output delay constant.  It's 
>>> a technique used for deskewing DRAM data.  I think the main clock 
>>> would also have to be deskewed with a BUFG so you have a good 
>>> reference for the input.  Or, if you characterized the delay-vs-temp 
>>> in the lab, you could run in open-loop mode by adjusting the IODELAY 
>>> tap based on the temperature you read.
>>>
>>> Yes, the tools are definitely pessimistic.  They're only useful for 
>>> worst-case.  I'm pretty sure you can put in the max temperature when 
>>> doing PAR, so you could isolate the effects of just that, but it will 
>>> still probably be worse variation than in reality.
>>
>> My FPGA guy says that the ZYNQ does not have adjustable delay after
>> the i/o block flops. We can vary drive strength in four steps, and we
>> may be able to do something with that.
> 
> That's also not adjustable in real time though.
> 
> I believe what the others are talking about is a real time adjustable 
> delay that is built into the clocking module.  I don't know about the 
> Zynq, but Xilinx has what they call a delay locked loop which sounds 
> exactly like what you need.  I believe it works by syncing the output 
> signal to the clock signal.  There will be some signal path in the 
> feedback loop which will still cause timing variation with temperature 
> and I suppose voltage, but the variation in process can be compensated.
> 

In the 7-series what you want is the MMCM, which has the ability to
adjust the output phase in steps of 1/56 of the VCO period.  This
adjustment can be applied to a subset of the MMCM outputs, so you
can for example vary the outgoing clock phase while keeping the
data phase constant with respect to the clock driving the MMCM.
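
To put a rough number on that step size, here is a minimal sketch. The 1000 MHz VCO frequency is only an assumed example (no VCO frequency is given in this thread), and the function names are made up for illustration.

```python
# Sketch: size of one fine phase-shift step, taken as 1/56 of the VCO period
# (the figure quoted above). The VCO frequency used below is an assumption.

STEPS_PER_VCO_PERIOD = 56  # from the post


def phase_step_ps(vco_mhz):
    """Return one phase-shift step in picoseconds for a VCO frequency in MHz."""
    vco_period_ps = 1e6 / vco_mhz  # a 1 MHz clock has a 1e6 ps period
    return vco_period_ps / STEPS_PER_VCO_PERIOD


def steps_for_shift(shift_ps, vco_mhz):
    """Nearest whole number of steps that approximates a desired shift."""
    return round(shift_ps / phase_step_ps(vco_mhz))


if __name__ == "__main__":
    vco = 1000.0  # MHz, hypothetical
    print(f"step = {phase_step_ps(vco):.2f} ps")
    print(f"steps for 500 ps = {steps_for_shift(500.0, vco)}")
```

With the assumed 1000 MHz VCO, one step works out to a little under 18 ps, so the adjustment is fine compared with nanosecond-scale output delays.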

On the other hand, the whole point of a source synchronous interface
is to just need low skew between outputs - not low skew between the
input clock and the outputs.  Typically just placing the outputs in
the IOB and using the same clock resource is good enough.  Skew
between outputs is much lower than the variance in output delay.
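
A toy model of that last point, as a sketch only: it assumes both outputs share a single process/voltage/temperature scale factor and differ by a small fixed mismatch. Every delay value is a made-up placeholder, not a datasheet number.

```python
# Toy model: two outputs clocked from the same resource share a common PVT
# scale factor, so the absolute delay moves a lot while the skew barely moves.
# All numbers are assumed for illustration.

NOMINAL_CLK_OUT_NS = 4.00   # assumed nominal clock-to-pad delay, forwarded clock
NOMINAL_DATA_OUT_NS = 4.05  # assumed nominal clock-to-pad delay, data pin


def delays(pvt_scale):
    """Scale both nominal delays by the same PVT factor."""
    return NOMINAL_CLK_OUT_NS * pvt_scale, NOMINAL_DATA_OUT_NS * pvt_scale


def spread_and_skew(scales):
    """Return (absolute delay spread of the data pin, worst clock-to-data skew)."""
    data = [delays(s)[1] for s in scales]
    skews = [abs(delays(s)[1] - delays(s)[0]) for s in scales]
    return max(data) - min(data), max(skews)


if __name__ == "__main__":
    # 0.5x to 1.5x is an assumed 3:1 min-to-max ratio.
    spread, skew = spread_and_skew([0.5, 1.0, 1.5])
    print(f"absolute spread = {spread:.3f} ns, worst skew = {skew:.3f} ns")
```

In this model the data pin's absolute delay moves by several nanoseconds across the corners, while the clock-to-data skew stays under 0.1 ns, because the common factor cancels when one output is compared with the other.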

-- 
Gabor

Reply by rickman ●April 12, 2017

On 4/12/2017 5:16 PM, Gabor wrote:
> On Wednesday, 4/12/2017 4:27 PM, rickman wrote:
>> On 4/12/2017 4:20 PM, John Larkin wrote:
>>> On Wed, 12 Apr 2017 12:37:59 -0700 (PDT), Kevin Neilson
>>> <kevin.neilson@xilinx.com> wrote:
>>>
>>>>>> If you really need to control output delay you can use the IODELAY
>>>>>> block, possibly along with a copper trace feedback line.
>>>>>
>>>>>
>>>>> Our output data-valid window is predicted by the tools to be very
>>>>> narrow relative to the clock period. We figure that controlling the
>>>>> temperature (and maybe controlling Vcc-core vs temperature) will open
>>>>> up the timing window. The final analysis will have to be experimental.
>>>>>
>>>>> We can't crank in a constant delay to fix anything; the problem is the
>>>>> predicted variation in delay.
>>>>>
>>>>
>>>> I still think the IODELAY could help you.  The output goes through
>>>> an adjustable IODELAY, then you route the output back in through a
>>>> pin, adjust the input IODELAY to figure out where the incoming edge
>>>> is, and then use a feedback loop to keep the output delay constant.
>>>> It's a technique used for deskewing DRAM data.  I think the main
>>>> clock would also have to be deskewed with a BUFG so you have a good
>>>> reference for the input.  Or, if you characterized the delay-vs-temp
>>>> in the lab, you could run in open-loop mode by adjusting the IODELAY
>>>> tap based on the temperature you read.
>>>>
>>>> Yes, the tools are definitely pessimistic.  They're only useful for
>>>> worst-case.  I'm pretty sure you can put in the max temperature when
>>>> doing PAR, so you could isolate the effects of just that, but it
>>>> will still probably be worse variation than in reality.
>>>
>>> My FPGA guy says that the ZYNQ does not have adjustable delay after
>>> the i/o block flops. We can vary drive strength in four steps, and we
>>> may be able to do something with that.
>>
>> That's also not adjustable in real time though.
>>
>> I believe what the others are talking about is a real time adjustable
>> delay that is built into the clocking module.  I don't know about the
>> Zynq, but Xilinx has what they call a delay locked loop which sounds
>> exactly like what you need.  I believe it works by syncing the output
>> signal to the clock signal.  There will be some signal path in the
>> feedback loop which will still cause timing variation with temperature
>> and I suppose voltage, but the variation in process can be compensated.
>>
>
> In the 7-series what you want is the MMCM, which has the ability to
> adjust the output phase in steps of 1/56 of the VCO period.  This
> adjustment can be applied to a subset of the MMCM outputs, so you
> can for example vary the outgoing clock phase while keeping the
> data phase constant with respect to the clock driving the MMCM.
>
> On the other hand, the whole point of a source synchronous interface
> is to just need low skew between outputs - not low skew between the
> input clock and the outputs.  Typically just placing the outputs in
> the IOB and using the same clock resource is good enough.  Skew
> between outputs is much lower than the variance in output delay.

Yeah, well, it's not like we really know the true and full problem.  We 
just know he doesn't like the timing range reported by the tools.

-- 

Rick C

Reply by ●April 12, 2017

On Wednesday, April 12, 2017 at 22:20:19 UTC+2, John Larkin wrote:
> On Wed, 12 Apr 2017 12:37:59 -0700 (PDT), Kevin Neilson
> <kevin.neilson@xilinx.com> wrote:
> 
> >> > If you really need to control output delay you can use the IODELAY block, possibly along with a copper trace feedback line.
> >> 
> >> 
> >> Our output data-valid window is predicted by the tools to be very
> >> narrow relative to the clock period. We figure that controlling the
> >> temperature (and maybe controlling Vcc-core vs temperature) will open
> >> up the timing window. The final analysis will have to be experimental.
> >> 
> >> We can't crank in a constant delay to fix anything; the problem is the
> >> predicted variation in delay.
> >> 
> >
> >I still think the IODELAY could help you.  The output goes through an adjustable IODELAY, then you route the output back in through a pin, adjust the input IODELAY to figure out where the incoming edge is, and then use a feedback loop to keep the output delay constant.  It's a technique used for deskewing DRAM data.  I think the main clock would also have to be deskewed with a BUFG so you have a good reference for the input.  Or, if you characterized the delay-vs-temp in the lab, you could run in open-loop mode by adjusting the IODELAY tap based on the temperature you read. 
> >
> >Yes, the tools are definitely pessimistic.  They're only useful for worst-case.  I'm pretty sure you can put in the max temperature when doing PAR, so you could isolate the effects of just that, but it will still probably be worse variation than in reality.
> 
> My FPGA guy says that the ZYNQ does not have adjustable delay after
> the i/o block flops. We can vary drive strength in four steps, and we
> may be able to do something with that.

You are right: the 7010 and 7020 only have high-range I/O, so no ODELAY.

Are you just trying to keep a fixed alignment between the clock and data outputs?

You can do tricks with DDR output flops: send the data out through a DDR flop
with both inputs tied to the data, and the clock out through a DDR flop with 0 and 1 as inputs.
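
A behavioural sketch of that trick, under an assumed model of the output DDR register: it drives its first input after each rising edge of the internal clock and its second input after each falling edge. The function names are hypothetical and this is not vendor code.

```python
# Behavioural sketch of the DDR-output trick. Assumed model: the register
# drives d1 after each rising edge of the internal clock and d2 after each
# falling edge. Names and structure are illustrative only.


def oddr(d1_per_cycle, d2_per_cycle):
    """Return the pin value per half cycle: [rise0, fall0, rise1, fall1, ...]."""
    pin = []
    for d1, d2 in zip(d1_per_cycle, d2_per_cycle):
        pin.append(d1)  # value driven after the rising edge
        pin.append(d2)  # value driven after the falling edge
    return pin


def forward(data_bits, clk_d1=0, clk_d2=1):
    """Drive data and a regenerated clock through identical DDR registers."""
    n = len(data_bits)
    data_pin = oddr(data_bits, data_bits)         # both inputs = data
    clock_pin = oddr([clk_d1] * n, [clk_d2] * n)  # constants give a clock
    return data_pin, clock_pin


def sampled_on_rising(data_pin, clock_pin):
    """What a receiver latches on each 0 -> 1 transition of the forwarded clock."""
    return [data_pin[i] for i in range(1, len(clock_pin))
            if clock_pin[i - 1] == 0 and clock_pin[i] == 1]


if __name__ == "__main__":
    data_pin, clock_pin = forward([1, 0, 1, 1])
    print(data_pin)   # [1, 1, 0, 0, 1, 1, 1, 1]
    print(clock_pin)  # [0, 1, 0, 1, 0, 1, 0, 1]
    print(sampled_on_rising(data_pin, clock_pin))
```

With the constants 0 and 1, the forwarded clock rises half a cycle after the data changes, so the receiver samples mid-bit. Swapping the constants moves the forwarded clock by half a cycle. Since both signals leave through the same kind of register, their delays should move together.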

-Lasse

Reply by Kevin Neilson ●April 12, 2017

> My FPGA guy says that the ZYNQ does not have adjustable delay after
> the i/o block flops. We can vary drive strength in four steps, and we
> may be able to do something with that.
> 

Hmm.  I've used a real-time-adjustable ODELAY block, but that wasn't in a Zynq.

If you can add more hardware to the board, you could re-register the data in some external 74LS flops.

You could use unregistered outputs and make your own delay line with a carry chain, which you can create with behavioral code.

Reply by John Larkin ●April 12, 2017

On Wed, 12 Apr 2017 15:22:25 -0700 (PDT), Kevin Neilson
<kevin.neilson@xilinx.com> wrote:

>> My FPGA guy says that the ZYNQ does not have adjustable delay after
>> the i/o block flops. We can vary drive strength in four steps, and we
>> may be able to do something with that.
>> 
>
>Hmm.  I've used a real-time-adjustable ODELAY block, but that wasn't in a Zynq.
>
>If you can add more hardware to the board, you could re-register the data in some external 74LS flops.

We are exactly trying to drive external flops, some 1 ns CMOS parts.
They are clocked by the same clock that is going into the ZYNQ, and
the FPGA needs to set up their D inputs reliably. We can't use a PLL
or DLL inside the FPGA.

So the problem is that the Xilinx tools are reporting a huge (almost
3:1) spread in possible prop delay from our applied clock to the iob
outputs. The tools apparently assume the max process+temperature+power
supply limits, without letting us constrain these, and without
assigning any specific blame.

>
>You could use unregistered outputs and make your own delay line with a carry chain, which you can create with behavioral code.

I think that has even higher uncertainty, probably more than a full
clock period, so we couldn't reliably load those external flops.

-- 

John Larkin         Highland Technology, Inc
picosecond timing   precision measurement 

jlarkin att highlandtechnology dott com
http://www.highlandtechnology.com

Reply by rickman ●April 12, 2017

On 4/12/2017 7:21 PM, John Larkin wrote:
> On Wed, 12 Apr 2017 15:22:25 -0700 (PDT), Kevin Neilson
> <kevin.neilson@xilinx.com> wrote:
>
>>> My FPGA guy says that the ZYNQ does not have adjustable delay after
>>> the i/o block flops. We can vary drive strength in four steps, and we
>>> may be able to do something with that.
>>>
>>
>> Hmm.  I've used a real-time-adjustable ODELAY block, but that wasn't in a Zynq.
>>
>> If you can add more hardware to the board, you could re-register the data in some external 74LS flops.
>
> We are exactly trying to drive external flops, some 1 ns CMOS parts.
> They are clocked by the same clock that is going into the ZYNQ, and
> the FPGA needs to set up their D inputs reliably. We can't use a PLL
> or DLL inside the FPGA.
>
> So the problem is that the Xilinx tools are reporting a huge (almost
> 3:1) spread in possible prop delay from our applied clock to the iob
> outputs. The tools apparently assume the max process+temperature+power
> supply limits, without letting us constrain these, and without
> assigning any specific blame.
>
>
>>
>> You could use unregistered outputs and make your own delay line with a carry chain, which you can create with behavioral code.
>
> I think that has even higher uncertainty, probably more than a full
> clock period, so we couldn't reliably load those external flops.

The way you have constrained the design I think you will need to design 
your own chip.  I would say you need to find a way to relax one of your 
many constraints.  Not using the PLL/DLL is a real killer.  That would 
be a good one to fix.

I haven't used the Xilinx tools in a long time, but I seem to recall 
there was a way to work with a single temperature.  It may have been the 
hot number or the cold number, but not an arbitrary value in between. 
But that may have been the post layout simulation timing.  Simulation is 
not a great way to verify timing in general, but it could be made to 
work for your case.  I'd say get a Xilinx FAE involved.

-- 

Rick C

Reply by Kevin Neilson ●April 13, 2017

> We are exactly trying to drive external flops, some 1 ns CMOS parts.
> They are clocked by the same clock that is going into the ZYNQ, and
> the FPGA needs to set up their D inputs reliably. We can't use a PLL
> or DLL inside the FPGA.
>
> So the problem is that the Xilinx tools are reporting a huge (almost
> 3:1) spread in possible prop delay from our applied clock to the iob
> outputs. The tools apparently assume the max process+temperature+power
> supply limits, without letting us constrain these, and without
> assigning any specific blame.

Like Lasse said above, you can adjust the output delay with a half-cycle resolution using ODDRs.  This sounds good enough for your application.  I used that exact method once for a DRAM (single-data-rate) interface.  (I think the training method was to write data to an unused location in DRAM with various phase relationships, read it back, and see which writes were successful.)  Your issue sounds a lot like the same issues people have with DRAM.  I don't think you'll see a 3:1 variation in reality.
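
The training idea can be sketched roughly as follows. This is an illustration under assumptions, not the original implementation: the pass/fail list is assumed to come from some hardware-specific write and read-back test, the function names are hypothetical, and the sweep is treated as linear rather than wrapping around a full clock cycle.

```python
# Sketch of picking a phase setting from a write/read-back sweep. The
# pass/fail list is assumed input; how each setting is tested is hardware
# specific and not modelled here.


def longest_passing_run(results):
    """Return (start, length) of the longest run of True values; (0, 0) if none."""
    best_start, best_len = 0, 0
    run_start = None
    for i, ok in enumerate(list(results) + [False]):  # sentinel closes a final run
        if ok and run_start is None:
            run_start = i
        elif not ok and run_start is not None:
            if i - run_start > best_len:
                best_start, best_len = run_start, i - run_start
            run_start = None
    return best_start, best_len


def choose_setting(results):
    """Pick the middle of the longest passing window, or None if nothing passed."""
    start, length = longest_passing_run(results)
    if length == 0:
        return None
    return start + (length - 1) // 2


if __name__ == "__main__":
    sweep = [False, False, True, True, True, True, True, False]
    print(choose_setting(sweep))
```

With only half-cycle resolution the sweep has just two entries, so this reduces to using whichever setting passes. The centring step matters only when finer phase steps are available.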

Reply by John Larkin ●April 14, 2017

On Thu, 13 Apr 2017 15:22:52 -0700 (PDT), Kevin Neilson
<kevin.neilson@xilinx.com> wrote:

>
>> We are exactly trying to drive external flops, some 1 ns CMOS parts.
>> They are clocked by the same clock that is going into the ZYNQ, and
>> the FPGA needs to set up their D inputs reliably. We can't use a PLL
>> or DLL inside the FPGA.
>> 
>> So the problem is that the Xilinx tools are reporting a huge (almost
>> 3:1) spread in possible prop delay from our applied clock to the iob
>> outputs. The tools apparently assume the max process+temperature+power
>> supply limits, without letting us constrain these, and without
>> assigning any specific blame.
>
>Like Lasse said above, you can adjust the output delay with a half-cycle resolution using ODDRs. 

I can declare the differential-input clock polarity either way, which
would shift things 3.5 ns (out of a 7 ns clock.) But the guaranteed
data-valid window is less than 2 ns.
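
A sketch of that arithmetic with placeholder numbers. Only the 7 ns period and the "almost 3:1" spread come from this thread. The clock-to-out limits and the external flop's setup and hold times are assumed values chosen to give a window just under 2 ns. Board delays are ignored.

```python
import math

# Sketch of the timing argument with assumed numbers. The clock-to-out limits
# and setup/hold times are hypothetical placeholders, and trace delays are
# taken as zero.


def valid_window_ns(period, tco_min, tco_max):
    """Width of the guaranteed data-valid window in each clock period."""
    return period - (tco_max - tco_min)


def capture_ok(period, tco_min, tco_max, t_setup, t_hold, launch_offset):
    """True if some capture edge at k * period sees stable data.

    Data launched at launch_offset is guaranteed stable from
    launch_offset + tco_max until launch_offset + period + tco_min.
    """
    earliest_edge = launch_offset + tco_max + t_setup
    latest_edge = launch_offset + period + tco_min - t_hold
    return math.ceil(earliest_edge / period) <= math.floor(latest_edge / period)


if __name__ == "__main__":
    T, TCO_MIN, TCO_MAX = 7.0, 2.6, 7.7  # ns; the TCO values are assumed
    T_SU, T_H = 0.5, 0.3                 # ns; assumed external flop limits
    print(f"window = {valid_window_ns(T, TCO_MIN, TCO_MAX):.1f} ns")
    for offset in (0.0, T / 2):          # the two clock polarities
        print(offset, capture_ok(T, TCO_MIN, TCO_MAX, T_SU, T_H, offset))
```

With these placeholder values neither polarity puts a capture edge inside the window, although an intermediate launch offset of about 5 ns would. That is the concern: a 3.5 ns step is coarser than a window of less than 2 ns.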


> This sounds good enough for your application.  I used that exact method once for a DRAM (single-data-rate) interface.  (I think the training method was to write data to an unused location in DRAM with various phase relationships, read it back, and see which writes were successful.)  Your issue sounds a lot like the same issues people have with DRAM.  I don't think you'll see a 3:1 variation in reality.

I sure hope so.



-- 

John Larkin         Highland Technology, Inc

lunatic fringe electronics

Reply by ●April 14, 2017

On Friday, April 14, 2017 at 06:12:52 UTC+2, John Larkin wrote:
> On Thu, 13 Apr 2017 15:22:52 -0700 (PDT), Kevin Neilson
> <kevin.neilson@xilinx.com> wrote:
> 
> >
> >> We are exactly trying to drive external flops, some 1 ns CMOS parts.
> >> They are clocked by the same clock that is going into the ZYNQ, and
> >> the FPGA needs to set up their D inputs reliably. We can't use a PLL
> >> or DLL inside the FPGA.
> >> 
> >> So the problem is that the Xilinx tools are reporting a huge (almost
> >> 3:1) spread in possible prop delay from our applied clock to the iob
> >> outputs. The tools apparently assume the max process+temperature+power
> >> supply limits, without letting us constrain these, and without
> >> assigning any specific blame.
> >
> >Like Lasse said above, you can adjust the output delay with a half-cycle resolution using ODDRs. 
> 
> I can declare the differential-input clock polarity either way, which
> would shift things 3.5 ns (out of a 7 ns clock.) But the guaranteed
> data-valid window is less than 2 ns.
> 

The point of using DDR was not to shift the clock but to keep the clock and 
data aligned.

"Regenerating" the clock with a DDR means the clock and data get treated the same, and both have the same DDR-to-IOB path, so they should track.

Getting the output clock aligned with the input clock (if needed) might be possible using the "zero-delay-buffer" mode of the MMCM.
