
Xilinx Multiple Clock Domains

Started by Brad Smallridge October 5, 2004
Are there any "how to" documents on how to negotiate two clock domains?  I
want to run an SRAM with a 3X clock and have everything else run slower.
One of my issues is how the slower clock domain knows the phase of the
faster domain, such that data can come across the clock domain, from fast to
slow, at the right time. If I have a clock divider, such issues can be
resolved in the logic, but I am using a DCM, and the internal workings don't
seem to be as available, that is you just have two outputs, one fast, one
slow.

I also need to simulate this in ModelSim.  I haven't yet even seen the fast
clock signal appear in the signals or waveform generator.  Do I need an
upgrade?  Barring this, I suppose I could develop a component with the core
design and then drive it with a VHDL module with a fast clock and another
clock divided by three. Is this a good plan?




Brad,
Have you considered having just one fast 3X clock for all your design? Just
use a clock enable to enable the slower stuff every third cycle. This makes
fixing all the problems you mention a breeze! Also, you might like to read
the Xilinx documentation on Multi-cycle paths.
Cheers, Syms.
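A minimal sketch of the single-clock approach Syms describes, written as a cycle-level Python model rather than HDL. The function name, the phase counter, and the choice of phase 0 for the enable are assumptions for illustration only.

```python
def run_single_clock(n_fast_cycles, ratio=3):
    """Model one fast clock with a clock enable every `ratio` cycles.

    Returns a list of (fast_cycle, clock_enable, slow_register_value).
    """
    phase = 0        # free-running counter in the fast domain
    slow_reg = 0     # stands in for any "slow" register
    history = []
    for cycle in range(n_fast_cycles):
        ce = (phase == 0)          # enable is high on one fast cycle in `ratio`
        if ce:
            slow_reg += 1          # slow logic only updates when enabled
        history.append((cycle, ce, slow_reg))
        phase = (phase + 1) % ratio
    return history


for row in run_single_clock(7):
    print(row)
```

Because the fast logic owns the phase counter, it always knows which fast cycle lines up with the slow update, which is the information that is missing when the two clocks simply arrive from two DCM outputs.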
Brad,

All DCM outputs are phase aligned.

So, for example, if you use the CLK0 output, and the CLKFX output with 
M=3/D=1, every time CLK0 has a rising edge, there will be a rising edge 
for the CLKFX +/- the jitter of the DCM.

Or saying it differently, every third edge of the CLKFX corresponds to a 
CLK0 edge.

That is why the DCM is useful: it aligns all of its outputs to known 
phases.

This accuracy in alignment is covered in the DCM specifications, as the 
skew between DCM outputs, in the datasheet.

Austin
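The edge relationship Austin describes can be checked with a small idealized calculation. This is only a sketch: it ignores jitter and skew entirely, and the 30 ns CLK0 period is a hypothetical value chosen so that the division by three is exact.

```python
CLK0_PERIOD_PS = 30_000                 # hypothetical CLK0 period, in picoseconds
CLKFX_PERIOD_PS = CLK0_PERIOD_PS // 3   # CLKFX with M=3, D=1


def rising_edges(period_ps, count):
    """Ideal rising-edge times for a clock that starts with an edge at t = 0."""
    return [k * period_ps for k in range(count)]


clk0_edges = set(rising_edges(CLK0_PERIOD_PS, 4))
clkfx_edges = rising_edges(CLKFX_PERIOD_PS, 12)

# Indices of the CLKFX edges that land exactly on a CLK0 edge.
coincident = [i for i, t in enumerate(clkfx_edges) if t in clk0_edges]
print(coincident)
```

In this ideal model, CLKFX edges 0, 3, 6 and 9 coincide with CLK0 edges. In a real device the coincidence holds only to within the skew and jitter figures in the data sheet.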

Austin,

Has the possibility of skew between the 1x and Nx clocks due to loading and input
jitter been eliminated, then?  I had a problem back when Spartan-II was first
released.  The incoming clock had enough jitter on it (apparently introduced by
outputs switching on the same bank as the clock pin), and the loading on the 1x
and 2x clocks was different enough, that I had problems crossing clock domains
where a flip-flop in one domain drove the input of a flip-flop in the other
domain through the direct slice-to-slice connection inside a CLB.  Ever since
then, I have been very careful about crossing domains, even when both clocks
are generated by the same DLL/DCM.

One way to do it is to make a copy of the slower clock in the faster clock domain,
and then use that for clock enables to make sure the signal is sensed away from
the edge where it changes.



--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930 Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

"They that give up essential liberty to obtain a little
temporary safety deserve neither liberty nor safety."
-Benjamin Franklin, 1759
On Tue, 5 Oct 2004 14:23:03 -0700, "Brad Smallridge" <bradsmallridge@dslextreme.com> wrote:
>Is there any "how to" documents on how to negotiate a two clock domain?
You may find the following useful.

http://www.fpga-faq.com/FAQ_Pages/0017_Tell_me_about_metastables.htm

Crossing clock domains is also discussed in several documents linked at
the bottom of the above referenced page.

Philip

===================
Philip Freidin
philip.freidin@fpga-faq.com
Host for WWW.FPGA-FAQ.COM
I would answer that NO, the skew has not been eliminated.  The literature
gives the impression that all the DCM outputs are perfectly phase-aligned
when it appears to be just not true.  How can it be?  The DCM can only
account for the delay across the BUFG *in its feedback path*.  That means
only the CLK0 or CLK2X can be perfectly phase-aligned, and not even the
latter in the V2Pro because of the erratum disallowing the use of CLK2X for
the feedback.  The other outputs (CLKFX, CLKDV) have different loads and
should have different delays across their respective BUFGs.  I don't see how
they could possibly be aligned with the input.

I recently had a problem on a V2Pro trying to transfer data from a 2X domain
to a 1X domain, where both domains were driven by DCMs.  The transfers had
multiple errors indicating that the skew between the domains was too large.
I resolved the problem by transferring across domains away from the edge of
the receiving domain.  Everything I have read implies that this isn't
necessary.

Creating a copy of the slow clock in the fast domain is the method I use.
The slow clock has to be sampled; actually I think I sampled the CLK90 or
one of those to ensure I meet setup.  With the copied clock I can always do
the transfer in the middle of the slow clock cycle (or, in the poster's
case, on the first third of the slow clock's cycle).

The DCMs work very well; I just think that the caveats for their use are not
well-specified.  An app-note explaining the clock-copying method Ray
describes would be very helpful, if such a note does not yet exist.
-Kevin
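The clock-copying method Ray and Kevin describe might look roughly like this as a cycle-level Python model. It is a sketch under several assumptions: a 3:1 clock ratio, ideal clocks, the 90-degree phase of the slow clock sampled on fast rising edges (as Kevin mentions), and two sampling registers feeding an edge detector. All names and time units are invented for the illustration.

```python
FAST_PERIOD = 4                      # arbitrary time units (assumption)
RATIO = 3
SLOW_PERIOD = FAST_PERIOD * RATIO


def slow_clk90(t):
    """Slow clock delayed by a quarter of its period, 50% duty cycle."""
    return ((t - SLOW_PERIOD // 4) % SLOW_PERIOD) < SLOW_PERIOD // 2


def fast_domain_enables(n_fast_edges):
    """Clock enable seen by fast-domain registers at each fast rising edge."""
    s1 = s2 = False                  # two registers holding the sampled copy
    enables = []
    for k in range(n_fast_edges):
        t = k * FAST_PERIOD
        # Registers see the values that were stored before this edge.
        enables.append(s1 and not s2)    # rising edge of the sampled copy
        s2 = s1
        s1 = slow_clk90(t)               # sample the shifted slow clock
    return enables


print(fast_domain_enables(9))
```

In this model the enable is asserted on fast edges 2, 5 and 8, one fast period before each slow rising edge (which falls on fast edges 0, 3, 6, ...). A fast-domain register that updates only on those edges changes well away from the slow clock edge; which phase to use depends on how much setup and hold margin the transfer needs.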
Kevin,

All of the outputs for the DCM are generated in the "outgen" block, 
which uses matched paths and devices, fully buffered.  All of the 
outputs are derived from the delay line, so all of the timing is related.

That is how we can do it.

The skew is +/- 100 ps (as I recall) to account for all the mismatches 
possible at the output of the DCM.

Now going into the BUFG trees (which are fully buffered, so loads don't 
count) puts another uncertainty on the values, but from BUFG to BUFG 
these are also matched pretty well (less than a few tens of ps mismatch).

The flight time along the BUFG tree will vary, and if you get off near 
the center, or get off in the top right-hand corner on a large part (e.g. 
2VP100), you may also have 500 to 700 ps of time difference between 
these two nets.

Add system jitter, and DCM jitter to it, and if you are not careful, at 
a high frequency, you might get the result you did.  On the other hand, 
'being careful' means having to relationally place things, or hand place 
things, so that may not be something you want to do (I wouldn't unless I 
had to).

One very common misconception is that using CLK2X FB is somehow better 
than CLK0 FB:  it is not.  There is no difference specification-wise. 
The CLK2X gets divided by two just before the phase detector, so any 
belief that it is better matched somehow assumes we did a perfect job 
matching a /2 to a straight-through path (which, again, we do our best to do).

If you believe the CLK2X is better, then why do you not believe the 
'outgen' is just as good?

I am all for full synchronous design, (simpler, easier to verify), but 
it seems that folks keep finding ways to use the DCM in what I might 
call "isochronous design."

Austin
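The contributions Austin lists can be added up as a rough worst-case budget. This is a sketch, assuming a simple linear sum (which is pessimistic) and using the figures from his post: 100 ps of DCM output skew as he recalls it, 30 ps standing in for "a few tens of ps" of BUFG mismatch, and 700 ps for the upper end of the clock tree flight-time difference on a large part. The jitter term and the data-path and hold figures are hypothetical placeholders; real values come from the data sheet and the timing report.

```python
def worst_case_skew_ps(dcm_output_skew=100, bufg_mismatch=30,
                       tree_flight=700, jitter=0):
    """Linear worst-case sum of the clock skew contributions, in ps."""
    return dcm_output_skew + bufg_mismatch + tree_flight + jitter


def hold_ok(min_data_path_ps, skew_ps, hold_ps):
    """True if the shortest data path still meets hold when the capture
    clock arrives `skew_ps` later than the launch clock."""
    return min_data_path_ps >= skew_ps + hold_ps


skew = worst_case_skew_ps()
print(skew)
# Hypothetical 500 ps flip-flop to flip-flop path with zero hold requirement.
print(hold_ok(min_data_path_ps=500, skew_ps=skew, hold_ps=0))
```

With these placeholder numbers the budget is 830 ps, which is more than a short direct flip-flop to flip-flop path can absorb. That is consistent with the failures Ray and Kevin report on same-edge transfers.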

Hi,


Does that mean that something like what's below should work without worrying about
clock domain crossing?

   ___                 ___                 ____                    ____
  | R |               | R |                | R |                  | R |
--| e |---- Comb 1 ---| e |---- Comb 2 ----| e |--\/--- Comb 3 ---| e |---
  | g |               | g |     /          | g |  |               | g |
  | 1 |               | 2 |     |          | 3 |  |               | 4 |
 /|>__|              /|>__|     |    ClkFX-|>__|  |              /|>__|
 |                   |          \_________________/              |
Clk                  Clk                                        Clk

With ClkFX = n * Clk



Note that it doesn't represent anything in particular, just something out of my imagination.
What comes to mind is, for example, an operation that would require 3 multiplexers: you
could time-multiplex them this way and the rest of the pipeline wouldn't know it ...


Sylvain
Austin wrote:
>
> Now going into the BUFG trees (which is fully buffered, so loads don't
> count) puts another uncertainty on the values, but from BUFG to BUFG
> these are also matched pretty well (less than a few tens of ps mismatch).
>

You seem to be stating that the skew between a heavily loaded clock tree
and a lightly loaded clock tree will be close enough that setup/hold will
be met when cascaded adjacent flip-flops are clocked from each domain.

In my own DDR designs, with lightly loaded DDR register clocks and heavily
loaded internal logic clocks, timing reports and lab testing have shown
otherwise, and opposite-edge (or 90/270 phased) clocks were needed to
properly transfer between the two clock domains.

>
> If you believe the CLK2X is better, then why do you not believe the
> 'outgen' is just as good?
>

Again, it's a concern with deskewing the clock tree delays. As Kevin
pointed out, you can't feed back the loaded 2x clock net into the DCM, so
the 2x clock net will be offset by the BUFG net difference between the 2x
clock and the 1x clock net that you can use as feedback. (The lack of 2X
feedback in V2P and S3 also makes it harder to do certain internal/external
clock deskew topologies.)

Another related concern for cascaded DCMs, or in cases where you need a
known (zero) phase relationship between the DCM input and output: unless
you set the DCM to SOURCE_SYNCHRONOUS mode, the DCM output clock will LEAD
the input clock (by ~1.5 ns in V2) as a result of the internal DCM feedback
delay element used to ensure zero hold at IOBs.

Brian

The fact that the clocks are in phase isn't the issue.  The issue is that
the slow domain doesn't "know" where the fast domain is in its cycle.
Suppose I send out a read command on fast clk 1, the data comes back on fast
clk 3, and should be transferred to the slow domain on fast clk 4.  And
suppose this cycle repeats every 3 fast clks.  How do I get the slow clock
to line up with fast clk 4, 7, 10, etc.?
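The read cycle described here can be written out as a small schedule. This is a sketch under one stated assumption: the fast domain has already located the slow clock edge, so the slow rising edges fall on fast clk 1, 4, 7, 10 and the fast logic times its read command from there. The function name and constants are hypothetical.

```python
RATIO = 3          # fast clocks per slow clock
READ_LATENCY = 2   # command on fast clk 1, data back on fast clk 3


def read_schedule(n_reads, first_slow_edge=1):
    """Return (command_clk, data_clk, slow_capture_clk) per read,
    using 1-based fast clock numbers as in the post."""
    rows = []
    for i in range(n_reads):
        command = first_slow_edge + i * RATIO   # issue on a slow edge
        data = command + READ_LATENCY           # SRAM data is back
        capture = command + RATIO               # next slow edge takes it
        rows.append((command, data, capture))
    return rows


for row in read_schedule(3):
    print(row)
```

In this schedule the slow clock never has to find the fast phase. The fast domain lines its command up with the slow edge, and the data registered on fast clk 3 is held until the slow edge on fast clk 4.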

The idea of just using one fast clk, with clock enables on all the slow
stuff, is an attractive idea.  However, I don't have any experience in
determining whether this will lead to timing constraint issues or not.
It seems that it will be difficult enough to get the performance from the
SRAM interface.  How do I loosen the constraints on the slow stuff?  All
those registers would be on the same clock net.

Ray Andraka suggested making a copy of the slow clock in the fast domain,
and using that signal as a clock enable, I guess in the fast domain?  I'm
not sure how this helps.  The idea of using 90/270 clocks to get rid of skew
issues seems good.

I have found XAPP253, which is a DDR SDRAM controller with some of the same
issues.  I am trying to work through that.