FPGARelated.com
Forums

Spartan 3E: MAX_STEPS as a function of CLKIN frequency

Started by Bill Valores March 29, 2010
Hello,

I'm trying to figure out how much delay variance I can achieve using
the variable phase shifter on a Spartan 3E device. I may change the
input frequency if that helps me, and use the CLKFX output as target
clock, so basically I need to figure out how to choose CLKIN wisely to
get an adequate phase shift.

The datasheet and user guide seem to agree on

If CLKIN < 60 MHz: MAX_STEPS = ±[INTEGER(10 • (TCLKIN – 3))]
If CLKIN > 60 MHz: MAX_STEPS = ±[INTEGER(15 • (TCLKIN – 3))]

which is a pretty weird relationship, if I may say so. I can't see why
the clock's frequency would matter at all, since the delay line steps
don't depend on the frequency. The fact that this equation is
discontinuous at 60 MHz doesn't make it look better. What looks even
more weird is that since the minimal input frequency is 5 MHz, we get
MAX_STEPS of ±(10 • (200 – 3)) which is ±1970 steps, or with a 20ps
step size, ±39.4 ns of delay! That is four times longer than the delay
line dedicated for a fixed phase shifting. Or eight times? I'm lost.
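
As a rough sketch of the arithmetic above, the quoted formula can be evaluated in Python. The function name `max_steps` and the 20 ps tap size are assumptions for illustration only. The formula as quoted does not say which branch covers exactly 60 MHz, so this sketch arbitrarily puts that case in the second branch.

```python
import math

def max_steps(clkin_mhz):
    """Return the MAX_STEPS magnitude from the formula quoted above."""
    t_clkin_ns = 1000.0 / clkin_mhz          # CLKIN period in ns
    factor = 10 if clkin_mhz < 60 else 15    # assumption: 60 MHz itself uses 15
    return math.floor(factor * (t_clkin_ns - 3))

TAP_PS = 20  # assumed tap delay in picoseconds

for f_mhz in (5, 50, 59.9, 60.1, 100):
    steps = max_steps(f_mhz)
    print(f"{f_mhz:6.1f} MHz: +/-{steps} steps, +/-{steps * TAP_PS / 1000:.2f} ns")
```

At 5 MHz this gives the ±1970 steps and ±39.4 ns mentioned above, and it also shows the jump in the limit on either side of 60 MHz.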

But I'm in good company, it seems. Reading the XAPP 485's attached
sample code, auto_phase_align_s3s.v header comment says "Note counters
are long enough (13-bit) for operation down to 5 MHz assuming 25 pS
per tap. (200,000/25=8000)". Looks like someone thought that the FPGA
has a 200 ns delay line. It's getting even more impressive.

The ug331 Spartan 3E user guide offers a hint: Table 3-32 says what
happens in different situations. Among others, there's this situation
labeled "≥ +Limit and < +255" which is commented as "end of delay
line".

So maybe this sums up to that effectively MAX_STEPS could never exceed
±255 steps? That would guarantee a minimal delay swing of ±5.12 ns (if
each delay tap is 20ps) which sounds pretty familiar (the maximal
delay of other Xilinx FPGAs is 5ns, I believe). But sensible as this
sounds to me, I've not been able to find a conclusive statement about
this.

Can anyone shed some light on this?

Thanks in advance,
Bill
About a year ago, I had to do some fairly fine-grained phase-shifting
to poke at a hardware problem on a chip we build. I used a Nexys2
board (S3E) and Xilinx's equation. On an 8.192 MHz clock (which their
equation says ought to be able to go +/- 1190 steps, if I read it
right), I just did the "nearest power of two" thing and went +/- 1024
steps.

IIRC, the actual step size I measured in the lab was around 25 ps, and
it seemed to be not only monotonic, but also to have very little
variance in the steps. I had an external PLL creating a 98 MHz clock
from the 8.192 MHz by multiplying by 12, and I could cycle through
several periods of the 98 MHz clock with no problem. My test jig
worked flawlessly with no discontinuities.

I have no idea what they do or how they do it, but I have no
complaints about that portion of the datasheet. No, pretty much all my
Xilinx complaints have to do with software rather than hardware, but
you don't have enough time for that...

Regards,
Pat
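
The numbers in this post can be checked with a few lines of Python. This is only a sketch of the arithmetic: the variable names are made up for illustration, and the 25 ps step is the approximate lab figure quoted above, not a datasheet value.

```python
import math

CLKIN_MHZ = 8.192
T_CLKIN_NS = 1000.0 / CLKIN_MHZ                  # about 122.07 ns
spec_limit = math.floor(10 * (T_CLKIN_NS - 3))   # CLKIN < 60 MHz branch

STEPS_USED = 1024          # power of two just below the limit
TAP_PS = 25                # approximate measured step size
swing_ns = STEPS_USED * TAP_PS / 1000.0          # one-sided swing in ns

PLL_MULT = 12
pll_period_ns = T_CLKIN_NS / PLL_MULT            # period of the ~98 MHz clock
periods_covered = 2 * swing_ns / pll_period_ns   # full -1024..+1024 range

print(spec_limit, swing_ns, round(periods_covered, 2))
```

Under these assumptions the full range spans roughly five periods of the multiplied clock, which is consistent with cycling through "several periods".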
Bill,

The DCM "fits" a delay line of 256 taps long, to the period of the
clock input signal.

In 3E & 3A, a different architecture was used than in the V2, V2P, 3
and V4, where the taps are always 20 to 40 ps long, and there are
always enough of them to even fit 256 of them to a 5 MHz input clock.

I know, it's magic.

Austin






Bill,

Oops.  I mean that it has as many taps as required to "fit" a 5 MHz
input clock, not 256 like V2, V2P, 3, and V4....

Magic...

Austin
austin <austin@xilinx.com> wrote:

> Oops. I mean that it has as many taps as required to "fit" a 5 MHz
> input clock, not 256 like V2, V2P, 3, and V4....

So as the process gets faster, the number of taps will increase?
(With a margin big enough for process/temperature/voltage variations.)

-- glen
Austin - are you up on your Spartan3E variable phase shift trivia?
There is unfortunately no 256-tap fit (for the variable phase shift,
at least). The variable phase shift is a tap-by-tap adjustment, unlike
most every other family out there, which did the 256 tap "fit"
properly even if it took more than one inc or dec to change taps.

The limits defined by the data sheet ended up being a very unusual set
of numbers, enough to make an engineer sweat about whether they'll be
able to get the phase shift they want. I was happy to get the 3 bit
periods of phase adjustment out of the 300 MHz multiplied clock (based
off a 50MHz reference). But what a wild ride that was! I ended up
corresponding directly with the silicon designer when the phase
inc/dec update started ignoring commands on occasion.

So, Bill - word of caution... If you use the variable phase shift and
rely on the "about 100 clocks" (or similarly worded metric) to make
sure the adjustment occurs, you may find occasions where the
adjustment doesn't happen, or at least takes much longer than
expected. I ended up with a timeout count such that if I didn't get an
ack after a certain period of time, I retried the adjustment. Sadly, I
forget what the final implementation included, whether the timeout was
still the way to go or if there was a "real" solution.
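
The timeout-and-retry idea described above can be sketched as a small behavioural model in Python. This is not the poster's implementation (which the post says is forgotten) and not HDL. The function `shift_with_retry`, the toy `FlakyDcm` class, and the timeout and retry counts are all hypothetical names and values chosen for illustration.

```python
def shift_with_retry(issue_psen, psdone_seen, timeout_cycles=1000, max_retries=3):
    """Issue one phase-shift step; retry if PSDONE is not seen in time.

    issue_psen:  callable that pulses PSEN once (hypothetical hardware hook)
    psdone_seen: callable polled once per clock cycle, True when PSDONE
                 has been observed (hypothetical hardware hook)
    Returns the number of attempts used, or raises TimeoutError.
    """
    for attempt in range(1, max_retries + 1):
        issue_psen()
        for _ in range(timeout_cycles):
            if psdone_seen():
                return attempt
    raise TimeoutError("no PSDONE after %d attempts" % max_retries)


class FlakyDcm:
    """Toy model: ignores the first command, acks later ones on the 5th poll."""
    def __init__(self):
        self.commands = 0
        self.polls = 0

    def psen(self):
        self.commands += 1
        self.polls = 0

    def psdone(self):
        self.polls += 1
        return self.commands > 1 and self.polls >= 5


dcm = FlakyDcm()
attempts = shift_with_retry(dcm.psen, dcm.psdone, timeout_cycles=100)
print(attempts)
```

Whether re-issuing a command after a timeout is safe on real silicon is exactly the open question in the post, so treat this only as an illustration of the control flow.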
Thank you for your answers.

I take it that they're serious about having some 4000 taps. Also, I
keep hearing (reading, actually) about the variable phase shifter
taking a long time to acknowledge its commands. Makes me wonder why,
since all we're asking for is "what you did right now, just another
tap".

I wish someone revealed what really goes on there, and how they
reached those peculiar equations. All this feels like witchcraft.

I'm going to experiment a bit. The problem is that my experiments will
be on a stepping 0 FPGA, while the target board will have a stepping 1
high speed grade chip. With all this mystery, I wonder if that's going
to make a difference (and they obviously did something with the DCM
between the steppings).

Bill.
$179.00 gets you a XC3S1200E on a Digilent Nexys2 board. Add 30 bucks
or so for overnight shipping, and I don't know why you have to be on a
different stepping...

Regards,
Pat
You definitely have a point there. With a board on my table (which I
obviously have everything set up with) I didn't even think in that
direction. Now it's my laziness and having a lot of handy
communication features with the current board vs. the risk that the
test won't be valid.

Plus the fact that the clock source will be the same on my board and
the one I'm targeting. I understand that clock jitter has been a
primary suspect where problems have arisen.

I guess I'll start with my board and see how stable this feels.

Thanks,
Bill

Hi again,

Impatient as I am, I went for testing the variable phase shifter on
the existing board. I ran on a xc3s1600e-4 stepping 0, which is not
what I'll have in the end, but the results were so encouraging, that I
feel pretty safe to jump to another device with no further checks.

I wrote a small Verilog module (the "state machine" below), which
sends shift commands in order to reach a certain position, which I set
through some register interface. Given that I wrote the module right
(which it seems like I did), I could arbitrarily jump to any position
I liked, just by changing the register. I had the shifted clock and a
reference clock wired to a testpoint, and I also had a signal going
high whenever my state machine was between PSEN and PSDONE (that is,
waiting). The latter turned out very interesting.
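
The actual Verilog module is not shown in this post. As a rough behavioural sketch of what such a state machine might do, here is a Python model. The class `PhaseStepper`, its fields, and the toy DCM that acknowledges four cycles after PSEN are all assumptions made for illustration.

```python
class PhaseStepper:
    """Behavioural model: step toward `target`, one command at a time."""
    def __init__(self):
        self.position = 0     # signed count of steps issued so far
        self.target = 0       # value written through the register interface
        self.waiting = False  # True between PSEN and PSDONE

    def tick(self, psdone):
        """Advance one clock. Returns (psen, psincdec) for this cycle."""
        if self.waiting:
            if psdone:
                self.waiting = False
            return (False, False)
        if self.position == self.target:
            return (False, False)
        increment = self.target > self.position
        self.position += 1 if increment else -1
        self.waiting = True
        return (True, increment)


stepper = PhaseStepper()
stepper.target = 3
psen_count = 0
ack_in = None                 # cycles until the toy DCM raises PSDONE
for _ in range(100):
    psdone = (ack_in == 0)
    psen, _incdec = stepper.tick(psdone)
    if psdone:
        ack_in = None
    elif ack_in is not None:
        ack_in -= 1
    if psen:
        psen_count += 1
        ack_in = 4            # assumed fixed acknowledge latency
print(psen_count, stepper.position, stepper.waiting)
```

The `waiting` flag plays the role of the test signal described above: it is high from the cycle PSEN is issued until PSDONE is seen.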

So I fed the DCM with a 5 MHz clock (minimal frequency) and started
playing around. Things went very smoothly as long as I remained within
the per-spec ±1970 steps boundary. In my case I got around 22.75ps per
step (measured by jumping 1000 steps or so) consistently. I thought
this number was absurd, but it turned out to be true: The chip allows
±39.4 ns of delay. That's a useful rail of nearly 80 ns. Pretty
impressive.

But I didn't stop there. I went for larger shifts until things started
to break. ±4000 was OK. When I tried 4500 in either direction, all hell
broke loose. The most interesting thing was that the state machine was
left waiting for PSDONE. I didn't look deeply into how much time each
PSEN-PSDONE cycle took, but it appeared long, random, and sometimes
completely stuck. On one occasion, the state machine waited for quite
a while, and then suddenly the "waiting" line went low again. On the
scope I saw a 2.5 MHz clock on the DCM's output (the original
frequency divided by two).

This random behavior appears suspiciously similar to reports I've read
from people who said that the PSDONE came much later than expected.
Which makes me think (or hope?) that their problem was in their own
logic, which accidentally pushed the phase shifter beyond its limit.

Another interesting thing about this result, is that the region of
4000-4500 steps is where the extra delay gets close to half a clock
cycle. On the scope, it was pretty evident that things started to go
wrong when the shifted clock got close to a 180 degree shift. And this
might explain the rule allowing 10 steps (0.4 ns at most) for each ns
of cycle period on the shifted clock. As for those extra 3ns, I
suppose there's a reason as well.

I tried this again with a 33.3 MHz input clock. The behavior was the
same. All hell broke loose when the shifting reached half a clock cycle.
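
The half-cycle reasoning above can be checked numerically. This is a sketch under stated assumptions: the 22.75 ps figure is the step size measured in this post, the 40 ps worst-case tap is taken from the "20 to 40 ps" range mentioned earlier in the thread, and the function names are made up for illustration.

```python
MEASURED_TAP_PS = 22.75   # step size measured in this post at 5 MHz
WORST_TAP_PS = 40         # assumed upper bound on tap delay

def half_cycle_steps(clkin_mhz, tap_ps=MEASURED_TAP_PS):
    """Steps needed to shift by half a CLKIN period at the given tap size."""
    half_period_ps = 0.5 * 1e6 / clkin_mhz
    return half_period_ps / tap_ps

def spec_limit_fraction(clkin_mhz, tap_ps=WORST_TAP_PS):
    """Fraction of a period covered by the limit 10*(T-3), CLKIN < 60 MHz."""
    t_ns = 1000.0 / clkin_mhz
    steps = int(10 * (t_ns - 3))
    return steps * tap_ps / 1000.0 / t_ns

for f_mhz in (5, 33.3):
    print(f_mhz, round(half_cycle_steps(f_mhz)), round(spec_limit_fraction(f_mhz), 3))
```

With these assumptions, half a cycle at 5 MHz lands between 4000 and 4500 steps, matching where things broke, while the spec limit stays below 0.4 of a period even with the slowest assumed taps.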

In short, all this starts to make sense to me. This looks like "follow
the spec and everything will be all right". The bottom line is that
the datasheet was right, and the user guide was confusing.

Thanks again for your answers. You really were helpful.
Bill.