comp.arch.fpga | accumulator (again)| page 2

Reply by Ed McGettigan ● July 5, 2012

On Jul 5, 11:04 am, jmariano <jmarian...@gmail.com> wrote:
> Hi Rick, thanks for your help.
>
> > I'm not real clear on your description of your design, but if you are
> > really generating clocks from the 50 MHz, I recommend that inside the
> > FPGA you instead use a single clock and generate clock enables for the
> > various functions.
>
> Yes, I generate a 5 MHz clock inside the module from the main 50 MHz
> clock by simple division by 10 because I need a 5 MHz ADC clock. I can't
> use a clock enable because the AD9058 ADC does not have an enable input,
> just a clock.
>
> > When you use multiple clocks in a circuit you have
> > to do extra work for every signal that crosses a clock domain.  Could
> > that be your problem?
>
> What is the extra work? Have no idea! Synchronization?
>
> > I don't see anything in your original post about simulation.  Do you
> > simulate your modules?  I highly recommend that you write a test
> > bench for each and every module you code.  You may think this takes
> > too much time, but I believe it pays off in the end with shorter
> > integration time.
>
> Sorry about that, I did, in fact, simulate each module and the top
> entity. The behavioral simulation gives the expected results; the
> post-place-and-route simulation gives some errors that I could not
> understand, but I'll run the simulations again and post the results here.
>
> jmariano

The good news here is that you have a simulation that shows the same
behavior as the hardware.  Looking at these simulation runs should tell
you exactly what the problem is.  I don't think that anyone here will
be able to do the same with the full source code for the design.

Ed McGettigan
--
Xilinx Inc.

Reply by rickman ● July 5, 2012

On Jul 5, 2:03 pm, Ed McGettigan <ed.mcgetti...@xilinx.com> wrote:
> On Wednesday, July 4, 2012 12:49:07 PM UTC-7, rickman wrote:
> > On Jul 3, 5:45 pm, Ed McGettigan <ed.mcgetti...@xilinx.com> wrote:
> > > On Monday, July 2, 2012 10:24:02 PM UTC-7, Tim Wescott wrote:
>
> > > > Paranoid logic designers will have a string of two or three registers to
> > > > avoid metastability, but I've been told that's not necessary.  (I'm not
> > > > much of a logic designer).
>
> > > > --
> > > > Tim Wescott
> > > > Control system and signal processing consulting
> > > >www.wescottdesign.com
>
> > > It isn't just the paranoid logic designer, it should be every logic
> > > designer.
>
> > > A single register only partially solves the problem of an asynchronous
> > > input with multiple register destinations, but it does not solve the very
> > > real metastability problem.  At least two registers should be used to
> > > ensure that the metastability condition has resolved, and with increasing
> > > clock frequency and finer process nodes, three or more stages may be
> > > necessary.
>
> > > Ed McGettigan
> > > --
> > > Xilinx Inc.
>
> > Hi Ed.  The way it was explained to me, I believe by Peter Alfke,
> > is that what really resolves metastability is the slack time in a
> > register to register path.  Over the years FPGA process has resulted
> > in FFs which only need a couple of ns to resolve metastability to 1 in
> > a million operation years or something like that (I don't remember the
> > metric, but it was good enough for anything I do).  It doesn't matter
> > that you have logic in that path, you just need those few ns in every
> > part of the path.  In theory, even if you use multiple registers with
> > no logic, what really matters is the slack time in the path and that
> > is not guaranteed even with no logic.  So the design protocol should
> > be to ensure that the paths from the input register to all subsequent
> > registers have sufficient slack time.
>
> > Do you remember how much time that needs to be?  I want to say 2 ns,
> > but it might be more like 5 ns, I just can't recall.  Of course it
> > depends on your clock rates, but I believe Peter picked some more
> > aggressive speeds like 100 MHz for his example.
>
> > Rick
>
> I'm glad to see that one of my 5-6 attempts to post was finally accepted
> by Google.  I have got to switch to something else.
>
> Peter Alfke's publications on metastability definitely fall into the
> seminal category, but you must be careful when extrapolating the original
> data to the latest technology nodes, circuits and design requirements.
> There are two major factors that impact the metastability equations: tau
> (the metastability decay rate) and the settling time.
>
> The tau value is an inherent characteristic of the circuit and technology
> node. For a long time the expectation was that it would decrease with
> each generation, but this has stopped being true.
>
> The settling time, Ts, is dependent on the design and is under the user's
> control. Ts is a function of the destination clock frequency and the timing
> slack between registers. If you have a 100 MHz clock frequency, but you use
> up 9.5 ns to get to the destination, your slack is only 500 ps. Adding
> register stages allows for maximum use of the clock period, increasing the
> settling time, and each additional stage increases it again.
>
> Ed McGettigan
> --
> Xilinx Inc.

The info I am referring to are posts that were made here and pertained
to the "current" generation of some six or eight years ago.  At that
time Peter made the point that the "tau" as you call it, had gotten so
fast that the impact was negligible for all but the most stringent
designs and only a small amount of slack time is needed.

A quick search found these two posts about V2Pro devices.  I assume
your newer devices are at least as good as 10 year old technology.
Note that Peter makes a point that the capture window T0, which is a
product in the formula, is not an important parameter.  Tau is an
exponent (in ratio with Tslack) in the formula and so makes a much
larger contribution to the result.  The same is true for the two clock
frequencies, they are just products in the formula and so don't make
huge changes to the MTBF.
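
For reference, the relationship described above is commonly written in the form sketched here. T0, tau and the slack time are the quantities named in the text; the symbols f_clk (sampling clock rate) and f_data (toggle rate of the asynchronous input) are naming assumptions added for illustration.

```latex
\[
\mathrm{MTBF} = \frac{e^{\,t_{\mathrm{slack}}/\tau}}{T_0 \, f_{\mathrm{clk}} \, f_{\mathrm{data}}}
\qquad\Longrightarrow\qquad
\frac{\mathrm{MTBF}(t_{\mathrm{slack}} + \Delta t)}{\mathrm{MTBF}(t_{\mathrm{slack}})} = e^{\,\Delta t/\tau}
\]
```

Doubling either frequency or T0 only halves the MTBF, while each additional tau of slack multiplies it by e, which is why the slack term dominates.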

So it seems like not much would have changed in 10 years in how a
designer should deal with metastability.  Leaving 2 ns of slack time
in the first register to register path should make literally all
designs extremely robust regardless of how many registers are
receiving the first register output or if there is logic in the path.
Just make sure there is 2 ns slack time and your designs should be
good for many, many years!

Rick

====================================
Peter Alfke comp.arch.fpga Oct 10 2002, 8:40 pm

You mentioned metastability, and that caught my attention.

Metastability is a reality, but it (and the fear of it) is highly
overrated.
We recently tested Virtex-IIPro flip-flops, made on 130 nm technology. You
might call that cutting edge technology, but not exotic.
When a 330 MHz clock synchronized a ~50 MHz input, there was a 200 ps extra
metastable delay (causing a clock-to-out + short routing + set-up total of
1.5 ns) once every second. That translates into a metastable capture window
that has a width of 3 ns divided by 100 million (since we looked at both
edges of the 50 MHz signal).
So the window for a 200 ps extra delay is 0.03 femtoseconds.
If you can tolerate 500 ps more, the MTBF increases 100 000 times, and the
capture window gets that much smaller.
Metastability is a real, but highly overrated problem.

Peter Alfke, Xilinx Applications
====================================

====================================
Peter Alfke comp.arch.fpga Oct 15 2002, 1:11 pm

Here are the K2 values for Virtex-IIPro:

CLB @1.50V: K2 = 27.2,  i.e. 1/K2 = tau = 36.8 picoseconds
CLB @1.35V: K2 = 23.3,  i.e. 1/K2 = tau = 42.9 picoseconds
CLB @1.65V: K2 = 35.7,  i.e. 1/K2 = tau = 28.0 picoseconds

IOB @1.50V: K2 = 24.4,  i.e. 1/K2 = tau = 41.0 picoseconds
IOB @1.35V: K2 = 19.24, i.e. 1/K2 = tau = 52.0 picoseconds
IOB @1.65V: K2 = 44.05, i.e. 1/K2 = tau = 22.7 picoseconds

For each extra 100 ps of acceptable metastable delay,
the MTBF increases by a factor 10.3 for CLB @ 1.35 V,
or a factor 6.85 for IOB @ 1.35 V.
Much better values, of course, at nominal or high Vcc.

Click on
http://support.xilinx.com/support/techxclusives/techX-home.htm
in early November.

Here is the worst-case data point:

50 MHz asynchronous data rate, 330 MHz clock, single-stage synchronizer in
IOB, Vcc = 1.35 V:
clock-to-Q + short routing + set-up time + metastable delay exceeds clock
period once per 30,000 years.

At nominal Vcc: once per 100 million years.

At a 250 MHz clock rate, delay exceeds clock period less often than once per
billion years.

Peter Alfke, Xilinx Applications
====================================
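
As a quick check on the figures in the second quoted post, the stated improvement factors follow directly from the K2 values if K2 is taken to be in units of 1/ns (an assumption, but one consistent with 1/K2 being quoted as tau in picoseconds):

```latex
\[
\frac{\mathrm{MTBF}(t + \Delta t)}{\mathrm{MTBF}(t)} = e^{K_2 \,\Delta t},
\qquad \Delta t = 0.1\ \mathrm{ns}
\]
\[
\text{CLB @ 1.35 V: } e^{23.3 \times 0.1} = e^{2.33} \approx 10.3
\qquad
\text{IOB @ 1.35 V: } e^{19.24 \times 0.1} = e^{1.924} \approx 6.85
\]
```

The same expression with 0.5 ns and the CLB @ 1.35 V value gives e^{11.65}, roughly 10^5, in line with the "100 000 times" figure in the first quoted post.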

Reply by lang...@fonz.dk ● July 5, 2012

On Jul 5, 8:04 pm, jmariano <jmarian...@gmail.com> wrote:
> Hi Rick, thanks for your help.
>
> > I'm not real clear on your description of your design, but if you are
> > really generating clocks from the 50 MHz, I recommend that inside the
> > FPGA you instead use a single clock and generate clock enables for the
> > various functions.
>
> Yes, I generate a 5 MHz clock inside the module from the main 50 MHz
> clock by simple division by 10 because I need a 5 MHz ADC clock. I can't
> use a clock enable because the AD9058 ADC does not have an enable input,
> just a clock.
>

you could just have a state machine running at 50 MHz that grabs the data
and sets/clears the clock

which I guess is partly what you have in your divide by 10
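
A minimal sketch of that idea, under stated assumptions: the entity name, port names and the sampling phase are all hypothetical, and the phase at which the ADC data is actually stable has to come from the AD9058 data sheet and the board delays, not from this example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity: every name and the sampling phase are assumptions
-- made for illustration, not taken from the original design.
entity adc_clk_gen is
    port (
        clk      : in  std_logic;                     -- 50 MHz system clock
        adc_data : in  std_logic_vector(7 downto 0);  -- ADC output bus
        adc_clk  : out std_logic;                     -- 5 MHz clock to the ADC pin
        sample   : out std_logic_vector(7 downto 0);  -- captured ADC word
        sample_v : out std_logic                      -- one-cycle strobe: sample updated
    );
end entity adc_clk_gen;

architecture rtl of adc_clk_gen is
    signal phase : integer range 0 to 9 := 0;  -- position within the 10-cycle frame
begin
    process (clk)
    begin
        if rising_edge(clk) then
            sample_v <= '0';

            -- Registered ADC clock: high for five of the ten phases.
            -- Being a flip-flop output, it lags the counter by one cycle.
            if phase < 5 then
                adc_clk <= '1';
            else
                adc_clk <= '0';
            end if;

            -- Grab the data at a phase where it is assumed to be stable.
            if phase = 8 then
                sample   <= adc_data;
                sample_v <= '1';
            end if;

            -- Free-running divide-by-10 counter.
            if phase = 9 then
                phase <= 0;
            else
                phase <= phase + 1;
            end if;
        end if;
    end process;
end architecture rtl;
```

Everything inside the FPGA then runs on the 50 MHz clock and uses sample_v as a clock enable; the 5 MHz signal only leaves the chip as data on a pin.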

-Lasse

Reply by Brian Drummond ● July 6, 2012

On Thu, 05 Jul 2012 11:04:36 -0700, jmariano wrote:

> Hi Rick, thanks for your help.
> 
>> I'm not real clear on your description of your design, but if you are
>> really generating clocks from the 50 MHz, I recommend that inside the
>> FPGA you instead use a single clock and generate clock enables for the
>> various functions.
> 
> Yes, I generate a 5 MHz clock inside the module from the main 50 MHz
> clock by simple division by 10 because I need a 5 MHz adc clock. I can't
> use clock enable because the AD9058 adc does not have an enable input,
> just clock.

That's OK.

But you need to register the AD9058 outputs, inside the FPGA, to your 
internal 50MHz clock. I would also register the S input and the U,V 
outputs from the switch. (In fact I would make the switch a synch process 
with only "clk" in its sensitivity list - it will effectively register 
the switch outputs for you)

All these can be combined into a single synchronous process.

	-- assuming u, v, r, i, adcnn are all signed, and adc_enable is std_logic
	process(clk)
	begin
	    if rising_edge(clk) then
	        -- First pipe stage: synchronise the inputs
	        if adc_enable = '1' then    -- 5 MHz, when ADC is stable
	            adc0_int <= adc0;
	            ...
	        end if;
	        -- Second pipe stage: add the (synchronised) inputs
	        u <= adc0_int - adc90_int;
	        v <= ...
	        -- I assume "s" has to be synchronised to "adcnn",
	        -- so pipeline it to the same depth (this also syncs it)
	        s_int  <= s;
	        s_int2 <= s_int;
	        -- Third pipe stage: the switch
	        case s_int2 is
	            when "00" =>
	                r <= u;
	                i <= v;
	            when "01" =>
	                ...
	        end case;
	        -- etc
	    end if;
	end process;

Addition at 50MHz in a Spartan-3 should be no problem. 

As your sample rate is 1/10 of the clock rate, I would expect you can 
afford a few cycles for internal processing. (If this is not the case you 
need to think carefully about how you pipeline the design)

>> I don't see anything in your original post about simulation.  
> Sorry about that, I did, in fact, simulate each module and the top
> entity. The behavioral simulation gives the expected results; the
> post-place-and-route simulation gives some errors that I could not understand,

Excellent. 
Before changing the design, I would sim with low level zero-crossing 
signals, and see which inputs (s, ADCs) and internal signals (U,V, R,I) 
are unstable whenever the large unexpected outputs are occurring.

Then what you need to do to fix will be clear.

You can also install multiple versions in the testbench, asserting their 
outputs are the same, and reporting any difference.
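
One way to do that comparison is a small simulation-only checker instantiated in the testbench. This is a sketch under assumptions: the entity, its ports and the `valid` qualifier are hypothetical names, and if the two versions have different pipeline depths one output has to be delayed to match before it is compared.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical simulation-only checker; not intended for synthesis.
entity compare_outputs is
    generic (WIDTH : positive := 16);  -- keep at or below 32 so to_integer fits
    port (
        clk   : in std_logic;
        valid : in std_logic;                    -- compare only when outputs are meaningful
        a     : in signed(WIDTH - 1 downto 0);   -- output of version 1
        b     : in signed(WIDTH - 1 downto 0)    -- output of version 2
    );
end entity compare_outputs;

architecture sim of compare_outputs is
begin
    process (clk)
    begin
        if rising_edge(clk) then
            if valid = '1' then
                -- Report both values whenever the two versions disagree.
                assert a = b
                    report "output mismatch: a = " & integer'image(to_integer(a)) &
                           ", b = " & integer'image(to_integer(b))
                    severity error;
            end if;
        end if;
    end process;
end architecture sim;
```

Because numeric_std comparisons return false when an operand contains 'X' or 'U', the assertion should also fire when one version goes undefined in a post-route simulation, which is the symptom described earlier in the thread.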

- Brian

Reply by Hal Murray ● July 6, 2012

In article <nZ-dnch1rrNvHG_SnZ2dnUVZ_qSdnZ2d@web-ster.com>,
 Tim Wescott <tim@seemywebsite.please> writes:

>Paranoid logic designers will have a string of two or three registers to 
>avoid metastability, but I've been told that's not necessary.  (I'm not 
>much of a logic designer).

Ahh, but are they paranoid enough?

The key is settling time.

In the old days of TTL chips, a pair of FFs (with no logic in between)
got you settling time of as much logic as the worst case delay for
the rest of the system.  In practice, that was enough.

With FPGAs, routing is important.  A pair of FFs close together
is probably good enough.  If you put them on opposite sides of a big
chip, the routing delays may match the long path of the logic delays
and eat up all of your slack time.
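
For concreteness, the "pair of FFs" with no logic in between looks like the sketch below. The entity and signal names are hypothetical. Nothing in the code itself keeps the two registers physically close; that has to be checked in the timing report for the path between them, or enforced with whatever placement or timing constraints the tool flow provides.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical two-flip-flop synchronizer for a single asynchronous bit.
entity sync2 is
    port (
        clk      : in  std_logic;  -- destination clock
        async_in : in  std_logic;  -- input asynchronous to clk
        sync_out : out std_logic   -- version of the input safe to use in the clk domain
    );
end entity sync2;

architecture rtl of sync2 is
    signal meta   : std_logic := '0';  -- first stage, may go metastable
    signal stable : std_logic := '0';  -- second stage, sampled after the settling time
begin
    process (clk)
    begin
        if rising_edge(clk) then
            meta   <= async_in;
            stable <= meta;  -- no logic between the stages, so the settling time is
                             -- the clock period minus clock-to-out, routing and set-up
        end if;
    end process;

    sync_out <= stable;
end architecture rtl;
```

Only `stable` should fan out to the rest of the design; using `meta` anywhere else reintroduces the multiple-destination problem.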

Have any FPGA vendors published recent metastability info?
(Many thanks to Peter Alfke for all his good work in this area.)

I'm not a silicon wizard.  Is it reasonable to simulate this stuff?
I'd like to know worst case rather than typicals.  It should be possible
to do something like verify simulations with lab typicals and then
use simulations to find the numbers for the nasty corners.

-- 
These are my opinions.  I hate spam.

Reply by rickman ● July 8, 2012

On Jul 6, 10:00 pm, hal-use...@ip-64-139-1-69.sjc.megapath.net (Hal
Murray) wrote:
> In article <nZ-dnch1rrNvHG_SnZ2dnUVZ_qSdn...@web-ster.com>,
>  Tim Wescott <t...@seemywebsite.please> writes:
>
> >Paranoid logic designers will have a string of two or three registers to
> >avoid metastability, but I've been told that's not necessary.  (I'm not
> >much of a logic designer).
>
> Ahh, but are they paranoid enough?
>
> The key is settling time.
>
> In the old days of TTL chips, a pair of FFs (with no logic in between)
> got you settling time of as much logic as the worst case delay for
> the rest of the system.  In practice, that was enough.
>
> With FPGAs, routing is important.  A pair of FFs close together
> is probably good enough.  If you put them on opposite sides of a big
> chip, the routing delays may match the long path of the logic delays
> and eat up all of your slack time.
>
> Have any FPGA vendors published recent metastability info?
> (Many thanks to Peter Alfke for all his good work in this area.)
>
> I'm not a silicon wizard.  Is it reasonable to simulate this stuff?
> I'd like to know worst case rather than typicals.  It should be possible
> to do something like verify simulations with lab typicals and then
> use simulations to find the numbers for the nasty corners.

I'm not sure what you would want to simulate.  Metastability is
probabilistic.  For a given length of settling time there is
some probability of it happening.  Increasing the settling time
reduces the probability, but it will never be zero, meaning there is no
maximum length of time it takes for the output of a metastable FF to
settle.

Is that what you are asking?

Rick

Reply by glen herrmannsfeldt ● July 9, 2012

Ed McGettigan <ed.mcgettigan@xilinx.com> wrote:
(snip)
>> >> But it does not work in this way, it behaves in a strange manner...

>> >> Some times I get the expected results but often I get strange values
>> >> (large when they should be small, often negative instead of positive,
>> >> etc.). If I look at the binary representation of the output, it looks
>> >> as if the output didn't have time to sum and propagate to the output
>> >> again. In fact, the post place and route simulation shows that when the
>> >> enb signal goes to 0, the output stays in an undetermined condition (you
>> >> know, red line with XXXX).

(snip)
> It isn't just the paranoid logic designer, it should be every 
> logic designer.  

> A single register only partially solves the problem of an 
> asynchronous input with multiple register destinations, but it 
> does not solve the very real metastability problem.  
> At least two registers should be used to ensure that the 
> metastability condition has resolved and with increasing 
> clock frequency and finer process nodes using three or more 
> stages may be necessary.

Metastability can be a problem, but often the problem is clocking
multiple FFs off the same clock edge, with different delays on 
either the clock or data. (The chance of the delays being exactly
equal is close to zero.) The two effects are different.

Note, for example, the common FIFO implementation using a
gray code counter (or binary to gray code converter). 
That avoids the clock edge problem, as either value will
work correctly. 
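
The reason either value works is that consecutive Gray codes differ in exactly one bit, so a sample taken during a transition yields either the old or the new count, never a mixture. A minimal sketch of the binary-to-Gray conversion follows; the package and function names are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical helper package for a Gray-coded FIFO pointer.
package gray_pkg is
    function bin_to_gray (b : unsigned) return unsigned;
end package gray_pkg;

package body gray_pkg is
    -- Each Gray bit is the XOR of a binary bit and its more significant
    -- neighbour; the top bit passes through unchanged.
    function bin_to_gray (b : unsigned) return unsigned is
    begin
        return b xor shift_right(b, 1);
    end function bin_to_gray;
end package body gray_pkg;
```

In a typical asynchronous FIFO the converted pointer is registered in its own clock domain first, and only that registered Gray value is passed through synchronizing registers in the other domain.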

Metastability is a different problem, but one that also occurs
when using asynchronous input values.

-- glen

Reply by glen herrmannsfeldt ● July 9, 2012

rickman <gnuarm@gmail.com> wrote:
(snip)

>> > Paranoid logic designers will have a string of two or three registers to
>> > avoid metastability, but I've been told that's not necessary.  (I'm not
>> > much of a logic designer).

(snip)
> Hi Ed.  The way it was explained to me, I believe by Peter Alfke,
> is that what really resolves metastability is the slack time in a
> register to register path.  Over the years FPGA process has resulted
> in FFs which only need a couple of ns to resolve metastability to 1 in
> a million operation years or something like that (I don't remember the
> metric, but it was good enough for anything I do).  It doesn't matter
> that you have logic in that path, you just need those few ns in every
> part of the path.  In theory, even if you use multiple registers with
> no logic, what really matters is the slack time in the path and that
> is not guaranteed even with no logic.  So the design protocol should
> be to ensure that the paths from the input register to all subsequent
> registers have sufficient slack time.

I suppose that is true, but really it shouldn't be a problem.
It is usual for many systems to clock as fast as you can,
consistent with the critical path delay. As metastability
is exponential, even a slightly shorter delay is usually enough
to make enough difference in the exponent.

That assumes that there is a FF to FF path that is faster than
the FF logic FF path. I believe that is usual for FPGAs, but
if you manage to get a critical path with only one LUT, then
I am not so sure. But that is pretty hard in most real systems.

> Do you remember how much time that needs to be?  I want to say 2 ns,
> but it might be more like 5 ns, I just can't recall.  Of course it
> depends on your clock rates, but I believe Peter picked some more
> aggressive speeds like 100 MHz for his example.

I would expect most systems to have at least a 10% margin.
That is, the clock period is at least 10% longer than the
critical path delay. Probably closer to 20%, but maybe 10%.
So, with a 10ns clock there might be only 1ns slack.
Assuming some delay, say 1ns minimum from FF to FF, that
has nine times the slack, and that is in an exponent.
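
Put as a ratio, and assuming the usual exponential dependence of MTBF on slack, the direct FF-to-FF path in this example compares to the critical path as follows. The value of tau is not given in this post, so the 50 ps used in the second line is purely an illustrative assumption.

```latex
\[
\frac{\mathrm{MTBF}_{\text{FF-to-FF}}}{\mathrm{MTBF}_{\text{critical}}}
= \frac{e^{\,9\,\mathrm{ns}/\tau}}{e^{\,1\,\mathrm{ns}/\tau}}
= e^{\,8\,\mathrm{ns}/\tau}
\]
\[
\tau = 50\ \mathrm{ps} \;\Rightarrow\; e^{160} \sim 10^{69}
\]
```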

-- glen

Reply by glen herrmannsfeldt ● July 9, 2012

jmariano <jmariano65@gmail.com> wrote:

(snip)
> Thank you very much for your input and sorry for the late reply.
> It is really great to be able to get the opinion of such experts,
> especially since, at my current location and in a radius of some 200
> km, I must be the only person working with FPGA and VHDL! I'm also
> glad that the discussion has evolved to levels of complexity far beyond
> my knowledge.

> I was hoping that by now I would be able to say that the thing was
> working as expected but, unfortunately, no.

(snip)

> Here's the full story: I'm implementing a gated integrator, as a part
> of a boxcar averager.  This is the standard noise reduction technique
> used in nuclear magnetic resonance (nmr). This is research, not a
> commercial product! The module gets its data from four 8-bit ADCs at 5
> MHz (adc0, adc90, adc180, adc270) and accumulates while enb=1. enb is
> generated in a different module. The module does this:

(big snip)

I believe that most FPGA families have FFs with clock enable.

Be sure that you are writing your logic in such a way that
the tools figure that out. In most cases, I believe that means
not writing it as a gated clock. Write it as FF's with enable.

(I know how to write it in verilog but not VHDL.)
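
For readers who want the VHDL form, a minimal sketch of an accumulator written with a clock enable rather than a gated clock is given here. The entity name, port names and bit widths are assumptions for illustration and are not taken from the original module.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical gated accumulator: one clock, everything else is an enable.
entity gated_acc is
    port (
        clk : in  std_logic;            -- single system clock
        clr : in  std_logic;            -- synchronous clear
        enb : in  std_logic;            -- integration gate
        ce  : in  std_logic;            -- one-cycle strobe per new sample
        din : in  signed(8 downto 0);   -- sample to accumulate
        acc : out signed(23 downto 0)   -- running sum
    );
end entity gated_acc;

architecture rtl of gated_acc is
    signal acc_r : signed(23 downto 0) := (others => '0');
begin
    process (clk)
    begin
        if rising_edge(clk) then
            if clr = '1' then
                acc_r <= (others => '0');
            elsif enb = '1' and ce = '1' then
                -- Sign-extend the sample to the accumulator width and add it.
                acc_r <= acc_r + resize(din, acc_r'length);
            end if;
            -- No else branch: the register holds its value, which lets the
            -- tools map the condition onto the flip-flop's clock-enable pin.
        end if;
    end process;

    acc <= acc_r;
end architecture rtl;
```

The key point is that `clk` is the only signal that appears in `rising_edge`; the gate and the sample strobe appear only in the `if` conditions.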

-- glen

Reply by glen herrmannsfeldt ● July 9, 2012

Hal Murray <hal-usenet@ip-64-139-1-69.sjc.megapath.net> wrote:

(snip)
> With FPGAs, routing is important.  A pair of FFs close together
> is probably good enough.  If you put them on opposite sides of a big
> chip, the routing delays may match the long path of the logic delays
> and eat up all of your slack time.

That is a good question. I usually assume that they won't have
a long route, but that might not be a good assumption.

Some time ago, I was working on a small design in a very large FPGA.
Expanding to fill the available space, things were very far apart.
(And, as I had so much space, I put three FFs in to synchronize,
but with long enough routes even that could fail.)

> Have any FPGA vendors published recent metastability info?
> (Many thanks to Peter Alfke for all his good work in this area.)

> I'm not a silicon wizard.  Is it reasonable to simulate this stuff?
> I'd like to know worst case rather than typicals.  It should be possible
> to do something like verify simulations with lab typicals and then
> use simulations to find the numbers for the nasty corners.

As I noted previously, though, often the problem isn't metastability
but multiple FFs on the same asynchronous clock. Different problem.

-- glen
