comp.arch.fpga | how to speed up my accumulator ??| page 4

Reply by Jim Granville ● December 7, 2004

Allan Herriman wrote:
> On Tue, 07 Dec 2004 15:34:36 +1300, Jim Granville
> <no.spam@designtools.co.nz> wrote:
> 
> 
>>John_H wrote:
>>
>>>It's involved...
>>>With your example of 211kHz from a 1MHz reference, the ratio of
>>>906,238,099/2^32 has closest fractions in order of worst to best of
>>>
>>>1/5
>>>4/19
>>>23/109
>>>211/1000
>>>1987386/9418891
>>>19873649/94187910
>>>57633561/273144839
>>>
>>>The offsets are the ideal frequency compared to the ratio frequency:
>>>211 kHz - 200 kHz
>>>211 kHz - 210.52632 kHz
>>>211 kHz - 211.00917 kHz
>>>211 kHz - 211 kHz... Here Excel starts to lose digits:
>>>
>>>  The difference between 906,238,099/2^32 and 906,238,099.456/2^32 is about
>>>5.03e-10 at which point small amounts of jitter are lost.  If the jitter at
>>>that tiny offset is large, you will experience phase jumps when that beat
>>>frequency is felt.  There's no way to filter those with analog filters.
>>>
>>>Your largest observed peaks in the spectrum will be at offsets of 11 kHz,
>>>474 Hz, and 9.17 Hz.  You should be able to see the 474 Hz modulating the
>>>11 kHz for spikes much smaller than the 11 kHz peak.
>>
>>Good example maths, but is the principle right ?
>>
>>
>>For the example of 211KHz from 1Mhz, you have 1us quantize, and so will 
>>be able to generate 4us, or 5us periods, giving 250KHz and 200KHz.
>>
>>Over many cycles, the 'wobbling' between these two will average to 
>>211KHz. The more cycles, the better the match to 211KHz.
>>
>>Over a 6 cycle snapshot, you might see 5@200, 1@250, and Favge 208.33Khz
>>That's appx one part in 77 too slow.
>>This 6 cycle frame has a freq of 34.6KHz
>>
>>Next frame group would be (eg)
>>every 79 cycles, to see => 14 @ 250KHz, 65@ 200KHz => 210.76923Khz, 
>>Error is now one part in 1000, and this finer frame is 2.65KHz
>>( etc ) as over wider frame snap-shots, the average frequency gets
>>closer to the 211KHz ideal.
>>
>>So I'd expect to see, on a spectrum analyser, 200KHz, (Dominant) 250KHz
>>and 34.6KHz and 2.65KHz (etc) energies.
> 
> 
> Did you actually plug it in to a spectrum analyser and see those
> tones?

  No, it was just 'back of an envelope' stuff, to get a feel for what
repetition frames are likely, and so what the likely energies are.

> 
> Ten highest spurious tones:
> 
>  55.000kHz -11.4dBc
> 367.000kHz -16.7dBc
> 101.000kHz -17.0dBc
> 165.000kHz -22.4dBc
>   9.000kHz -22.9dBc
> 257.000kHz -24.1dBc
> 321.000kHz -25.1dBc
> 147.000kHz -25.8dBc
> 119.000kHz -27.4dBc
>  37.000kHz -27.7dBc

  Are these rounded to the nearest kHz?  I can't derive 55.00 kHz
either... a 19-cycle frame @ 1 MHz would be 52.63 kHz.
  It also seems strange not to see 200 kHz and 250 kHz...

-jg
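
On the integer-kHz question: if the spectrum was computed for an exact 211/1000 ratio (an assumption; the thread does not say how the list was produced), the output pattern repeats every 1000 reference clocks, i.e. every 1 ms, so every spectral line must sit on a 1 kHz grid with no rounding involved. A minimal Python sketch of one possible reading follows: each listed tone coincides with a low odd harmonic of 211 kHz folded back by the 1 MHz sample rate. It checks frequencies only and does not model amplitudes; the function and variable names are made up for illustration.

```python
F_OUT_KHZ = 211      # wanted output frequency
F_CLK_KHZ = 1000     # reference (sample) clock

def aliased_khz(harmonic):
    """Frequency where a harmonic of F_OUT lands after folding into 0..F_CLK/2."""
    f = (harmonic * F_OUT_KHZ) % F_CLK_KHZ
    return min(f, F_CLK_KHZ - f)

# The ten tones from the list in the thread (kHz), in the order posted.
listed_tones = [55, 367, 101, 165, 9, 257, 321, 147, 119, 37]

# Map each aliased frequency to the lowest odd harmonic that produces it.
# The search limit of 99 is arbitrary.
first_odd_harmonic = {}
for k in range(3, 100, 2):
    first_odd_harmonic.setdefault(aliased_khz(k), k)

for tone in listed_tones:
    print(f"{tone:3d} kHz <- harmonic {first_odd_harmonic[tone]}")
```

Under the same assumption, 200 kHz and 250 kHz are even multiples of 1 kHz and would correspond to even harmonics, which may be why they do not show up among the strongest tones.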

Reply by rickman ● December 7, 2004

Moti wrote:
> 
> Hi Falk,
> My german is pretty "rusty" :)  so if the document is in .pdf format it
> will very hard...
> but if it's in a html format it can translated by google and then it
> will be possible to read it!
> Regards, Moti.

You should be able to copy and paste the text from a PDF into a web page
for translation.  But my experience has been that web page translations
give you English that is not much easier to understand than the language
you are translating from.  

-- 

Rick "rickman" Collins

rick.collins@XYarius.com
Ignore the reply address. To email me use the above address with the XY
removed.

Arius - A Signal Processing Solutions Company
Specializing in DSP and FPGA design      URL http://www.arius.com
4 King Ave                               301-682-7772 Voice
Frederick, MD 21701-3110                 301-682-7666 FAX

Reply by Hal Murray ● December 7, 2004

>>>every 79 cycles, to see => 14 @ 250KHz, 65@ 200KHz => 210.76923Khz, 

>  It also seems strange to not see 200KHz, 250KHz... ?

We started with a 1 MHz clock.  Right.  The above recipe repeats after
14*4 + 65*5 cycles.  That's a total of 381 us, or 2.624671 kHz.

How do I get 200 KHz or 250 KHz from that?  What harmonic?

200 / 2.624671 => 76.200026
250 / 2.624671 => 95.250033

Those aren't close enough to integers for rounding to
explain the differences.  (I might have fatfingered something.)

-- 
The suespammers.org mail server is located in California.  So are all my
other mailboxes.  Please do not send unsolicited bulk e-mail or unsolicited
commercial e-mail to my suespammers.org address or any of my other addresses.
These are my opinions, not necessarily my employer's.  I hate spam.
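
The 381 us figure can be cross-checked with a small simulation. This is a sketch under stated assumptions: it uses an exact 211/1000 ratio (modulus 1000 instead of the 32-bit accumulator discussed in the thread) and treats every accumulator overflow as one output edge. The helper name `overflow_periods` is hypothetical.

```python
from collections import Counter

def overflow_periods(increment, modulus, n_clocks):
    """Return the clock counts between successive accumulator overflows."""
    acc = 0
    last = None
    periods = []
    for clk in range(n_clocks):
        acc += increment
        if acc >= modulus:          # overflow = one output edge
            acc -= modulus
            if last is not None:
                periods.append(clk - last)
            last = clk
    return periods

# One full repeat of the pattern: 211 output periods.
periods = overflow_periods(211, 1000, 2000)[:211]
counts = Counter(periods)
print(dict(counts), "total clocks:", sum(periods))

# The 79-cycle frame from the thread: 14 periods of 4 us plus 65 periods of 5 us.
frame_us = 14 * 4 + 65 * 5
frame_rate_khz = 1000 / frame_us
mean_freq_khz = 79 * 1000 / frame_us
print(frame_us, round(frame_rate_khz, 4), round(mean_freq_khz, 3))
```

In this model the pattern only repeats exactly after 211 output cycles (156 periods of 5 us and 55 of 4 us, 1000 us in total), which averages to exactly 211 kHz. The 79-cycle frame is an approximation to that pattern, and its time-weighted mean comes out near 207.35 kHz.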

Reply by Falk Brunner ● December 7, 2004

"John_H" <johnhandwork@mail.com> schrieb im Newsbeitrag
news:yB5td.10$a61.633@news-west.eli.net...

>   The difference between 906,238,099/2^32 and 906,238,099.456/2^32 is about
> 5.03e-10 at which point small amounts of jitter are lost.  If the jitter at
> that tiny offset is large, you will experience phase jumps when that beat
> frequency is felt.  There's no way to filter those with analog filters.

I guess the trick is noise shaping. Adding a (pseudo)random phase error to
distribute the jitter energy over a wider band and also move it to higher
frequencies. Sigma-Delta Style.

Regards
Falk

Reply by John_H ● December 7, 2004

"Falk Brunner" <Falk.Brunner@gmx.de> wrote in message
news:31mt3nF3d9asiU1@individual.net...
>
> "John_H" <johnhandwork@mail.com> schrieb im Newsbeitrag
> news:yB5td.10$a61.633@news-west.eli.net...
>
> >   The difference between 906,238,099/2^32 and 906,238,099.456/2^32 is about
> > 5.03e-10 at which point small amounts of jitter are lost.  If the jitter at
> > that tiny offset is large, you will experience phase jumps when that beat
> > frequency is felt.  There's no way to filter those with analog filters.
>
> I guess the trick is noise shaping. Adding a (pseudo)random phase error to
> distribute the jitter energy over a wider band and also move it to higher
> frequencies. Sigma-Delta Style.
>
> Regards
> Falk

Noise shaping is the right way to go for a superb quality synthesizer, but
the correction phase error - the output from the noise shaper - needs to be
applied based on the synchronous edge position relative to the "ideal" edge
position - the input to the noise shaper.  (Pseudo)Random doesn't do it.

All this assumes, of course, that there's an analog PLL driven by the single
bit, noise-shaped NCO output.  Without the PLL to filter out the high
frequency phase noise of a Sigma-Delta style NCO, the jitter is still around
1 reference clock period peak-to-peak, maybe worse.

(NCOs are used by many folks in the comp.arch.fpga newsgroup who have no
reason to visit comp.dsp.)
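
The "around 1 reference clock period peak-to-peak" figure can be illustrated for a plain (not noise-shaped) accumulator NCO. The following is a minimal sketch, assuming an exact 211/1000 ratio and taking each accumulator overflow as an output edge. It measures how late each synchronous edge is relative to the ideal edge position. The function name `edge_errors` is hypothetical.

```python
from fractions import Fraction

def edge_errors(increment, modulus, n_edges):
    """Lateness of each output edge, in reference clocks, versus the ideal edge."""
    acc = 0
    clk = 0
    errors = []
    while len(errors) < n_edges:
        clk += 1
        acc += increment
        if acc >= modulus:                              # synchronous output edge
            acc -= modulus
            k = len(errors) + 1                         # edge number
            ideal = Fraction(k * modulus, increment)    # ideal edge time in clocks
            errors.append(clk - ideal)
    return errors

errors = edge_errors(211, 1000, 211)    # one full repeat of the pattern
peak_to_peak = max(errors) - min(errors)
print(peak_to_peak, float(peak_to_peak))
```

In this model every edge is between 0 and 1 clock late, and over one full pattern the spread is 210/211 of a reference clock. That spread is the quantity a following PLL would have to filter.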

Reply by Falk Brunner ● December 8, 2004

"John_H" <johnhandwork@mail.com> schrieb im Newsbeitrag
news:S9std.14$a61.1075@news-west.eli.net...

> All this assumes, of course, that there's an analog PLL driven by the single
> bit, noise-shaped NCO output.  Without the PLL to filter out the high
> frequency phase noise of a Sigma-Delta style NCO, the jitter is still around
> 1 reference clock period peak-to-peak, maybe worse.

Yes.

> (NCOs are used by many folks in the comp.arch.fpga newsgroup who have no
> reason to visit comp.dsp.)

????
Don't get it.

Regards
Falk

Reply by rickman ● December 8, 2004

John_H wrote:
> 
> Noise shaping is the right way to go for a superb quality synthesizer, but
> the correction phase error - the output from the noise shaper - needs to be
> applied based on the synchronous edge position relative to the "ideal" edge
> position - the input to the noise shaper.  (Pseudo)Random doesn't do it.
> 
> All this assumes, of course, that there's an analog PLL driven by the single
> bit, noise-shaped NCO output.  Without the PLL to filter out the high
> frequency phase noise of a Sigma-Delta style NCO, the jitter is still around
> 1 reference clock period peak-to-peak, maybe worse.

That answers a question I have had for a long time.  It occurred to me a
long time ago to use an analog PLL to smooth out the ragged edges in an
NCO clock.  But no one I spoke to about it could say if it would work. 
I always figured that the low pass filter would do the smoothing for
me.  

I should never have doubted myself.  ;)


Reply by Allan Herriman ● December 8, 2004

On Wed, 08 Dec 2004 18:04:31 -0500, rickman <spamgoeshere4@yahoo.com>
wrote:

>John_H wrote:
>> 
>> Noise shaping is the right way to go for a superb quality synthesizer, but
>> the correction phase error - the output from the noise shaper - needs to be
>> applied based on the synchronous edge position relative to the "ideal" edge
>> position - the input to the noise shaper.  (Pseudo)Random doesn't do it.
>> 
>> All this assumes, of course, that there's an analog PLL driven by the single
>> bit, noise-shaped NCO output.  Without the PLL to filter out the high
>> frequency phase noise of a Sigma-Delta style NCO, the jitter is still around
>> 1 reference clock period peak-to-peak, maybe worse.
>
>That answers a question I have had for a long time.  It occurred to me a
>long time ago to use an analog PLL to smooth out the ragged edges in an
>NCO clock.  But no one I spoke to about it could say if it would work. 

I must be a 'no one'.

Rick, we have discussed this before, e.g. in this thread:
http://groups-beta.google.com/group/comp.arch.embedded/browse_frm/thread/7e0ec68b5c53e4

This is something I've done in real designs.  I've also developed
tools for estimating the output jitter of the NCO, taking the loop
bandwidth (and order) of the PLL into account.
It is possible to achieve very low levels of jitter at the PLL output,
if the frequencies are carefully chosen such that the higher level
spurious signals at the output of the NCO are well outside the PLL
loop bandwidth.

>I always figured that the low pass filter would do the smoothing for
>me.  

Exactly.  Although this does require the phase detector to be linear
(otherwise the jitter signals will be demodulated).  Common phase
detector types (e.g. most digital phase detectors driving charge
pumps) aren't particularly linear due to inexact balance between the
pull-up and pull-down current sources.  A figure of 10% is sometimes
quoted.

Regards,
Allan

Reply by rickman ● December 9, 2004

Allan Herriman wrote:
> 
> On Wed, 08 Dec 2004 18:04:31 -0500, rickman <spamgoeshere4@yahoo.com>
> wrote:
> 
> I must be a 'no one'.

Well, I wouldn't go *that* far..  :)  

> Rick, we have discussed this before, e.g. in this thread:
> http://groups-beta.google.com/group/comp.arch.embedded/browse_frm/thread/7e0ec68b5c53e4

You have a *much* better memory than I do.  I think I had looked into
this, but my idea was rejected by higher-ups in favor of a specialized
chip that actually used the top N bits of the accumulator to drive a
DAC.  This sine wave was then filtered and fed back to the chip for
clipping via a comparator.

> This is something I've done in real designs.  I've also developed
> tools for estimating the output jitter of the NCO, taking the loop
> bandwidth (and order) of the PLL into account.
> It is possible to achieve very low levels of jitter at the PLL output,
> if the frequencies are carefully chosen such that the higher level
> spurious signals at the output of the NCO are well outside the PLL
> loop bandwidth.

I looked at the posts that you refer to.  That post has some defunct
links to other posts or web pages.  Heck, a couple of them point to
AltaVista, which doesn't even redirect you to whoever bought them.  Things
change fast on the internet.

> >I always figured that the low pass filter would do the smoothing for
> >me.
> 
> Exactly.  Although this does require the phase detector to be linear
> (otherwise the jitter signals will be demodulated).  Common phase
> detector types (e.g. most digital phase detectors driving charge
> pumps) aren't particularly linear due to inexact balance between the
> pull-up and pull-down current sources.  A figure of 10% is sometimes
> quoted.

What phase detectors *are* linear?  


Reply by Ray Andraka ● December 9, 2004

Moti,

There are a couple things you can do.  First off, if you look closely at
an accumulator, the feedback from a particular bit only affects that bit
and bits with greater significance.  That suggests that you can perform
partial sums and then combine the results.  One simple trick that takes 2x
the resources of the straight accumulator is to break your 32 bit
accumulator into two 16 bit accumulators.  The carry out of the first gets
registered and fed into the carry in of the second.  Note that by doing
that, the second follows the first by a clock cycle, so you need to delay
the upper half of the addend (the new value getting added, not the
feedback value) by a clock cycle using a register so that it arrives at
the upper half of the accumulator at the same time as the registered carry
from the lower half.  Likewise, the lower half sum output (but not the
feedback) has to be delayed by a clock cycle to align it with the upper
half sum.  On the surface, that would seem to permit almost double the
clock speed (and it did in older Xilinx devices), however in Virtex
devices the propagation time to get on and off the carry chain is an order
of magnitude larger than the bit to bit propagation times, so in reality
the gain from this trick is rather small until you get into truly huge
accumulator widths.

A more usable trick requires a little more attention to the design
implementation.  The carry chains are typically the critical path (mostly
because the times to get on and off the chain are on par with the LUT
delay).  You can't do much of anything about the delay in the carry chain or
the intrinsic delay for getting on and off the chain.  You can, however,
minimize the delays on the signal connecting to the carry chain input.
This means making sure that you only have one level of logic (i.e.,
flip-flops are directly driving the LUTs that feed the carry chain), and
you need to make sure those flip-flops are placed in close proximity to
the carry chain (ideally either in the same CLB, or on an adjacent CLB so
you can use the direct connect wires).  Note that the automatic placement
is not particularly good at making sure those flip-flops are placed this
way.  The accumulator feedback doesn't need to be pipelined because it is
connecting back around to the same bit (assuming you've reduced the logic
to 1 level), which means it is already pipelined as much as it could be.
You may need to pipeline the new addend path in order to achieve the one
level of logic at the accumulator and keep the driving flip-flops in
adjacent slices, but that is OK as it doesn't affect the accumulator
operation.

You normally should use active-high resets because that is what is native
to the FPGA.  In this case, I don't think it is affecting your timing,
however, because it is an asynchronous reset.  Had it been a synchronous
reset, some synthesizers would have inserted a gate between the carry
chain and the register instead of inverting the resetn signal, which would
have added an extra LUT delay to the input path.

Hope this helps

Moti Cohen wrote:

> Hello all,
> I've a design that contains an NCO (Numerically controlled oscillator).
> The NCO consists of a 32-bit accumulator. When I write the accumulator
> straightforwardly, like this -
>
> process (clk,resetn)
> begin
>         if resetn = '0' then
>                 accumulator     <= (others =>'0');
>         elsif clk'event and clk ='1' then
>                 accumulator     <= accumulator + inc_value;
>         end if;
> end process;
> Fout <= accumulator (accumulator'high);
>
> the maximum frequency I can achieve for 'clk' is ~ 150 MHz (Spartan-3).
> I need it to work at ~200 MHz, so I figured that some pipelining is
> needed, but I don't know how to do it because of the accumulator
> feedback. Maybe someone here can explain it to me or even give me a
> code example (which would be great).
>
> Thanks in advance, Moti.

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930     Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

 "They that give up essential liberty to obtain a little
  temporary safety deserve neither liberty nor safety."
                                          -Benjamin Franklin, 1759
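
The split-accumulator trick described at the top of this post (two 16-bit halves, a registered carry, a one-clock delay on the upper half of the addend and on the lower half of the sum) can be checked with a behavioural model before writing any HDL. The following is a Python sketch, not synthesizable code and not the poster's implementation; all function and register names are made up for illustration. It confirms that the pipelined output equals the straight 32-bit accumulator delayed by one clock.

```python
import random

MASK16 = 0xFFFF
MASK32 = 0xFFFFFFFF

def straight_accumulator(increments):
    """Reference: plain 32-bit accumulator register value after each clock."""
    acc, out = 0, []
    for inc in increments:
        acc = (acc + inc) & MASK32
        out.append(acc)
    return out

def split_accumulator(increments):
    """Two 16-bit halves with a registered carry between them."""
    lo = hi = carry = inc_hi_d = lo_d = 0       # all registers reset to zero
    out = []
    for inc in increments:
        # Combinational sums, computed from the current register values.
        lo_sum = lo + (inc & MASK16)
        hi_sum = hi + inc_hi_d + carry
        # Clock edge: every register updates at once.
        lo_d = lo                               # lower sum output, delayed one clock
        inc_hi_d = inc >> 16                    # upper addend, delayed one clock
        hi = hi_sum & MASK16
        lo, carry = lo_sum & MASK16, lo_sum >> 16
        out.append((hi << 16) | lo_d)
    return out

rng = random.Random(1)                          # fixed seed, repeatable stimulus
stimulus = [906238099] * 50 + [rng.getrandbits(32) for _ in range(1000)]
reference = straight_accumulator(stimulus)
pipelined = split_accumulator(stimulus)
print(pipelined[0] == 0 and pipelined[1:] == reference[:-1])
```

The stimulus deliberately changes the increment every clock in its second part, because a constant increment would not reveal a missing delay on the upper half of the addend.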
