I designed a low-pass IIR filter in a Stratix IV but I have a speed problem. I need to run it at 245 MHz but can only achieve about 180 MHz. I was advised by experts to insert extra registers; this improved the speed, but the filter output went wrong.

I was advised to balance the filter since I had inserted extra registers. But how?

I did some modeling and realized, to my surprise, that it seems impossible to balance any IIR filter (though it can be done with an FIR filter).

Does anybody have any idea about balancing IIR filters? The difficulty is in the feedback terms.

The filter I am using is Yn = (1-alpha)*Xn + alpha*Yn-1

Thanks in advance

---------------------------------------
Posted through http://www.FPGARelated.com
balancing IIR filter (after adding extra registers)
Started by ●January 12, 2012
Reply by ●January 12, 2012
On Thu, 12 Jan 2012 06:49:42 -0600, zak wrote:

> I designed a low-pass IIR filter in a Stratix IV but I have a speed
> problem. I need to run it at 245 MHz but can only achieve about 180 MHz.
> I was advised by experts to insert extra registers; this improved the
> speed, but the filter output went wrong.
>
> I was advised to balance the filter since I had inserted extra registers.
> But how?
>
> I did some modeling and realized, to my surprise, that it seems
> impossible to balance any IIR filter (though it can be done with an FIR
> filter).
>
> Does anybody have any idea about balancing IIR filters? The difficulty is
> in the feedback terms.
>
> The filter I am using is Yn = (1-alpha)*Xn + alpha*Yn-1
>
> Thanks in advance

There have to be books on this...

What's holding up the train? The addition? The multiplication? The logic in between?

I don't do FPGA design anywhere near full time -- does the Stratix IV have hardware multiply? Hardware add? Perhaps even hardware MAC?

If it has a hardware multiply-and-add, then you need to make sure you're using it efficiently.

If all else fails and you just have to put in delays, then all is not lost (presuming that you can stand some delay in the output). You're designing a pretty elementary low-pass filter, so the first thing you can do is just see what happens when you stick some extra delay in there. Let

y_n = a^2 * y_{n-2} + (1 - a^2) * x_{n-1}

This should be easier to realize than your difference equation.
Now perform a z-transform on this (see: http://www.wescottdesign.com/articles/zTransform/z-transforms.html, and please forgive any broken links, &c):

Y(z) = z^-2 * a^2 * Y(z) + z^-1 * (1 - a^2) * X(z)

and solve for the transfer function:

H(z) = Y(z) / X(z) = (1 - a^2) z / (z^2 - a^2) = (1 - a^2) z / ((z - a)(z + a))

If you limit |a| < 1, then H(z) is stable (and an unstable system is the first "wrong" that you might encounter), but while it has a generally low-pass character up to Fs/4 (Fs = sampling frequency), the response rises after that back to unity -- and that's bad.

If you doctor this up a bit with a felicitously placed zero, then you can get

H(z) = Y(z) / X(z) = 0.5 (1 - a^2) (z + 1) / ((z - a)(z + a))

There are a number of ways that you can achieve this, but your result is going to be a filter with unity gain at DC (good), the same general transfer function as your example difference equation (good), except at Fs/2 where the response will be zero (better than yours), and -- hopefully -- the extra delay in the difference equation will be enough to let you pipeline your math enough to realize this thing and get the speed you need.

--
My liberal friends think I'm a conservative kook.
My conservative friends think I'm a liberal kook.
Why am I not happy that they have found common ground?

Tim Wescott, Communications, Control, Circuits & Software
http://www.wescottdesign.com
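The claims about DC, Fs/4 and Fs/2 can be checked numerically. What follows is only a minimal Python sketch: the function names and the value a = 0.9 are assumptions chosen for illustration, and the "original" response is the z-transform of the first-order filter from the opening post. It evaluates |H| on the unit circle for the three transfer functions.

```python
import cmath

def h_original(z, a):
    # y_n = (1 - a) x_n + a y_{n-1}  ->  H(z) = (1 - a) z / (z - a)
    return (1 - a) * z / (z - a)

def h_delayed(z, a):
    # y_n = a^2 y_{n-2} + (1 - a^2) x_{n-1}
    return (1 - a**2) * z / (z**2 - a**2)

def h_doctored(z, a):
    # Same poles, plus a zero at z = -1 and a 0.5 scale for unity DC gain
    return 0.5 * (1 - a**2) * (z + 1) / (z**2 - a**2)

def gain(h, f, a):
    """Magnitude of h at normalized frequency f (cycles/sample; Fs/2 is 0.5)."""
    z = cmath.exp(2j * cmath.pi * f)
    return abs(h(z, a))

A = 0.9  # illustrative pole location, not a value from the thread

if __name__ == "__main__":
    for name, h in (("original", h_original),
                    ("delayed", h_delayed),
                    ("doctored", h_doctored)):
        # Gains at DC, Fs/4 and Fs/2
        print(name, [round(gain(h, f, A), 4) for f in (0.0, 0.25, 0.5)])
```

With these definitions all three have unity gain at DC, the delayed version climbs back to unity at Fs/2, and the doctored version is zero there.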
Reply by ●January 13, 2012
"zak" <kazimayob2@n_o_s_p_a_m.aol.com> wrote in message news:Ne-dndbs46R7S5PSnZ2dnUVZ_s2dnZ2d@giganews.com...

> I designed a low-pass IIR filter in a Stratix IV but I have a speed
> problem. I need to run it at 245 MHz but can only achieve about 180 MHz.
> I was advised by experts to insert extra registers; this improved the
> speed, but the filter output went wrong.
>
> I was advised to balance the filter since I had inserted extra registers.
> But how?
>
> I did some modeling and realized, to my surprise, that it seems
> impossible to balance any IIR filter (though it can be done with an FIR
> filter).
>
> Does anybody have any idea about balancing IIR filters? The difficulty is
> in the feedback terms.
>
> The filter I am using is Yn = (1-alpha)*Xn + alpha*Yn-1

My guess is you need to add some registers on your inputs and outputs (and accept the added latency). I guess this implementation gets put into a DSP core within a tiny area, and the I/Os have to run a long distance to get there or to get out. The feedback has dedicated routing inside a DSP and should be very fast.

Do you know whether this gets implemented in a DSP, or does the tool try to build it with gates?

To really be able to help, I would like to see the source, the timing report, and details about the inputs and outputs of this IIR (for example, are they I/O pins?).
Reply by ●January 13, 2012
"zak" <kazimayob2@n_o_s_p_a_m.aol.com> wrote:

> I designed a low-pass IIR filter in a Stratix IV but I have a speed
> problem. I need to run it at 245 MHz but can only achieve about 180 MHz.
> I was advised by experts to insert extra registers; this improved the
> speed, but the filter output went wrong.
>
> I was advised to balance the filter since I had inserted extra registers.
> But how?
>
> I did some modeling and realized, to my surprise, that it seems
> impossible to balance any IIR filter (though it can be done with an FIR
> filter).
>
> Does anybody have any idea about balancing IIR filters? The difficulty is
> in the feedback terms.
>
> The filter I am using is Yn = (1-alpha)*Xn + alpha*Yn-1

You can't use many registers, and just adding registers will make routing worse, not better. Your filter seems to consist of 2 multipliers and an adder. The first optimisation you can do is using one's complement instead of two's complement. When using one's complement you don't need to sign extend the multiplicands. In Xilinx FPGAs the multipliers get faster when you use fewer bits.

--
Failure does not prove something is impossible, failure simply
indicates you are not using the right tools...
nico@nctdevpuntnl (punt=.)
--------------------------------------------------------------
Reply by ●January 14, 2012
Thanks all for the replies.

My main concern was not the timing per se, as I may eventually get over it, but specifically: can we balance a given IIR filter if we have to add extra registers?

In my simple filter design there is a chain of [subtractor => multiplier => subtractor] without any register in between. Obviously this causes long paths, which need to be broken by registers according to RTL methodology.

I understand Tim is suggesting redesigning the IIR with inherent registers in it. It is an interesting idea, and I managed to verify that the suggested final filter is better than mine, but I believe it will still have some long paths.

Regards
Zak
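The thread does not show the RTL, so the following is an assumption about what the [subtractor => multiplier => subtractor] chain computes: the one-multiplier rearrangement Yn = Xn - alpha*(Xn - Yn-1), which is algebraically the same as (1-alpha)*Xn + alpha*Yn-1. A minimal Python sketch (function names are hypothetical) checking that the two forms agree:

```python
def step_two_multipliers(x, y_prev, alpha):
    # Direct form from the opening post: (1 - alpha) * x + alpha * y_prev
    return (1 - alpha) * x + alpha * y_prev

def step_one_multiplier(x, y_prev, alpha):
    # Assumed structure of the chain described above
    diff = x - y_prev        # first subtractor (uses the fed-back output)
    scaled = alpha * diff    # multiplier
    return x - scaled        # second subtractor

if __name__ == "__main__":
    y_a = y_b = 0.0
    for x in (1.0, 0.5, -2.0, 3.25):
        y_a = step_two_multipliers(x, y_a, 0.875)
        y_b = step_one_multiplier(x, y_b, 0.875)
        print(y_a, y_b)
```

If the design really has this shape, all three operators sit inside the feedback loop, because the first subtractor already consumes Yn-1. That would explain why a register placed anywhere in the chain changes the recursion rather than just adding latency.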
Reply by ●January 15, 2012
On Sat, 14 Jan 2012 15:49:06 -0600, zak wrote:

> Thanks all for the replies.
>
> My main concern was not the timing per se, as I may eventually get over
> it, but specifically: can we balance a given IIR filter if we have to
> add extra registers?
>
> In my simple filter design there is a chain of [subtractor =>
> multiplier => subtractor] without any register in between. Obviously
> this causes long paths, which need to be broken by registers according
> to RTL methodology.
>
> I understand Tim is suggesting redesigning the IIR with inherent
> registers in it. It is an interesting idea, and I managed to verify that
> the suggested final filter is better than mine, but I believe it will
> still have some long paths.

Actually what I was suggesting was a difference equation that you might be able to realize with a structure that has more pipelining, not something that you would attempt to implement directly.

Pipelining is for you to do -- I'm just being the math egghead.

Whether just one clock worth of delay is going to be enough to do all the pipelining you need -- I dunno.

OTOH, the math itself imposes no limit to the amount of delay you can have in the filter -- you can have three, four, or 1000 clocks worth. But each delay you add puts a null in the response and increases the overall delay of the filter; at some point the null will encroach on your desired response and that would be a Bad Thing.

The difference equation is easy:

y_n = d^N * y_{n-N} + (1 - d^N) * (1/N) * sum_{k=0}^{N-1} x_{n-k}

This gives you the transfer function

H(z) = ((1 - d^N)/N) * z * (z^(N-1) + ... + z + 1) / (z^N - d^N)

The denominator is basically the same old difference equation, only with as much delay as you need for pipelining. The numerator describes a CIC filter, which is the Easiest FIR of All.

Presumably, in order to pipeline this effectively you'd have to add an additional N counts of delay -- at some point the filter output is going to be useless to you just because of delay, if nothing else.

--
Tim Wescott
Control system and signal processing consulting
www.wescottdesign.com
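To see the N-delay difference equation above in action, here is a minimal time-domain sketch in Python. The function name, the zero initial conditions and the test values (d = 0.9, N = 2 or 3) are assumptions for illustration only; it is a behavioural model, not a pipelined implementation.

```python
def delayed_lowpass(x, d, n):
    """y_i = d^n * y_{i-n} + (1 - d^n) * mean(x_i, ..., x_{i-n+1}).

    Samples before the start of the input are taken as zero.
    """
    g = d ** n
    y = []
    for i in range(len(x)):
        fb = y[i - n] if i >= n else 0.0                       # feedback, n clocks old
        avg = sum(x[i - k] for k in range(n) if i - k >= 0) / n  # boxcar (CIC-like) part
        y.append(g * fb + (1.0 - g) * avg)
    return y

if __name__ == "__main__":
    # A step settles to 1 (unity DC gain)
    print(delayed_lowpass([1.0] * 60, 0.9, 3)[-1])
    # With n = 2 a tone at Fs/2 dies away (null at Fs/2)
    print(delayed_lowpass([(-1.0) ** i for i in range(60)], 0.9, 2)[-2:])
```

The feedback term only ever needs a value that is n clocks old, which is the slack available for pipeline registers inside the loop.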
Reply by ●January 15, 2012
On Thu, 12 Jan 2012 06:49:42 -0600, zak wrote:

> I designed a low-pass IIR filter in a Stratix IV but I have a speed
> problem. I need to run it at 245 MHz but can only achieve about 180 MHz.
> I was advised by experts to insert extra registers; this improved the
> speed, but the filter output went wrong.
>
> I was advised to balance the filter since I had inserted extra registers.
> But how?
>
> I did some modeling and realized, to my surprise, that it seems
> impossible to balance any IIR filter (though it can be done with an FIR
> filter).
>
> Does anybody have any idea about balancing IIR filters? The difficulty is
> in the feedback terms.
>
> The filter I am using is Yn = (1-alpha)*Xn + alpha*Yn-1

I've got another thought.

What frequency are you filtering _to_? Why are you using an IIR at all? If you are filtering heavily enough you should be able to prefilter with a CIC, decimate, and run your IIR filter (if you still need it) at a lower rate.

Would that meet your requirements?

--
Tim Wescott
Control system and signal processing consulting
www.wescottdesign.com
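As a rough illustration of the prefilter-and-decimate idea, here is a minimal Python sketch of a first-order CIC decimator followed by the first-order IIR running at the reduced rate. The decimation ratio r, the function names and the coefficient values are hypothetical; a real design would pick the CIC order and ratio from the required stopband and would use wrapping integer arithmetic in the integrator.

```python
def cic_decimate(x, r):
    """First-order CIC: integrate at the input rate, keep every r-th
    integrator value, then comb (difference) at the low rate."""
    acc = 0
    prev = 0
    out = []
    for i, sample in enumerate(x):
        acc += sample                     # integrator, full rate
        if (i + 1) % r == 0:              # decimate by r
            out.append((acc - prev) / r)  # comb, scaled for unity DC gain
            prev = acc
    return out

def first_order_iir(x, alpha):
    """Yn = (1 - alpha) * Xn + alpha * Yn-1, starting from zero."""
    y, prev = [], 0.0
    for sample in x:
        prev = (1 - alpha) * sample + alpha * prev
        y.append(prev)
    return y

if __name__ == "__main__":
    fast = [1] * 64                       # input at the full sample rate
    slow = first_order_iir(cic_decimate(fast, 8), 0.5)
    print(len(slow), slow[-1])
```

Only the integrator runs at the full clock rate, and it is a single adder with a register on its output; the IIR feedback loop then has r fast clocks per sample in which to finish.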
Reply by ●January 15, 2012
Tim Wescott <tim@seemywebsite.please> wrote:

(snip)
> Actually what I was suggesting was a difference equation that you might
> be able to realize with a structure that has more pipelining, not
> something that you would attempt to implement directly.

> Pipelining is for you to do -- I'm just being the math egghead.

> Whether just one clock worth of delay is going to be enough to do all the
> pipelining you need -- I dunno.

> OTOH, the math itself imposes no limit to the amount of delay you can
> have in the filter -- you can have three, four, or 1000 clocks worth.
> But each delay you add puts a null in the response and increases the
> overall delay of the filter; at some point the null will encroach on your
> desired response and that would be a Bad Thing.

The way I usually think about this, partly because of the way the ones I work on are used, is that with added pipelining you can run interleaved data streams. This is now popularly known as Simultaneous Multithreading: instead of processing one data stream faster, you process many data streams at about the same speed.

(Once in a while, remind the marketing department of the difference. Too often they quote the faster speed without qualification.)

-- glen
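The interleaving idea can be modelled in a few lines. The sketch below is an assumption-laden Python model, not anyone's actual design: `pipelined_core` stands for a first-order filter whose feedback arrives `latency` clocks late, and three independent streams are fed through it round-robin. All names and values are hypothetical.

```python
def first_order(x, a):
    """Reference: y_n = (1 - a) * x_n + a * y_{n-1}, starting from zero."""
    y, prev = [], 0.0
    for s in x:
        prev = (1 - a) * s + a * prev
        y.append(prev)
    return y

def pipelined_core(x, a, latency):
    """Same arithmetic, but the feedback is `latency` clocks old:
    y_n = (1 - a) * x_n + a * y_{n-latency}."""
    y = []
    for i, s in enumerate(x):
        fb = y[i - latency] if i >= latency else 0.0
        y.append((1 - a) * s + a * fb)
    return y

def interleave(streams):
    # Round-robin: s0[0], s1[0], s2[0], s0[1], ...
    return [s for group in zip(*streams) for s in group]

def deinterleave(samples, count):
    return [samples[i::count] for i in range(count)]

if __name__ == "__main__":
    streams = [[1.0, 2.0, 3.0], [0.5, -0.5, 0.5], [0.0, 0.0, 1.0]]
    merged = pipelined_core(interleave(streams), 0.75, 3)
    for got, src in zip(deinterleave(merged, 3), streams):
        print(got, first_order(src, 0.75))
```

Each stream sees an exact first-order filter, because the sample that is `latency` clocks old in the interleaved sequence is that stream's own previous output. The cost is that each stream runs at one third of the clock rate, which is the distinction glen asks marketing to remember.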
Reply by ●January 15, 2012
Let me explain myself further.

A filter (FIR or IIR) obviously has its own terms (the z terms of the transfer function), which are implemented as registers, as you know (let us call them term registers).

On the other hand, device speed may require its own registers (I call them pipeline registers).

I am not worried about input-to-output delay (let it be tens of clock periods), i.e. I can insert registers at the input and output freely. But inside the filter stages I need to take care to keep the filter transfer function accurate. For an FIR, computations are forward only, and the rule I found is that if I need to delay any FIR term I should delay all its other terms equally. For an IIR filter, there are both forward and feedback computations. I can delay the forward terms equally and the result stays correct up to its end, but I cannot do that for the feedback terms.

Example: suppose y(n) = a*x(n) + b*y(n-1)

Obviously this means current output = a*current input + b*previous output. Suppose I wanted to use a structure that ended up with no register between the result of b*y(n-1) and the adder, so I decided to add a pipeline register. This implies that I am adding b*y(n-2), which can be correct if I add it to a*x(n-1). So I delayed the x input, and this makes the other term at the adder a*x(n-1). But this also means the feedback term now becomes b*y(n-3), naturally.

Am I missing the obvious?

Zak
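The effect described above shows up directly in the impulse response. The Python sketch below is a behavioural model only; the function name and the coefficients a = 0.25, b = 0.75 are assumptions for illustration. It compares y(n) = a*x(n) + b*y(n-1) with the version that has one extra register in the feedback path and a matching delay on the input, y(n) = a*x(n-1) + b*y(n-2).

```python
def impulse_response(feedback_delay, input_delay, a, b, length):
    """h[n] for y(n) = a * x(n - input_delay) + b * y(n - feedback_delay)."""
    x = [1.0] + [0.0] * (length - 1)   # unit impulse
    y = []
    for n in range(length):
        xin = x[n - input_delay] if n >= input_delay else 0.0
        fb = y[n - feedback_delay] if n >= feedback_delay else 0.0
        y.append(a * xin + b * fb)
    return y

if __name__ == "__main__":
    original = impulse_response(1, 0, 0.25, 0.75, 8)
    registered = impulse_response(2, 1, 0.25, 0.75, 8)
    print(original)
    print(registered)
```

The registered version produces the same decaying values, but only on every second sample, with zeros in between. Delaying the input shifts the response in time; it cannot undo the stretch caused by the longer feedback delay, so no amount of forward balancing restores the original transfer function.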
Reply by ●January 15, 2012
On Sun, 15 Jan 2012 07:40:54 -0600, zak wrote:

> Let me explain myself further.
>
> A filter (FIR or IIR) obviously has its own terms (the z terms of the
> transfer function), which are implemented as registers, as you know (let
> us call them term registers).
>
> On the other hand, device speed may require its own registers (I call
> them pipeline registers).
>
> I am not worried about input-to-output delay (let it be tens of clock
> periods), i.e. I can insert registers at the input and output freely.
> But inside the filter stages I need to take care to keep the filter
> transfer function accurate. For an FIR, computations are forward only,
> and the rule I found is that if I need to delay any FIR term I should
> delay all its other terms equally. For an IIR filter, there are both
> forward and feedback computations. I can delay the forward terms equally
> and the result stays correct up to its end, but I cannot do that for the
> feedback terms.
>
> Example: suppose y(n) = a*x(n) + b*y(n-1)
>
> Obviously this means current output = a*current input + b*previous
> output. Suppose I wanted to use a structure that ended up with no
> register between the result of b*y(n-1) and the adder, so I decided to
> add a pipeline register. This implies that I am adding b*y(n-2), which
> can be correct if I add it to a*x(n-1). So I delayed the x input, and
> this makes the other term at the adder a*x(n-1). But this also means the
> feedback term now becomes b*y(n-3), naturally.
>
> Am I missing the obvious?

Let us say that you want to do an operation a = b * c + d, and that this operation can only be done with three stages of pipeline delay, such that a is good at the beginning of the 3rd clock after you start:

a_n = b * c_{n-3} + d

(Let's also say that you're doing this in a true pipeline, so that a_3, a_4, ... are all good assuming that c_0, c_1, ... are good.)

Let c_{n-3} = a_{n-3}, which we can do by definition because a is good at the beginning of the 3rd clock. Now we have

a_n = b * a_{n-3} + d

Do I need to continue, or is it all obvious?

--
Tim Wescott
Control system and signal processing consulting
www.wescottdesign.com
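For anyone who wants to see the recursion fall out of the pipeline, here is a minimal clock-by-clock Python model. It is a sketch under stated assumptions: the function name is hypothetical, the multiply-add is computed in one step and then pushed through `stages` registers (how the work is split between stages does not change the latency), and all registers start at zero.

```python
def pipelined_mac(b, d, clocks, stages=3):
    """Model a multiply-add whose result appears `stages` clocks after
    its operands are presented, with the output fed back as the operand."""
    pipe = [0.0] * stages          # pipe[-1] is the value emerging this clock
    out = []
    for _ in range(clocks):
        a = pipe[-1]               # a_n, valid now
        out.append(a)
        # Start a new b * a + d; it will emerge `stages` clocks from now
        pipe = [b * a + d] + pipe[:-1]
    return out

if __name__ == "__main__":
    print(pipelined_mac(0.5, 1.0, 12))
```

Every output satisfies a_n = b * a_{n-3} + d once the pipeline has filled, so the pipeline latency simply becomes the delay in the difference equation. For |b| < 1 the output still settles to d / (1 - b); the filter is designed around the delay rather than balanced against it.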





