
FFT on an FPGA

Started by Raymond August 17, 2006
Evan Lavelle wrote:
> On Thu, 17 Aug 2006 11:04:44 -0500, "RCIngham"
> <robert.ingham@smiths-aerospace.com> wrote:
>
>> I suspect that the Xilinx core is fixed-point, and handles the data growth
>> that happens during a FFT in some manner described in its documentation. If
>> it doesn't, then don't use it.
>
> It's fixed-point (vfft32 is, anyway). It does bit growth, but you have
> to handle the extra bits yourself.
>
>> You could use a number representation such as Q1.15 that handles numbers
>> in the range -1.0 to +1.0, and do scaling at each pass of the FFT.
>
> This is actually no better than an integer representation - if you
> start with Q1.15, then you have to extend to Q1.18, or whatever, after
> the first pass, and so on.
>
> Fixed-point FFTs are pretty useless. If you've got a small dynamic
> range in the input, then you'll have a large dynamic range in the
> output. Coding a floating-point adder and multiplier is probably a lot
> easier than understanding the limitations of a fixed-point FFT and
> using it effectively.
>
> Evan
Fixed-point FFTs are fine for many applications. Whether one is suitable for yours depends on many factors, including the size of the FFT. There is a potential growth of N for an N-point FFT. The FFT conserves energy, so if the input has frequency components that all fall in the same FFT bin, the output for that bin will be N times the input amplitude. If the input is wideband, then the output is spread over all the bins, so the output amplitude is closer to the input amplitude. Often we know ahead of time the nature of the input signal, so we can do a fixed scaling of the FFT results and wind up with the same number of input bits and output bits. Other times we know less about the application and need to handle the full range of possible inputs, or perhaps we are interested in the frequency bins that don't contain the dominant signal. In those cases we either need to accommodate a wider FFT output (fixed point) or need to resort to floating point.

Make no bones about it, floating-point arithmetic is expensive in terms of the amount of logic. There are some shortcuts for FFTs that will increase the dynamic range without going to a full floating-point implementation. The most common is block floating point, which rescales the entire FFT result by a common power of 2 selected so that the largest signal does not overflow. Block floating point is typically applied after each FFT stage, and is an effective way to limit the word widths without resorting to floating-point arithmetic.

The generally available FFT cores are all fixed point. Technically speaking, fixed point is more accurate than floating point for a given number of bits; the problem with fixed point is that you need a lot of bits to express a large dynamic range. Floating point is a compromise that dynamically rescales the data as it is processed in order to reduce the number of bits required to represent a number, at the price of some accuracy. Fixed-point design does require the designer to pay more attention in the system design to ensure the scaling is correct at each stage. In most cases, however, it is feasible and less complex than an equivalent floating-point design.

The windowing does not need floating point, but you do have to get familiar with fractional fixed-point notation. (Basically, your window function is integers from 0 to, say, 65535 for 16-bit unsigned, representing 0 to 65535/65536 with an implied scaling of 2^-16.) All you need is a multiplier and a ROM to implement it.
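To make the fractional notation concrete, here is a minimal sketch of that windowing arithmetic in Python rather than HDL; the Hann window, the 1024-point size, and the Q1.15 input format are illustrative assumptions, not anything from a particular core:

import numpy as np

# Hypothetical parameters: a 1024-point Hann window stored as unsigned
# 16-bit values with an implied scaling of 2^-16.
N = 1024
window_rom = np.round(np.hanning(N) * 65536).clip(0, 65535).astype(np.int64)

def apply_window(samples_q15: np.ndarray) -> np.ndarray:
    """Multiply Q1.15 samples by the fractional window coefficients.

    The 16x16 product is 32 bits wide; shifting right by 16 removes
    the implied 2^-16 window scaling and leaves the result in Q1.15.
    In hardware this is just one multiplier plus a coefficient ROM.
    """
    prod = samples_q15.astype(np.int64) * window_rom
    return (prod >> 16).astype(np.int16)

# Usage: window a full-scale Q1.15 tone before feeding the FFT.
n = np.arange(N)
x = np.round(32767 * np.sin(2 * np.pi * 50 * n / N)).astype(np.int16)
xw = apply_window(x)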
Ray Andraka wrote:
> Fixed-point FFTs are fine for many applications. Whether one is
> suitable for yours depends on many factors, including the size of the
> FFT. There is a potential growth of N for an N-point FFT.
> [snip]
> In those cases we either need to accommodate a wider FFT output
> (fixed point) or need to resort to floating point.
Also, it is not clear whether you gain anything from floating point, because you need the high dynamic range in the significand anyway.

Let's say you have 2^N values of M bits that you sum up (that is essentially what happens inside an FFT). It turns out that you need N+M bits to represent the result in fixed point.

Now assume that you think N+M is too large and you want to get by with smaller multipliers by using floating point with a significand resolution of S << N+M. Once you have added up the first half of your values, all subsequent values will be scaled down by N-1 bits (a factor of 2^(N-1)) before being added to the result. This means that only the S-N+1 most significant bits of the inputs are used. To avoid that you need S >> N. As soon as S gets about as large as N+M, you might as well use fixed point, because a fixed-point solution with N+M bits will get you all the dynamic range that you need, and the N+M-bit fixed-point logic is included in the floating-point solution as a building block anyway.

Also note: a floating-point unit with an (N+M)-bit significand needs an (N+M) by (N+M) multiplier. A fixed-point FFT would only use an M by M multiplier with an (N+M)-bit accumulator.

Kolja Sulimma

(BTW: << means "much smaller than", not a left shift)
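A quick numerical illustration of this alignment loss, using float32 as a stand-in for a short significand (S = 24) and illustrative values N = 20, M = 12:

import numpy as np

# Sum 2**20 values of 12 bits each (N = 20, M = 12). The exact result
# needs N + M = 32 bits; float32 offers only a 24-bit significand.
rng = np.random.default_rng(0)
vals = rng.integers(0, 2**12, size=2**20)

exact = int(vals.sum())          # fixed point: exact in 32 bits
approx = np.float32(0.0)
for v in vals:                   # sequential float32 accumulation
    approx += np.float32(v)

# Once the running sum exceeds 2**24, each 12-bit addend loses low
# bits during alignment, so the float32 total drifts from the truth.
print(exact, int(approx), exact - int(approx))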
Kolja Sulimma wrote:
> Also, it is not clear whether you gain anything from floating point,
> because you need the high dynamic range in the significand anyway.
> [snip]
> As soon as S gets about as large as N+M, you might as well use fixed
> point, because a fixed-point solution with N+M bits will get you all
> the dynamic range that you need, and the N+M-bit fixed-point logic is
> included in the floating-point solution as a building block anyway.
I tried to allude to that fact, but reading over my post I see I missed the point. Our floating-point FFT takes advantage of it: it uses a small fixed-point (4/8/16-point) kernel and normalizes the input set to a common exponent before passing it through the kernel, then denormalizes the intermediate result before storing it. In essence it applies floating point at a coarser granularity than individual operators. The accuracy matches a full floating-point implementation, but it uses a fraction of the hardware.
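A rough sketch of that normalize/denormalize bookkeeping, assuming integer data and a 16-bit kernel word; the small kernel itself (fft4_fixed below) is hypothetical and stands in for the real hardware:

import numpy as np

def normalize_block(block: np.ndarray):
    """Left-shift a block to a common exponent.

    One shift count is shared by the whole block, chosen so the
    largest value nearly fills a 16-bit word; the small fixed-point
    kernel then sees well-scaled data whatever the block's magnitude.
    """
    peak = int(np.abs(block).max())
    shift = max(0, 15 - peak.bit_length()) if peak else 0
    return block << shift, shift

def denormalize_block(block: np.ndarray, shift: int) -> np.ndarray:
    """Undo the shared scaling before storing the intermediate."""
    return block >> shift

# Usage around the hypothetical 4-point fixed-point kernel:
blk = np.array([3, -7, 2, 5], dtype=np.int32)
scaled, e = normalize_block(blk)
# result = denormalize_block(fft4_fixed(scaled), e)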
Ray Andraka <ray@andraka.com> wrote:

> Make no bones about it, floating-point arithmetic is expensive in
> terms of the amount of logic. There are some shortcuts for FFTs that
> will increase the dynamic range without going to a full floating-point
> implementation.
> [snip]
Depending on the use of the result, it may be an option to compress the input with a log function. This will considerably decrease the number of bits needed to achieve a given dynamic range (more or less like the u-law / a-law compression used in digital telephony).
In article <SmjFg.4549$SZ3.3956@dukeread04>, Ray Andraka
<ray@andraka.com> wrote:

> Make no bones about it, floating-point arithmetic is expensive in
> terms of the amount of logic. There are some shortcuts for FFTs that
> will increase the dynamic range without going to a full floating-point
> implementation. The most common is block floating point, which
> rescales the entire FFT result by a common power of 2 selected so that
> the largest signal does not overflow. Block floating point is
> typically applied after each FFT stage, and is an effective way to
> limit the word widths without resorting to floating-point arithmetic.
Does anybody do the following thing? At each step, the adders check whether the high-order bit of any result is set (just an N-way OR gate). If so, then the next step includes a right shift (for all elements); otherwise there is no right shift.

In the end, you get an FFT result which must be scaled by 2^(number of right shifts), but which has as much resolution as is possible given the number of bits (for the largest-magnitude components). Sort of similar to floating point, except that there is only one exponent for all components.

-- 
David M. Palmer dmpalmer@email.com (formerly @clark.net, @ematic.com)
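That is essentially block floating point applied per stage. A toy model of the shift bookkeeping, with the butterfly reduced to pairwise add/subtract (no twiddle factors) and an illustrative 16-bit word:

import numpy as np

def bfp_stage(data: np.ndarray, shifts: int, width: int = 16):
    """One add/subtract pass with the overflow check described above.

    If any result's magnitude reaches the top bit of the word (the
    N-way OR of the high-order bits), the whole array is right-shifted
    once and the shared shift count is incremented.
    """
    half = len(data) // 2
    out = np.concatenate([data[:half] + data[half:],
                          data[:half] - data[half:]])
    if np.any(np.abs(out) >= 1 << (width - 1)):  # a high bit is set
        out >>= 1
        shifts += 1
    return out, shifts

# The final result must be scaled by 2**shifts; one exponent is
# shared by all elements, "sort of similar to floating point".
data = np.full(8, 20000, dtype=np.int32)
shifts = 0
for _ in range(3):               # log2(8) stages
    data, shifts = bfp_stage(data, shifts)
print(data, shifts)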
David M. Palmer wrote:
> In article <SmjFg.4549$SZ3.3956@dukeread04>, Ray Andraka
> <ray@andraka.com> wrote:
> Does anybody do the following thing? At each step, the adders check
> whether the high-order bit of any result is set (just an N-way OR
> gate). If so, then the next step includes a right shift (for all
> elements); otherwise there is no right shift.
> [snip]
See my previous post. For the last half of the inputs you lose M-1 bits of resolution. This is not much better than just shifting all inputs M-1 bits to the right at the beginning. Floating point with S bits in the significand is no better for summing up many small values than S-bit fixed point.

Kolja Sulimma
Nico Coesel wrote:
> Depending on the use of the result, it may be an option to compress
> the input with a log function. This will considerably decrease the
> number of bits needed to achieve a given dynamic range (more or less
> like the u-law / a-law compression used in digital telephony).
Addition on a log scale is rather difficult.

Kolja Sulimma
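For the record, the reason: adding two values given only their logarithms requires something like the log-sum-exp identity, so every butterfly addition would need an exp/log evaluation (a lookup table in hardware) rather than a single adder. A sketch:

import math

def log_add(log_a: float, log_b: float) -> float:
    """Compute log(a + b) from log(a) and log(b).

    log(a + b) = log_a + log(1 + exp(log_b - log_a)), log_a >= log_b.
    The correction term is the expensive part: an exp and a log (or a
    table) per addition, which is why log-domain FFTs are not cheap.
    """
    if log_b > log_a:
        log_a, log_b = log_b, log_a
    return log_a + math.log1p(math.exp(log_b - log_a))

assert abs(log_add(math.log(3), math.log(5)) - math.log(8)) < 1e-12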
Kolja Sulimma <news@sulimma.de> wrote:

> Nico Coesel wrote:
>> Depending on the use of the result, it may be an option to compress
>> the input with a log function. This will considerably decrease the
>> number of bits needed to achieve a given dynamic range (more or less
>> like the u-law / a-law compression used in digital telephony).
>
> Addition on a log scale is rather difficult.
You don't have to. Use a conversion table or bit shifting (like u-law / a-law) to convert from linear to log. The FFT won't mind that the signal it is processing has been converted to a log scale. After the FFT, convert the result back to linear if required.

The scheme is very simple:

in -> lin to log -> fft -> log to lin -> out
Nico Coesel wrote:
> You don't have to. Use a conversion table or bit shifting (like u-law
> / a-law) to convert from linear to log. The FFT won't mind that the
> signal it is processing has been converted to a log scale. After the
> FFT, convert the result back to linear if required.
>
> The scheme is very simple:
>
> in -> lin to log -> fft -> log to lin -> out
Definitely not. Log is not linear:

log(a+b) <> log(a) + log(b)

The spectrum of log(sin(x)) contains frequencies that sin(x) does not, because sin is not an eigenfunction of the log nonlinearity. Scaling the spectrum will not make these extra frequencies vanish.

Kolja Sulimma
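A quick numerical check bears this out; a u-law-style compressor stands in for the proposed lin-to-log step (the tone frequency and mu value are illustrative):

import numpy as np

N = 1024
n = np.arange(N)
x = np.sin(2 * np.pi * 16 * n / N)       # pure tone in bin 16

mu = 255.0                               # u-law style compression
x_log = np.sign(x) * np.log1p(mu * np.abs(x)) / np.log1p(mu)

spec = np.abs(np.fft.rfft(x_log)) / N
# The compressed tone leaks energy into odd harmonics: the strongest
# bins are now 16 plus 48, 80, 112, ... which no amount of output
# rescaling can remove.
print(np.argsort(spec)[-4:])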
Kolja Sulimma <news@sulimma.de> wrote:

>> The scheme is very simple:
>>
>> in -> lin to log -> fft -> log to lin -> out
>
> Definitely not. Log is not linear:
>
> log(a+b) <> log(a) + log(b)
>
> The spectrum of log(sin(x)) contains frequencies that sin(x) does not,
> because sin is not an eigenfunction of the log nonlinearity. Scaling
> the spectrum will not make these extra frequencies vanish.
Just run some numbers on the error being introduced versus the savings in logic.