
Estimating number of FPGAs needed for an application

Started by Unknown March 12, 2007
Hi all

I'm absolutely new to FPGAs; in fact, my work is much more related to
SW than to HW, so I have to solve a problem that was not really meant
for me.

The issue is this: I have to estimate (roughly) the number of FPGAs
needed to support a typical signal processing algorithm. The steps are
as follows, always in single precision:

1. 16k complex samples FFT
2. 16k complex vector multiplication
3. 16k complex samples IFFT
4. 16k complex vector multiplication
5. 16k complex vector sum

The idea is to know how many FPGAs will cover this kind of processing
in a given time, to compare with different types of processors. For
the latter it is really easy, just counting the number of operations
in GFLOPS, but with hardware devices I am having a lot of trouble,
since I don't have a clear understanding of what I should count.

Please, give me a hand!

Ruben
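
For the processor side of the comparison, the operation count Ruben mentions can be sketched in a few lines of Python. The 5*N*log2(N) figure for a radix-2 complex FFT and the 6-FLOP complex multiply (4 multiplies + 2 adds) are the usual textbook estimates; the 1 ms repetition period is purely an illustrative assumption, since the thread never states a rate.

import math

N = 16 * 1024                       # 16k complex samples

fft_flops  = 5 * N * math.log2(N)   # one complex FFT or IFFT, ~5*N*log2(N)
cmul_flops = 6 * N                  # element-wise complex multiply
cadd_flops = 2 * N                  # element-wise complex add

total = 2 * fft_flops + 2 * cmul_flops + cadd_flops   # steps 1-5

print(f"FLOPs per data set: {total / 1e6:.2f} MFLOPs")

# Assumed repetition period of one data set per millisecond (example only).
repetition_period_s = 1e-3
print(f"Required sustained rate: {total / repetition_period_s / 1e9:.2f} GFLOPS")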

On Mar 12, 1:35 pm, rbbla...@gmail.com wrote:
[...]
> The issue is this: I have to estimate (roughly) the number of FPGAs
> needed to support a typical signal processing algorithm [...]
> The idea is to know how many FPGAs will cover this kind of processing
> in a given time, to compare with different types of processors.
With a hardware implementation you will need to specify the time in which you want this algorithm to be processed. It makes a difference in the implementation: the faster you want to go, the more you need to implement in parallel and the more resources you will need.

For the FFT you can request a design fit from here: http://www.dilloneng.com/ip/fft/fftipfit_cpt
But that is a specific design fit for their FFT, so you might find that other vendors give you a different fit.

Cheers,
Guenter
There is additional information needed for this evaluation:
- how often do you need a result (throughput and latency)?
- what is the data type (integer? float? precision?)

Unlike CPUs, FPGAs have no native data types. For cryptographic
applications you might want to run an FFT on vectors of single bits.
For DNA matching you might have 2-bit or 4-bit data types. For DSP,
18-bit or 36-bit integers are a common choice for Xilinx FPGAs.

The algorithm that you describe, implemented serially on 1-bit data,
would use 1% of a small FPGA and run for several hundred thousand
clock cycles. On a large FPGA, on the other hand, you can perform a
few hundred 18-bit x 18-bit multiplications per cycle.

Kolja Sulimma
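
To make Kolja's trade-off concrete: once a time budget per data set is fixed, dividing the real multiplies per data set by the available clock cycles gives the number of hardware multipliers that must run in parallel. The sketch below assumes a 250 MHz clock, a 1 ms budget, and 4 real multiplies per radix-2 butterfly; none of these numbers come from the thread.

import math

N        = 16 * 1024
f_clk_hz = 250e6       # assumed FPGA clock
budget_s = 1e-3        # assumed time allowed per data set

# Real multiplies per data set: 4 per radix-2 butterfly for the FFT and
# IFFT, 4 per element for each complex vector multiply; the final vector
# sum needs only adders.
butterflies = (N // 2) * int(math.log2(N))
real_mults  = 2 * 4 * butterflies + 2 * 4 * N

cycles_available = f_clk_hz * budget_s
parallel_mults   = math.ceil(real_mults / cycles_available)

print(f"Real multiplies per data set: {real_mults}")
print(f"Multipliers needed in parallel: {parallel_mults}")

With the relaxed 1 ms budget this comes out to a handful of multipliers; shrinking the budget to 20 us pushes it into the few-hundred-multipliers regime Kolja mentions, which is where the choice of device family and size starts to dominate.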



rbblasco@gmail.com wrote:

(snip)
You left out a key piece of information: how fast do you need to compute these 5 steps? A processor that can do all 5 can fit on a single FPGA, provided there is a reasonable amount of time between data sets and there is enough memory available to buffer the input (if needed), store intermediate results, and buffer the output.

The wide swath ocean altimeter design featured in the gallery on my website ( http://www.andraka.com/wsoa.htm ), for example, does everything on your list, in the same order and more, in under 250 usec for a 4K point data set using very old (original Virtex) technology, which has comparatively little on-chip memory and no embedded multipliers. About two-thirds of the area is dedicated to storage buffers using SRL16s (the large cyan block in the middle right, the magenta/green block below it, and the yellow/green blocks at the bottom are all buffers). The FPGA size is small, features are sparse, and speed is slow by today's standards.

Implementation size depends heavily on the FFT implementation, of course. My FFT kernel has the smallest size-performance footprint, so using others will result in a bigger design for a given speed.
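
Ray's point about buffer memory can also be bounded quickly. The 18-bit word width and 18 kbit block-RAM size below are typical Xilinx figures used purely as assumptions; the real numbers come from the target device's data sheet, and the buffer count depends on how the pipeline is arranged.

import math

N          = 16 * 1024
word_bits  = 18        # per real or imaginary component (assumed)
bram_kbits = 18        # capacity of one block RAM, in kbits (assumed)

bits_per_buffer = N * 2 * word_bits    # complex data: I and Q
buffers         = 3                    # e.g. input, intermediate, output

total_kbits = buffers * bits_per_buffer / 1024
brams       = math.ceil(buffers * bits_per_buffer / (bram_kbits * 1024))

print(f"Buffer storage: {total_kbits:.0f} kbits, roughly {brams} block RAMs")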
comp.arch.fpga wrote:
(snip)

> For DNA matching you might have 2-bit or 4-bit data types.
For dynamic programming algorithms, the favorite way to do DNA matching, it is usual to use 16-bit fixed-point arithmetic.

-- glen
rbblasco@gmail.com wrote:

> The issue is this: I have to estimate (roughly) the number of FPGAs
> needed to support a typical signal processing algorithm [...] always
> in single precision:
(snip)
> The idea is to know how many FPGAs will cover this kind of processing
> in a given time, to compare with different types of processors.
First, floating point tends to be a lot bigger on FPGAs than fixed point, especially floating-point addition. If you can get away with fixed point, even if the actual width is somewhat larger, it is probably worth doing.

Also, you can't just count 'FPGAs'; you have to take into account the size of the different FPGAs, even from the same product family.

I like systolic array processors, which usually work well for this type of problem. The thought process for hardware implementations, especially good pipelined ones, is somewhat different than for software implementations. Usually hardware implementations are used when software isn't fast enough, so you need to know how fast it has to go.

There is a tradeoff between time and size, but it isn't linear enough to quote without more details.

-- glen
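
As a quick illustration of glen's fixed-point suggestion, the sketch below quantizes a unit-amplitude test signal to a 16-fractional-bit fixed-point format and reports the worst-case error; the format and the test signal are arbitrary examples, not anything prescribed in the thread.

import math

frac_bits = 16                  # assumed fractional width; with sign and
scale     = 1 << frac_bits      # integer bits this is near an 18-bit word

def to_fixed(x):
    # Round to the nearest representable fixed-point value.
    return round(x * scale) / scale

# Unit-amplitude test signal (arbitrary example).
samples = [math.cos(2 * math.pi * k / 100) for k in range(1000)]
errors  = [abs(x - to_fixed(x)) for x in samples]

print(f"Worst-case quantization error: {max(errors):.2e}")
print(f"Roughly {-math.log2(max(errors)):.1f} bits of accuracy")

Whether that accuracy is sufficient depends on the input's dynamic range and on how much bit growth the FFT stages need, so the width would have to be checked against the actual signal.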
With everyone else's previously mentioned comments in mind as well, I
would recommend downloading Xilinx's WebPACK tool. Open their "Core
Generator" software and run the FFT core from there. You can enter
things like processing frequency, sample frequency, etc., and it
will give you a resource utilization. You can also pull this
information from the datasheet for their radix-2 FFT core.

I wish you the best of luck, but you may want to recommend that your
boss consult a hardware engineer. With all due respect to software
engineers (I can't write decent C code to save my life), despite what
management likes to believe, FPGA design is hardware design, not
software design. Without a good deal of background experience in
digital design, you're going to find it difficult to make this kind
of estimate accurately. Again, nothing against software folks; it's
just a different set of training and experience that's required.


