Choice of FPGA device

Started by Varun Jindal November 24, 2004
hello all,

I have been reading a lot about performance comparisons between
the leading FPGA chip makers on their web sites. Both claim an improvement
over the other of 20-40%, though neither has ever
described what exactly was compared.

Are there resources available on the net that compare the different
architectures in detail (and impartially)?

-- Varun.
Varun Jindal wrote:

> i have been reading a lot about performance comparisons between
> leading FPGA chip makers on their web-sites. both claim improvement
> upon the other by metrics of 20 - 40 % .... though none has ever
> described what exactly was compared.
>
> are there resources available on the net, which compare different
> architecture in detail (and also impartially) .. !! ??

Brand A and X are roughly equivalent.
Use the vendor that gives you the best service and distribution.

-- Mike Treseler

Varun,

Your best bet is to contact the FAEs (Field Applications Engineers) for 
both companies, and have them explain exactly what their claims are 
based on.

What speed grades were compared (e.g. their fastest with our mid-grade)?

What were the settings of the synthesis tools (e.g. their default vs our 
default -- we default for speed of synthesis, theirs for a compromise of 
performance)?

What effort was made to use device specific features (e.g. theirs a lot, 
ours a little)?

What choice of device was made (e.g. their only one choice, versus our 
three options to best fit: LX for logic, SX for DSP, and FX for 
networking and comms)?


Or, you could do as the other posters suggest:  IGNORE IT and do your 
own benchmark by examining specifications and trying out some intended 
critical logic, and/or examining the offering of IP from each company 
(and its performance).

Who is to say which device is 'better'?  Only after careful study, and 
use of specific features that may offer an improvement can one make a 
decision.  And that decision only holds for that one (type of) design!

The "speed superiority" claims appeared three days after we announced 
the availability of three V4 parts as engineering samples.....compared 
to their unavailability.  Hey it ain't fun when your foundry can't 
supply the parts to you, is it?

Our 90nm offerings are succeeding because we did engage early with our 
fab partners, and did shake the process out.  If you wait until the 
process is stable, you will wait forever.  If you don't want the 
process, you are dependent on other larger customers of the fab.....and 
maybe they are making 130nm ASICs and are perfectly happy to wait until 
someone else has paid for the 90nm wafer starts to shake out the new 
process.  And who will use the triple oxide process for reduced leakage 
and power-on currents?  No one but an FPGA vendor.  No process, no 
performance.

Our fabs like us for our willingness to be full partners in the 
development of a new advanced process.  I think our customers understand 
that sometimes there will be rough spots in a new introduction of a new 
product on a new process, but overall we continue to offer superior 
products (in my opinion).

Austin


Varun Jindal wrote:

> i have been reading a lot about performance comparisons between
> leading FPGA chip makers on their web-sites. both claim improvement
> upon the other by metrics of 20 - 40 % .... though none has ever
> described what exactly was compared.

There are different metrics, and each design has different needs.

Even so, it is possible for each to be 20-40% improved over
the other. That is why geometric mean is preferred for benchmarks.

Say you have two benchmark programs. Machine A runs the first
in 1 minute, the second in two. Machine B runs the first in
two seconds, the second in one.

Machine A runs the first 2 times as fast and the second 0.5
times as fast, the average then is (2+0.5)/2 or 1.25 so machine
A runs, on the average, 1.25 times as fast as machine B.

If you do it the other way, you find machine B is 1.25 times
as fast as machine A.

Be very careful when you read benchmark numbers, and always
use geometric mean.

-- glen

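A minimal sketch of the arithmetic in the post above, assuming all four run times are meant to be in minutes (A: 1 and 2, B: 2 and 1). The function names are illustrative only. It shows that the arithmetic mean of per-benchmark speed ratios makes each machine look 1.25 times as fast as the other, while the geometric mean comes out at 1.0 in both directions.

```python
import math

def speedups(times_ref, times_other):
    """Per-benchmark speed of the reference machine relative to the other
    (a value above 1 means the reference machine is faster)."""
    return [t_other / t_ref for t_ref, t_other in zip(times_ref, times_other)]

def arithmetic_mean(xs):
    return sum(xs) / len(xs)

def geometric_mean(xs):
    # n-th root of the product, computed through logarithms
    return math.exp(sum(math.log(x) for x in xs) / len(xs))

# Run times in minutes (assumed units) for the two benchmark programs.
times_a = [1.0, 2.0]
times_b = [2.0, 1.0]

a_over_b = speedups(times_a, times_b)  # [2.0, 0.5]
b_over_a = speedups(times_b, times_a)  # [0.5, 2.0]

print("arithmetic mean, A vs B:", arithmetic_mean(a_over_b))
print("arithmetic mean, B vs A:", arithmetic_mean(b_over_a))
print("geometric mean,  A vs B:", geometric_mean(a_over_b))
print("geometric mean,  B vs A:", geometric_mean(b_over_a))
```

The geometric mean of a set of ratios is the reciprocal of the geometric mean of the inverted ratios, so the answer does not depend on which machine is taken as the reference.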
What? Did you mix up minutes and seconds in there? Don't you just add the
times to see which one is quicker? And take into account which type of
program you run most often?
Cheers, Syms.
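
The alternative raised here, adding up run times and weighting them by how often each program is run, can be sketched as follows. The run times reuse the two-machine example (taken as minutes); the run counts are made-up workload figures, not measurements.

```python
def total_time(times, runs):
    """Total time for a workload: sum of (time per run * number of runs)."""
    return sum(t * n for t, n in zip(times, runs))

times_a = [1.0, 2.0]   # minutes per run on machine A (assumed units)
times_b = [2.0, 1.0]   # minutes per run on machine B (assumed units)

even_mix = [1, 1]      # hypothetical: each program is run once
mostly_first = [9, 1]  # hypothetical: the first program dominates

for name, runs in [("even mix", even_mix), ("mostly first", mostly_first)]:
    ta = total_time(times_a, runs)
    tb = total_time(times_b, runs)
    print(f"{name}: A = {ta} min, B = {tb} min")
```

With an even mix the two machines tie; once one program dominates the workload, the machine that is faster on that program wins. A single averaged figure cannot capture that without knowing the mix.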
Varun,
Altera's benchmarking methodology is documented at
http://www.altera.com/products/devices/performance/benchmark/per-benchmarkmeth.html.
In particular I'd suggest looking at the benchmarking methodology
white paper at http://www.altera.com/literature/wp/wpfpgapbm.pdf which
articulates our exact methodology. I believe that this clear
description of a benchmarking methodology is unique.

Altera benchmarking is done by our engineering group which uses these
results to optimize new architectures, to improve place and route
algorithms, and to improve synthesis results. Marketing uses these
results in a peripheral manner (i.e. they are not run by marketing and
they are not run for marketing).

If you have further interest, Altera will be hosting a net seminar
describing this benchmarking methodology, specifying the results, and
explaining architectural differences that facilitate the significant
performance advantages. Details on the net seminar are found at
http://www.altera.com/education/net_seminars/current/ns-s2perf.html
[note - there will be marketing participation in this net seminar].

Altera has been shipping Stratix II devices since June and began
shipping the Stratix II 2S130 device (biggest FPGA ever shipped by
50%) last week. The comment on unavailability is misguided.

Dave Greenfield
Altera Marketing

 
As Glen already discussed, WHAT is compared and HOW is of
core importance.

Dave, the link you provided does describe the benchmarking
methodology, and its point about sub-optimal benchmark
designs producing misleading comparisons is well taken. But
it offers no opinion on the issue of an
incorrect or incomplete choice of performance ratio. What if the
comparison parameters are biased against an architecture?

Even if non-disclosure of the benchmark designs is valid, what about the
performance ratio? What is the equation behind your
performance ratio, and why is it so difficult to disclose? A
user has a right to know what weight you have given to each
parameter in calculating it.

A small example: because of limited funding, my choice of
which FPGA device to purchase is heavily governed by the size of the chip
that can implement my design. But I do not know whether you have
taken this into account in calculating the performance ratio
or, if you have, what weighting it was given. All this is
still a black box to me. How do you, or anybody else for that matter,
expect me to rely on the comparison results provided on the
web sites?

Can't different people have different reasons for buying an FPGA device?
How can both 'chip size'-based decisions and 'chip performance'-based
decisions be made from a single performance ratio?

-- Varun.
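
To illustrate the weighting question, here is a sketch of one possible form a combined "performance ratio" could take: a weighted geometric mean of per-metric ratios. This is an assumption for illustration only; neither vendor's actual equation appears in this thread, and the devices P and Q, the metric names, the ratios, and the weights are all hypothetical.

```python
import math

def weighted_ratio(ratios, weights):
    """Weighted geometric mean of per-metric merit ratios (P relative to Q).
    Each ratio is above 1 when device P is better on that metric."""
    total = sum(weights.values())
    log_sum = sum(weights[k] * math.log(ratios[k]) for k in ratios)
    return math.exp(log_sum / total)

# Hypothetical: P is 30% faster than Q but needs a larger (costlier) chip,
# expressed here as an area merit of 0.80.
ratios = {"speed": 1.30, "area": 0.80}

speed_buyer = {"speed": 1.0, "area": 0.0}  # only clock speed matters
size_buyer = {"speed": 0.2, "area": 0.8}   # chip size dominates the budget

print("speed-driven buyer:", round(weighted_ratio(ratios, speed_buyer), 3))
print("size-driven buyer: ", round(weighted_ratio(ratios, size_buyer), 3))
```

The same pair of measurements favours P under one set of weights and Q under the other, which is why a single published ratio is hard to interpret without its weights.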

Hi Varun,

I think the simple answer is that a purchase decision for a single design cannot 
be made purely from general benchmarking results.  You need to evaluate 
the performance of our chips (and any others) for your design and its 
requirements.  And you need to factor in other chip features and performance 
parameters, the price you can get from your distributor/FAE, the packaging 
choices, device availability, etc.

Let's step away from questions of benchmarking validity, averaging methods 
and such.  In the end, we get a spread of results.  If your design happens 
to be one of the designs that experiences equivalent performance (or say you 
are the data point at the extreme left in Figure 1 at 
http://www.altera.com/products/devices/performance/high_performance/per-high_performance_fpga.html), 
then our 39% means nothing to you.
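
A small sketch of the "spread of results" point, using invented per-design ratios rather than any published data: the average advantage over a benchmark set can be substantial while an individual design at the low end of the spread sees none at all.

```python
from statistics import geometric_mean

# Hypothetical per-design speed ratios (maximum clock frequency on one
# vendor's chip divided by that on the other's). Not real benchmark data.
ratios = [1.00, 1.15, 1.30, 1.45, 1.60, 1.85]

mean_ratio = geometric_mean(ratios)

print(f"average advantage over the set: {100 * (mean_ratio - 1):.0f}%")
print(f"design at the low end:          {100 * (min(ratios) - 1):.0f}%")
print(f"design at the high end:         {100 * (max(ratios) - 1):.0f}%")
```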

All benchmarking results do is provide you with some guidance of what to 
expect.  Based on our Stratix II benchmarking results, you can expect a chip 
that will likely outperform Virtex-4.  This could mean that you hit your 
performance target in one and not the other.  Or it could mean that you can 
buy a cheaper speed grade in Stratix II but need a more expensive speed 
grade in Virtex-4.  Similarly, you can expect Cyclone II to be ~60% faster 
than Spartan-3.

If you only have time to try one chip, I think it should be Stratix II or 
Cyclone/Cyclone II (depending on your needs), given the average results we 
see.  If you have time to try two chips, I still think you should just buy 
ours ;-) -- but I will grudgingly accept that you will probably try out what 
Xilinx has to offer too :-)

Does that mean Stratix II is the right chip for you?  Not necessarily.


Hi Austin,

I believe Dave has addressed the overall question on benchmarking 
methodology.  I'd like to address a few specific benchmarking questions in 
your posting (which I believe are addressed in the links David provided).

> What speed grades were compared (e.g. their fastest with our mid-grade)?

We always produce a comparison of the fastest speed grades available in
the software, and we will sometimes publish other comparisons with explicit
indications of speed grade. Our philosophy is that if a speed grade is not
in software, users can't design to it, and thus it is not real. When a new
speed grade becomes available (from either vendor), we re-measure our
benchmarks.

Note that sometimes speed grades appear in the software that are
difficult/impossible to actually get from the vendor. We do not factor this
into our benchmarking results. This can be an advantage for competitors,
since we haven't had a speed grade availability issue that I know of.

> What were the settings of the synthesis tools (e.g. their default vs our
> default -- we default for speed of synthesis, theirs for a compromise of
> performance)?

We do apples to apples comparisons. We usually use a 3rd party synthesis
tool, same version, same settings for both chips. If we are trying to
compare architecture speeds, we push synthesis for speed (for both
architectures).

We also sometimes publish results using the available integrated synthesis.
This is particularly relevant in the low-cost market where CAD tool costs
can be a factor. If we are comparing architectural speed, we select the
settings (for both tools) that yield the best speed results. We do not
cripple either tool, and we go as far as running many experiments to try to
determine the best settings for our competitors' tools.

> What effort was made to use device specific features (e.g. theirs a lot,
> ours a little)?

We make a fairly significant effort. We do not go as far as rearchitecting
the design to be specifically optimised for a chip. We try to standardize
the HDL between the target architectures, with the exception of "cores" such as
explicitly instantiated memories, multiplier/accumulator blocks, PLL/DCMs,
etc.

Does this mean for a given design we've extracted the most we can? I'd say
no, since that would require an enormous engineering effort on (typically)
poorly documented designs (all we get is the HDL, and sometimes it's been
anonymized). But our benchmark set comprises designs that were originally
targeted to our chips of current and past generations, competitors' chips,
ASICs, vanilla HDL, etc, so there will probably be headroom left in both
architectures under comparison.

> What choice of device was made (e.g. their only one choice, versus our
> three options to best fit: LX for logic, SX for DSP, and FX for networking
> and comms)?

We select the smallest device that fits the design, since we believe that
our customers would likely do so as well. The whole Virtex-4 alphabet soup
issue is new. But since only LX is available, it's moot for now -- no point
comparing to a family that is not available.

As for FX, that's a non-issue as we're talking about core fabric
performance. Stratix GX and your FX parts offer additional hard IP that
will factor into some customers' decisions, and probably in a way that no
amount of benchmarking will be able to quantify.

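The "smallest device that fits" rule described above could be sketched like this. The device names, resource categories, and capacities are placeholders, not real part numbers or datasheet values, and the actual fitting criteria used in the benchmarks are not given in this thread.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

@dataclass(frozen=True)
class Device:
    name: str
    logic_cells: int
    ram_kbits: int
    multipliers: int

def smallest_fit(devices: Sequence[Device], need_logic: int,
                 need_ram: int, need_mult: int) -> Optional[Device]:
    """Return the device with the fewest logic cells that still meets
    every resource requirement, or None if nothing fits."""
    fits = [d for d in devices
            if d.logic_cells >= need_logic
            and d.ram_kbits >= need_ram
            and d.multipliers >= need_mult]
    return min(fits, key=lambda d: d.logic_cells, default=None)

# Hypothetical family members, smallest to largest.
family = [
    Device("dev_small", 10_000, 400, 16),
    Device("dev_medium", 30_000, 1_200, 48),
    Device("dev_large", 90_000, 4_000, 96),
]

choice = smallest_fit(family, need_logic=12_000, need_ram=300, need_mult=20)
print(choice.name if choice else "no device fits")
```

Because timing usually differs between family members, the device chosen this way also determines which numbers end up in the comparison, so the selection rule is itself part of the methodology.
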
> The "speed superiority" claims appeared three days after we announced the
> availability of three V4 parts as engineering samples.....compared to
> their unavailability. Hey it ain't fun when your foundry can't supply the
> parts to you, is it?

I'm not quite sure which three days you are referring to, but the primary
reason for the timing of our release was availability of ISE support for
Virtex-4. We can't benchmark against a chip that doesn't exist in the
software. If we knew we had a 39% performance advantage earlier, do you
think we would have sat on it?

I'm not sure what it's like when a foundry can't supply us parts, Austin, so I
can't feel your pain. Sorry. We have one fabulous fab partner in TSMC, and
it's the only one we need.

Regards,

Paul Leventis
Altera Corp.

Paul,

Thanks for the reply.

I disagree with pretty much everything you say, but you are certainly 
entitled to offer a defense.

Thanks for admitting that you did not compare similar speed grade parts.

Austin
