
July 20th Altera Net Seminar: Stratix II Logic Density

Started by Paul Leventis (at home) July 19, 2005
On Wednesday, July 20th @ 11 AM PST, two of my colleagues (Alex Grbic and 
Paul Ekas) will be giving a net seminar comparing Stratix II and Virtex-4 
logic densities.  They will describe the logic architectures of these two 
families, compare logic densities between these two families, discuss our 
benchmarking methodology and results, and provide software settings to 
maximize logic packing in Stratix II FPGAs.  Details can be found at 
http://www.altera.com/education/net_seminars/all/ns-stratix2_density.html.

Stratix II utilizes an innovative logic element we've called an "Adaptive 
Logic Module" or ALM.  This logic structure can efficiently implement two 
4-LUTs, one 6-LUT, some 7-input functions, a 3-LUT + 5-LUT, plus other 
combinations sharing inputs and/or portions of the LUT mask.  This 
capability translates into increased logic density, but complicates matters 
when it comes to comparing Stratix II results with those of traditional 
4-LUT based architectures such as Stratix I and Virtex.

I should point out that as part of this talk Alex will be providing results 
gathered from publicly available benchmark designs, allowing others to 
replicate the results he will present.

We will answer questions provided during the Net Seminar and I look forward 
to a healthy newsgroup discussion afterwards!

Regards,

Paul Leventis
Altera Corp 


I am glad to see that Altera has joined the world in acknowledging
Virtex-4 as The Gold Standard for FPGAs.
Peter Alfke, Xilinx Applications

Paul

You and the rest of your team at Altera should be absolutely
embarrassed that you continue with this marketing analysis of the
Stratix II logic superiority over the Virtex 4.  And keep in mind that
I have no opinion of one Company over the other.

Looking at the Stratix II 180 vs. the Virtex 4 200, as an example:

In your analysis you claim that the Altera Stratix II "180" with
186,576 "4-input LUTs" is bigger than the Xilinx Virtex 4 "200"
with 178,176 "4-input LUTs".

The point is that the Altera 180 part has 143,520 "ALUTs" and
179,400 "Logic Elements" while the Virtex 200 part has 178,176
"LUTs" and 200,448 "Logic Cells".

The concluding point is that Altera uses its higher 179,400
(actually increasing the number to 186,576) number and compares it with
the lower Xilinx 178,176 number in stating its superiority.

You must be kidding that you are using the higher Altera count to compare
with the smaller Xilinx count.  You must honestly believe that you are
dealing with idiots.  In any case, you should really be comparing the ALUT
number with the LUT number, because that is your closest
architectural comparison.  Look-up tables are look-up tables,
regardless of what you call them.

I could elaborate more on the multiplier and memory comparisons, but I
won't, and the conclusions are the same.

You guys actually had the guts to release a press release with this
analysis:

http://www.altera.com/corporate/news_room/releases/products/nr-density.html

And would you quit marketing your variable input LUT architecture?
Xilinx has had a variable input LUT architecture since the Virtex was
introduced in 1998.

You really don't think that smart engineers buy this "analysis", do
you?

Just trying to get to the truth.

Tim


"tim" <tkellis4520@yahoo.com> wrote in message
news:1121797823.215527.11170@g44g2000cwa.googlegroups.com...
> [...]
> Paul Leventis (at home) wrote:
> > On Wednesday, July 20th @ 11 AM PST, two of my colleagues (Alex Grbic and
> > Paul Ekas) will be giving a net seminar comparing Stratix II and Virtex-4

[altera marketing skipped]

The S180 has more memory than the LX200, which is LOGIC optimized, not
memory optimized. Besides that, the large memory blocks cannot be
loaded from config memory, so it's not a fair comparison anyway. SX and
FX devices way smaller than the S180 have way more memory. So Virtex
beats the S180 on memory.

The S180 has more multipliers than the LX200, which again is LOGIC
optimized, not DSP optimized. SX is DSP optimized, and again a way
smaller device has more DSP slices than the S180 has multipliers.

On total pin count, well, here Altera currently beats the Virtex-4
offering.

Overall I am 100% with Tim that the whole Altera comparison is totally
unfair. Of course it's hard to have the total truth; the suitability
depends on the application, the quality of the tools and many other
things. The S180 is possibly the largest single one-for-all device from
Altera, whereas Xilinx has split the high-end family into 3, offering
what is needed for different applications.

Antti

PS Paul, it would be much more interesting to see the Altera Stratix-2
GX announced, or is it really delayed so long? I would have expected it
to be released by now. Is it still coming? Could be that Lattice
high-end FPGAs come out before S2GX, and will beat the S2GX similarly to
how machXO beats MAX2. Not saying that MAX2 isn't nice, it is, but the
things Altera forgot are all in machXO. And try to pronounce it: MAX2
and machXO even sound similar :). I bet some Lattice guy made the name
deliberately to sound like MAX2. Nice move ;) and no trademarks violated!

Tim,

The only way I trust to compare the logic capacity of two different 
architectures is to benchmark them against each other.  Modern FPGA 
architectures, like modern processor architectures, are too complex to say 
which is more area-efficient, and by how much, based on a hand analysis. 
Think of trying to guess if a P4, P3 or Athlon is faster based purely on the 
specifications of their pipelines, issue units and clock rates -- it is 
impossible, so you have to benchmark them. FPGAs have hit that level too.

The best thing to do is to do your own comparison using the circuits of 
interest to you in the devices you're considering.  But that's a lot of 
work, especially if you want to test it for multiple circuits (as you really 
should to get statistically valid answers).

Next best is to get someone else's benchmark results.  And that's one of the 
things that will be presented in the NetSeminar tomorrow.

In terms of what should be counted in Stratix II vs. Virtex4 -- this is very 
difficult to do by hand.  Stratix II is fundamentally based on a larger LUT 
(5-LUT with extra circuitry, or a 6-LUT, depending on how you look at it) 
than Virtex4 (4-LUT plus extra circuitry) so counting LUTs doesn't work. 
Academic and industrial research long ago showed that bigger LUTs implement 
more logic, so you can't simply count the number of LUTs in an architecture 
and ignore their size.  But how much more logic can a bigger LUT implement, 
for a typical circuit?  Nobody can tell you accurately, except by running a 
bunch of benchmark circuits and showing the results.

Regards,

Vaughn
Altera
[v b e t z (at) altera.com]



Vaughn Betz wrote:
[...]
> In terms of what should be counted in Stratix II vs. Virtex4 -- this is very
> difficult to do by hand.  Stratix II is fundamentally based on a larger LUT
> (5-LUT with extra circuitry, or a 6-LUT, depending on how you look at it)
> than Virtex4 (4-LUT plus extra circuitry) so counting LUTs doesn't work.
                      ^^^^ ^^^^^ ^^^^^^^^^     ^^^^^^^^ ^^^^ ^^^^^^^

Howdy Vaughn,

No one (except maybe Xilinx) will fault Altera for trying to show how
the 2S180 can pack more logic into the device than the LX200.  But
engineers ARE likely to fault Altera if they do such a comparison with
misleading figures.

Your response _*completely ignored*_ the hard facts that Tim presented.
Here are Tim's numbers again, since you clipped them:

V4       Slices    Actual LUTs    Logic cells (Xilinx claim)
-----------------------------------------------------------
LX200    89088     178176         200448

S2       ALMs      ALUTs          Equiv_four_input_LUTs (Altera claim)
--------------------------------------------------------------------
2S180    71760     143520         186576

Since the last column is the only one where the funny math comes in,
that is the only place Altera has any hope of showing how the 2S180 can
pack more logic.  To do that, Altera needs to provide a convincing
argument that Xilinx's 200k number for their logic isn't just a little
overly optimistic, but is so to the tune of at least 7%.  And while
doing so, it'd probably be good to show why Altera's funny numbers for
the S2 are NOT overly optimistic.

Until doing that, might I suggest that Figure 1 be fixed on

http://www.altera.com/products/devices/stratix2/features/density/st2-vir-density-compare.html

which seems to show that it is valid to compare the 178k number against
the 186k number.  You admit in your response above (where I underlined)
that using the 178k number "doesn't work".  So why does Altera use it in
their comparisons?

http://www.altera.com/literature/wp/wpstxiixlnx.pdf also uses the 178k
number (even going so far as to claim that it is Xilinx's "equivalent"
number, when it is most obviously the *actual* number).  This paper is
also where the 30% better number is presented without any backup data.
Readers might trust this number a bit more if design details
(especially the number of designs in each size) were published with the
white paper.

I have absolutely nothing against Altera, the S2, or the new ALM.  But I
detest being misled, especially after it has been pointed out a time or
two (at which point it becomes obvious that the misleading is being done
on purpose rather than it having happened by accident).

Marc

Hi Marc,

Comparing different logic architectures is a difficult exercise, and the only 
legitimate way to do so is by benchmarking.  That's how we architect our new 
logic architectures -- we build prototype synthesis and place & route tools, 
and measure each candidate architecture on a large suite of designs.  The 
problem is making people outside the company believe our benchmarking is 
correct and impartial.  I'd suggest tuning in to the Net Seminar, and then 
discussing here all the flaws you find in it afterwards.  I'm sure Vaughn, 
Alex and I will be happy to discuss any areas of contention!

Perhaps another way of looking at things is by comparing Stratix II to 
Stratix (let's remove competition for a moment).  When we introduced Stratix 
II, we said that an ALM is equivalent to approximately 2.5 Stratix LEs; this 
result was based on our own internal benchmarking with the tools and circuit 
set we had available at that time.  We also have shown in previous white 
papers that the Stratix LE is more efficient than a Virtex half-slice by a 
margin of ~10% (I'd need to look up the exact number).  This difference 
arises primarily from the increased (routable) register-packing capabilities 
of the Stratix LE architecture.  So *if* you believe these two results, then 
the results we give for Stratix II vs. Virtex-4 are at least consistent with 
our previous claims.

Regards,

Paul Leventis
Altera Corp.
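
The consistency argument above can be checked against the 2S180 counts quoted elsewhere in the thread. The following is only a minimal Python sketch under stated assumptions: the variable names are made up for illustration, the 2.5 ratio is the approximate figure given in the post, and the two "implied" factors are simply what the quoted numbers work out to, not figures published by Altera.

```python
# Counts for the 2S180 as quoted in the thread.
ALMS_2S180 = 71_760            # physical ALMs
EQUIV_4LUTS_2S180 = 186_576    # Altera's "equivalent 4-input LUT" figure

# Approximate ratio stated in the post: one ALM ~ 2.5 Stratix LEs.
LE_PER_ALM = 2.5

# Step 1: express the device in Stratix-LE equivalents.
stratix_le_equiv = ALMS_2S180 * LE_PER_ALM

# Step 2: the factor needed to get from LE equivalents to the quoted
# equivalent-4-LUT figure (an inferred number, not a published one).
implied_le_to_4lut_factor = EQUIV_4LUTS_2S180 / stratix_le_equiv

# Combined: equivalent 4-input LUTs credited to each ALM.
implied_4luts_per_alm = EQUIV_4LUTS_2S180 / ALMS_2S180

print(f"Stratix LE equivalents:       {stratix_le_equiv:,.0f}")
print(f"Implied LE-to-4-LUT factor:   {implied_le_to_4lut_factor:.3f}")
print(f"Implied 4-LUT equivs per ALM: {implied_4luts_per_alm:.3f}")
```

The first step reproduces the 179,400 "Logic Elements" figure Tim cites, and the remaining factor comes out at about 1.04, which is smaller than the roughly 10% LE-over-half-slice advantage mentioned in the post.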


Paul Leventis (at home) wrote:
[...]
> Perhaps another way of looking at things is by comparing Stratix II to
> Stratix (lets remove competition for a moment).  When we introduced Stratix
> II, we said that an ALM is equivalent to approximately 2.5 Stratix LEs; this
> result was based on our own internal benchmarking with the tools and circuit
> set we had available at that time.  We also have shown in previous white
> papers that the Stratix LE is more efficient than a Virtex half-slice by a
> margin of ~10% (I'd need to look up the exact number).  This difference
> arises primarily from the increased (routable) register-packing capabilities
> of the Stratix LE architecture.  So *if* you believe these two results, then
> the results we give for Stratix II vs. Virtex-4 are at least consistent with
> our previous claims.

Howdy Paul,

I thought my post was pretty clear, but let me try one last time.

I'm NOT disputing the very real possibility that the ALM allows for
more efficient logic packing than the Slice.

The *only* thing I'm disputing is the _fact_ that, as Tim pointed out
originally and I tried to explain in different words, Altera is happy
to boost the S2 actual LUT count by an additional 30% for the
"equivalent" logic that can be implemented, yet increases Xilinx's
actual LUT count by exactly 0% for the stuff surrounding their LUT.  As
I quoted, even Vaughn admitted that the extra stuff surrounding Xilinx's
LUT can't be 0%.

To head off Altera claiming that they "don't know how much to add to
Xilinx's actual LUT count", either use the inflation figure that Xilinx
does (12%), or make up your own and justify it.  But it has to be
greater than 0%, otherwise all the Altera marketing white papers and
websites are comparing apples to oranges.

The below columns provide as-near-as-possible apples to apples
comparisons:

V4       Slices    Actual LUTs    Logic cells (Xilinx claim)
-----------------------------------------------------------
LX200    89088     178176         200448

S2       ALMs      ALUTs          Equiv_four_input_LUTs (Altera claim)
--------------------------------------------------------------------
2S180    71760     143520         186576

In short, please explain which of the above comparison columns is
incorrect, and why.

Regards,

Marc

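For readers who want to check the percentages in Marc's argument, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the dictionary keys and variable names are assumptions made up for the example, and the counts are the ones quoted in the thread, not independently verified.

```python
# Counts as quoted in the thread for the two devices.
LX200 = {"luts": 178_176, "logic_cells": 200_448}
S2_180 = {"aluts": 143_520, "equiv_4luts": 186_576}


def uplift(marketed: int, physical: int) -> float:
    """Fractional increase of a marketed figure over the physical LUT count."""
    return marketed / physical - 1.0


# How much each vendor adds on top of its own physical LUT count.
xilinx_uplift = uplift(LX200["logic_cells"], LX200["luts"])
altera_uplift = uplift(S2_180["equiv_4luts"], S2_180["aluts"])

# Stratix II relative to Virtex-4 under the three possible pairings.
physical_vs_physical = S2_180["aluts"] / LX200["luts"]
marketed_vs_marketed = S2_180["equiv_4luts"] / LX200["logic_cells"]
marketed_vs_physical = S2_180["equiv_4luts"] / LX200["luts"]  # disputed pairing

# Share of Xilinx's logic-cell figure that would have to be overstated
# for the 2S180's equivalent figure to merely match it.
xilinx_overstatement_for_parity = 1.0 - marketed_vs_marketed

for name, value in [
    ("Xilinx uplift over LUTs", xilinx_uplift),
    ("Altera uplift over ALUTs", altera_uplift),
    ("2S180/LX200, physical", physical_vs_physical),
    ("2S180/LX200, marketed", marketed_vs_marketed),
    ("2S180/LX200, marketed vs physical", marketed_vs_physical),
    ("Xilinx overstatement for parity", xilinx_overstatement_for_parity),
]:
    print(f"{name:34s} {value:6.1%}")
```

With these counts the uplifts work out to 12.5% for Xilinx and 30% for Altera, and only the marketed-versus-physical pairing puts the 2S180 ahead (by roughly 4.7%). The parity figure of just under 7% matches the threshold Marc mentions in his earlier post.
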
Vaughn

This is exactly my point.  In your first paragraph you switch between
the two issues here as if they are the same, which is the point of this
discussion (do they teach you that at Altera?).  We are talking about two
different issues here, speed and hardware architecture.  Yes, speed is
complex to measure, which is why you say you are faster than Xilinx and
Xilinx says they are faster than you, because of the subjectivity of
the sample universe.

But your entire point of this "marketing" campaign is not the speed but
the hardware.  An LUT is an LUT, no matter how you slice it (granted,
the additional logic to enhance the LUT performance does increase the
complexity, affecting speed, not LUTs).  And your response of the
"larger" LUT for Altera is the last point of my e-mail.  Would you quit
"marketing" your variable input LUT (Xilinx has had a variable input
LUT feature since 1998, in fact you are a little late with this
feature).  If you even look at your own white paper comparing the
schematics between the two; you have two individual 4-input "ALUT"s
feeding into "combinational logic" that provides the flexibility for
the number of inputs, so the foundation is still 2 4-input LUTs.
Interestingly, Altera will not disclose how they combine the 2 4-input
LUTs to provide the flexibility.

Vaughn, again, you are dealing with engineers, not impressionable
consumers.

Tim

Paul

You will also see this same "logic" (sorry for the pun) in my response
to Vaughn.  One thing is certain in analyzing the Altera vs Xilinx
debate, and that is that both companies claim to have the performance
advantage, which is speed.  Obviously the outcome is subjective.  My
complaint here is the hardware analysis, where you blatantly use your
higher number and compare it with the Xilinx lower number, which is an
abuse of logic.

Let me give you the basis for this debate:

http://www.fpgajournal.com/articles_2005/20050510_worldsbest.htm

I would welcome your response to the hardware comparison.  See my
comments to Vaughn on the LUT discussion.

Tim