On Fri, 04 Feb 2005 16:16:19 +1300, Jim Granville <no.spam@designtools.co.nz> wrote:

> There is plenty of diversity in the marketplace, and users are not so silly as to buy purely on one benchmark.
>
> Look at http://www.eembc.hotdesk.com/ - there are a lot of uC/iP listed, and they are not fearful that coming 2nd in some benchmark will be the kiss of death?

EEMBC isn't public; the sources cost ~$30K. This makes it pretty much useless, because only the processor vendors buy in, and there's no obligation to publish results. The average user can't run a benchmark on two different systems, and it's in the user's system that performance really counts.

Another issue with benchmarks is that vendors simply target their processor/FPGA/whatever at the benchmark. It would be relatively easy for an FPGA vendor to increase performance on a known benchmark, either by targeting their software at it, or by introducing dedicated hardware in the next device. In the long run, everybody loses.

Besides, how many FPGA end-users actually buy on raw performance? Very few, I suspect, and they're probably the ones who are targeting ASICs anyway.

Evan
Re: See Peter's High-Wire Act next Tuesday
Started by ●February 4, 2005
Reply by ●February 4, 2005
Evan Lavelle wrote:

> On Fri, 04 Feb 2005 16:16:19 +1300, Jim Granville <no.spam@designtools.co.nz> wrote:
>
>> There is plenty of diversity in the marketplace, and users are not so silly as to buy purely on one benchmark.
>>
>> Look at http://www.eembc.hotdesk.com/ - there are a lot of uC/iP listed, and they are not fearful that coming 2nd in some benchmark will be the kiss of death?
>
> EEMBC isn't public; the sources cost ~$30K. This makes it pretty much useless, because only the processor vendors buy in, and there's no obligation to publish results. The average user can't run a benchmark on two different systems, and it's in the user's system that performance really counts.

What you mean is it is not free. That is their business model; EEMBC are there to make money.

> Another issue with benchmarks is that vendors simply target their processor/FPGA/whatever at the benchmark. It would be relatively easy for an FPGA vendor to increase performance on a known benchmark, either by targeting their software at it, or by introducing dedicated hardware in the next device. In the long run, everybody loses.

Why? If the benchmarks are application relevant, then surely the application speed improves, and everyone wins? Static Icc and package sizes are also design benchmarks.

> Besides, how many FPGA end-users actually buy on raw performance? Very few, I suspect, and they're probably the ones who are targeting ASICs anyway.

Correct, but benchmarks are not all about speed; they are about a defined set of designs, so you can exercise a device and get mA/MHz, or LUT, or MHz or ns, or whatever parameter matters to you most. A vendor's claim of 39% is of little use to anyone.

Another use is they can show you how to get more of something, for more effort, in the optimised benchmark category. Of course, the optimised category is not a level playing field; that is the whole point.

-jg
Reply by ●February 4, 2005
Yeah, the 39% seems cooked to me, especially with no way to check it for the interested public.

Where is that Altera guy hiding?

-Che

> Correct, but benchmarks are not all about speed, they are about a defined set of designs, so you can exercise a device and get mA/MHz, or LUT, or MHz or ns, or whatever parameter matters to you most. A vendors claim of 39% is of little use to anyone.
>
> Another use is they can show you how to get more of something, for more effort, in the optimised benchmark category. Of course, the optimised category is not a level playing field, that is the whole point.
>
> -jg
Reply by ●February 5, 2005
> Yeah, the 39% seems cooked to me, especially with no way to check it for the interested public.
> Where is that Altera guy hiding?

I have posted all I need to say on the subject. Clearly, I believe the +39% is real. We have invested years of engineering time to gather benchmark designs, fairly convert them between architectures, figure out how to get the best out of both tools, and produce comparisons. We have disclosed how we run our tests, and the results we achieved. But short of releasing the actual designs, which we cannot do, we will never be able to convince those people (such as you) who believe we are cooking the numbers when we are not.

I can't blame you for being in disbelief -- trust me, we double- and triple-checked our results because we were so surprised that Virtex-4 came out so poorly.

Do I believe 39% tells the whole story of comparing these two device families? No; it is just one (very important) parameter.

Regards,

Paul Leventis
Altera Corp.
Reply by ●February 6, 2005
Paul, let me help you.

There are three ingredients to this "surprise":

1. Altera used its fastest (of three) speed grade against the middle of three Xilinx speed grades. (I have previously explained your reason for, and the much stronger reason against, doing that.)
2. Altera did not exercise the Xilinx software as strongly as they pushed their own. The software tools are quite different, and require a different approach if absolute highest speed is the goal. Which it was.
3. It is reasonable to assume that Altera's stored designs are more Stratix-friendly.

So, don't you guys play the surprised innocent onlookers. Nobody expected Altera to be fair.

Hell, I think the whole business of competitive benchmarks being run and promoted by an interested party is a sham and a disgusting deception. That's why I refused to enter the mudbath...

Peter Alfke
Reply by ●February 6, 2005
Peter Alfke wrote:

> Paul, let me help you.
> There are three ingredients to this "surprise":
> 1. Altera used its fastest (of three) speed grade against the middle of three Xilinx speed grades.
> ( I have previously explained your reason for, and the much stronger reason against doing that.)

Correct me if I am wrong, but didn't Altera use the most current speed file data that was available at the time? Or was the data available in the speed file and just the parts are not available?

Let's face it. Even if the speed file data was available, data based on estimates is pretty pointless. We have seen significant changes in speed files even *after* a chip is in production. So the data is pretty meaningless *before* the parts are in production.

> 2. Altera did not exercise the Xilinx software as strongly as they pushed their own. The software tools are quite different, and require a different approach if absolute highest speed is the goal. Which it was.

This is a point that no one can prove either way. Xilinx does not release their benchmark designs and Altera does not either. So the users are left not knowing if any of the info is correct.

> 3. It is reasonable to assume that Altera's stored designs are more Stratix-friendly.

That sounds like marketing-speak. Regardless, until we get a set of benchmarks that are open *and* useful, this is all just a tempest in a teapot.

> So, don't you guys play the surprised innocent onlookers. Nobody expected Altera to be fair.
> Hell, I think the whole business of competitive benchmarks being run and promoted by an interested party is a sham and a disgusting deception. That's why I refused to enter the mudbath...

But here you are... :)

--
Rick Collins
rick.collins@XYarius.com

Arius - A Signal Processing Solutions Company
Specializing in DSP and FPGA design
http://www.arius.com
4 King Ave.
Frederick, MD 21701-3110
301-682-7772 Voice
GNU tools for the ARM http://www.gnuarm.com
Reply by ●February 6, 2005
Paul Leventis (at home) wrote:

<snip>

> But short of releasing the actual designs, which we cannot do, we will never be able to convince those people (such as you) who believe we are cooking the numbers when we are not.

And there you have the best possible argument for Public (WEB) Source code.

- I cannot believe the designers at Altera feel happy to have "invested years of engineering time", and find themselves unable to publicly verify the numbers (and even have them openly laughed at?).
- To me, that is a total waste of time. You, and your customers, deserve better.

Simple solution: Get some designs you CAN release!?

-jg
Reply by ●February 6, 2005
Hi Jim,

> And there you have the best possible argument for Public (WEB) Source code.
> - I cannot believe the designers at Altera feel happy to have "invested years of engineering time", and find themselves unable to publicly verify the numbers. ( and even have them openly laughed at ? )
> - To me, that is a total waste of time. You, and your customers deserve better.

From a marketing perspective, yes, it makes life difficult. But marketing is only a secondary goal of our benchmarking effort. The primary reasons we collect designs and measure our performance are to (a) improve our CAD tools and (b) experiment on new architectures. When developing new CAD algorithms and new architectures, we need to be able to compare the new vs. the old to see if the change is a useful one.

For example, there is no way we could have ever made the radical change of moving from our old Stratix 4-LUT based LE to the Stratix II decomposable 6-LUT with shared LUT function capability. There is a lot of pain (synthesis effort, IP changes, customer impact, etc.) associated with changing the logic architecture of a family, and we need good, solid data to back it up. Similarly, when we make changes to our synthesis, placement and routing algorithms, every such change must be validated for functionality and quality.

Hopefully someone out there will put together some new public domain big benchmarks (like the old MCNC benchmarks, still quoted so often in academic literature). It would do the academic community some good to see what real designs look like these days.

Paul Leventis
Altera Corp.
Reply by ●February 6, 2005
Hi Peter,

> There are three ingredients to this "surprise":

All are reasonable comments/questions which I will address below. But let's first step away from Altera's direct Stratix II to Virtex-4 result, and like good engineers sanity check the +39% number by trying to compare things another way.

Virtex-4 vs. Virtex II-Pro is around 5% faster (I don't have the exact result on hand). Whatever perceived bias there may be, this is the result we get when we run those two chips head-to-head, using the same designs, same software, and same methodology. And we've heard this from Xilinx users who have been surprised at the lack of performance increase when comparing the two chips. There have been postings on this subject in this very newsgroup. Yes, some IP blocks have got faster, and there have been changes to various aspects of the chip, but the basic logic + routing fabric really hasn't improved much.

As an architect, I am not surprised at the lack of performance increase. Nothing has changed in V-4 on the logic or routing front vs. VIIpro that would lead to speed. The stripping of SRL16s from the M slices should lead only to some area reduction. And going to 90 nm from 130 nm doesn't automatically confer a speed advantage, since this depends on choices of exactly where you target the process, what gate lengths you use, what threshold voltages you use (and where), and even things like using slow thick-oxide transistors. As an example of this, moving from Cyclone (130 nm) to Cyclone II (90 nm), we're only seeing somewhere in the neighbourhood of a 10% performance advantage.

Contrast this to Stratix II. Stratix II is 45-50% faster than Stratix. Again, same designs, same methodology, same tools. Perhaps we cannot be trusted to run Xilinx tools fairly, but we had better know how to run our own tools and chips. Why are we seeing this much advantage? A small part comes from process.
But most of it is a result of the new logic architecture -- to first order, larger LUTs mean fewer logic levels with roughly equivalent delay per level, thus faster overall performance. And there are numerous other changes under-the-hood relating to the routing architecture and electrical design that lead to further performance improvements. If we had not innovated, we also would have been left with a product that was not much faster than its predecessor.

So is a performance advantage in the 39% range that difficult to believe? Well, unless you think that Virtex II Pro was way faster than Stratix (numerous head-to-head battles in the field do not support this), a big advantage for Stratix II is reasonable to expect.

> 1. Altera used its fastest (of three) speed grade against the middle of three Xilinx speed grades.

I guess you could say that's the marketing of our results. The science behind the +39% is valid -- we clearly state what we are comparing to and why we are comparing that way. And from a customer (today) perspective, that is what they would see too. There are no speed files for -12 and no entries in the data sheet. Assuming your fastest speed grade (whenever it finally comes out) is 10-15% faster than the middle speed grade, we'll still be talking about a ~25-30% advantage for Stratix II.

> 2. Altera did not exercise the Xilinx software as strongly as they pushed their own. The software tools are quite different, and require a different approach if absolute highest speed is the goal. Which it was.

I disagree. We spent months trying the default settings, the best settings, and every combination of settings we could in order to maximize the performance of designs run on ISE. The performance of ISE is difficult to compare against, since it tends to do a crappy job when over-constrained or under-constrained. This means we have to spend a lot of time finding just the right set of constraints to determine the best (per-domain) performance.
To get a result for a Virtex-4 design, we run the tool ~15 times on average in order to find the best constraints for that design.

Please publicly post the best-effort methodology you would like us (and your customers) to employ, or be more specific about what about our approach leads to a tool bias. I would be happy to discuss the merits of various benchmarking approaches.

> 3. It is reasonable to assume that Altera's stored designs are more Stratix-friendly.

I'll agree that there *could* be some unintentional bias here. It's tricky -- a lot of our benchmark designs come from customers who engage with us because they are having troubles, sometimes with meeting performance. This may be on an older chip (say, APEX) or it may be that they were having tool issues. What this means is part of our benchmark set comprises designs that *did not* do well on our chips. Some of our designs were targeted originally at Xilinx devices and we've been called in to try to hit performance in an Altera part because the customer was having availability issues. And yes, some of our designs are just plain old design-to-Stratix designs. So it's a bit of a mess to try to decipher whether or not we have a bias based on the benchmark set...

One point I would make is that Stratix II's logic architecture is radically different from Stratix's. So even if we had a bias towards Stratix designs, I'm not sure that would mean we should automatically see an advantage for Stratix II vs. Virtex-4, since in many ways Virtex-4 and Stratix are more similar than is Stratix II to either architecture.

> So, don't you guys play the surprised innocent onlookers. Nobody expected Altera to be fair.

I don't think I've ever expressed surprise at the response.
I sometimes wonder whether we should have screwed being fair and instead just posted totally unfair results like +60% or +70%, or to take a page from Xilinx's books, "up to +230%" so that once people de-rate for assumed unfairness, they'd end up somewhere near the right result.

> Hell, I think the whole business of competitive benchmarks being run and promoted by an interested party is a sham and a disgusting deception. That's why I refused to enter the mudbath...

And Xilinx has never dared to make any performance claims in the past? At least our +39% result is a step forward in that we are using averages (real, geometric averages with a full data set) and not "up to" numbers, and we are posting details on our methodology, and are doing our best not to stack the deck.

Regards,

Paul Leventis
Altera Corp.
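For readers following the arithmetic in the post above, here is a minimal sketch of two calculations it refers to: a geometric average over per-design speed ratios, and de-rating a headline advantage when the competing part gets a faster speed grade. This is not Altera's actual methodology. The function names and the idea of feeding in per-design fmax lists are assumptions made purely for illustration, and no real benchmark data is used.

```python
from math import exp, log


def geometric_mean_speedup(fmax_a, fmax_b):
    """Geometric mean of the per-design ratios fmax_a[i] / fmax_b[i].

    Both lists hold one clock-rate result per benchmark design, in the
    same order. (Hypothetical inputs; the thread publishes no raw data.)
    """
    if not fmax_a or len(fmax_a) != len(fmax_b):
        raise ValueError("need two non-empty lists of equal length")
    ratios = [a / b for a, b in zip(fmax_a, fmax_b)]
    # Averaging the logs and exponentiating is the geometric mean.
    return exp(sum(log(r) for r in ratios) / len(ratios))


def derated_advantage(advantage, competitor_gain):
    """Remaining fractional advantage once the competitor speeds up.

    Treats both figures as multiplicative speed ratios, e.g. an
    advantage of 0.39 means 1.39x and a gain of 0.10 means 1.10x.
    """
    return (1.0 + advantage) / (1.0 + competitor_gain) - 1.0


if __name__ == "__main__":
    for gain in (0.10, 0.15):
        print(f"competitor +{gain:.0%}: {derated_advantage(0.39, gain):.1%} left")
```

Under this multiplicative reading, a 10-15% faster competing speed grade leaves roughly 21-26% of a 39% advantage. Subtracting percentage points instead (39 minus 10 or 15) gives 24-29%, which is closer to the "~25-30%" quoted in the post. The geometric mean is the usual choice for averaging ratios because a design that is 2x faster and one that is 2x slower cancel out to exactly 1.0, which an arithmetic mean of ratios would not do.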
Reply by ●February 6, 2005
Hi Rick,

> Correct me if I am wrong, but didn't Altera use the most current speed file data that was available at the time? Or was the data available in the speed file and just the parts are not available? Let's face it.

You are correct. No speed files are available for -12. No numbers are in the datasheet. So we compare to -11.

> Even if the speed file data was available, data based on estimates is pretty pointless. We have seen significant changes in speed files even *after* a chip is in production. So the data is pretty meaningless *before* the parts are in production.

I think performance comparisons based on preliminary timing models are still valid. Regardless of how correct speed files are, that is the performance a customer will see, and is what customers are using to select devices and speed grades. Of course, performance comparisons need to be updated with each release of speed files (and CAD software too -- algorithms are always improving).

I cannot speak for Xilinx and their speed files. But on the Altera side of things, I would not expect much change in core performance. For all families I've been involved with (Stratix, Cyclone, and beyond), our core performance predictions made in the preliminary timing models have been very close to (within 5% of) final production numbers.

Stratix II core (logic + routing) speed will not be changing more than a few % in the future. Our models have already been correlated to silicon and compare very well. The toggle rate limitations on DSP and memory blocks will likely increase (again) since we're still in the process of finishing off our characterization of these blocks and we like to stick with conservative limits until that characterization is completed.

Regards,

Paul Leventis
Altera Corp.




