disappointing 550 MHz performance of V5 DSP slices

Started by Unknown May 17, 2006
Hello,
Can anyone give an explanation for the disappointing 550 MHz performance
of V5 DSP slices? Couldn't we hope for 1 GHz multipliers with 65nm
technology?

By the way, why aren't the multipliers pipelined to increase the
performance? Is there any chance of seeing pipelined multipliers in
Virtex-6?

<airtom@gmail.com> wrote in message
news:1147862042.233347.292340@j33g2000cwa.googlegroups.com...
> Hello,
> Can anyone give an explanation for the disappointing 550 MHz performance
> of V5 DSP slices? Couldn't we hope for 1 GHz multipliers with 65nm
> technology?
You can hope for what you like! :) 1GHz multipliers would be very hard to use if the logic fabric and memories can't keep up. If you can't feed them with input data they will spend a lot of time doing nothing... do you have an application in mind that requires 1GHz multipliers? How would you propose to engineer them into your wider FPGA design?
> By the way, why aren't the multipliers pipelined to increase the
> performance? Is there any chance of seeing pipelined multipliers in
> Virtex-6?
The DSP48E is already pipelined. Obviously, adding extra (bypassable) pipeline stages within the multiplier would increase the maximum clock speed, but it also adds latency and silicon area, and increases power consumption. Everything's a tradeoff...

Note that there are many enhancements in the DSP48E over the Virtex-4 DSP48 block, not the least of which being a larger multiplier (25x18). So a direct MHz-to-MHz comparison with the previous generation is not entirely fair.

Cheers,
-Ben-
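
To put rough numbers on the pipelining tradeoff described above, here is a minimal Python sketch. It assumes a simple textbook model that is not from the thread: the combinational delay splits evenly across the stages, and every stage adds a fixed register overhead. The function name `pipeline_model` and all delay figures are hypothetical, chosen only for illustration; they are not DSP48E data.

```python
def pipeline_model(logic_delay_ns, stages, reg_overhead_ns=0.3):
    """Return (fmax_mhz, latency_ns) for a path split into equal stages.

    Assumed model: clock period = logic delay per stage + register overhead
    (clock-to-out plus setup). All numbers are illustrative, not device data.
    """
    if stages < 1:
        raise ValueError("stages must be >= 1")
    period_ns = logic_delay_ns / stages + reg_overhead_ns
    fmax_mhz = 1000.0 / period_ns      # 1000 / ns gives MHz
    latency_ns = stages * period_ns    # time for one result to pass through
    return fmax_mhz, latency_ns


if __name__ == "__main__":
    # Hypothetical multiplier with 3.0 ns of combinational delay.
    for stages in (1, 2, 3, 4):
        fmax, latency = pipeline_model(3.0, stages)
        print(f"{stages} stage(s): fmax = {fmax:6.1f} MHz, latency = {latency:4.2f} ns")
```

In this model each extra stage raises the clock rate by less than the one before, because the fixed register overhead starts to dominate, while total latency keeps growing. The area and power costs mentioned above are not modelled at all.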
>>1GHz multipliers would be very hard to
>>use if the logic fabric and memories can't keep up
For some problems, with pipelining techniques, the logic fabric can keep up. As for memories, I am sure you can design 2x faster memories.
>>do you have
>>an application in mind that requires 1GHz multipliers
Scientific computation.
>>The DSP48E is already pipelined. Obviously, adding extra (bypassable)
>>pipeline stages within the multiplier would increase the maximum clock
>>speed, but it also adds latency and silicon area, and increases power
>>consumption. Everything's a tradeoff...
Agreed.
>>Note that there are many enhancements in the DSP48E over the Virtex-4 DSP48
>>block, not the least of which being a larger multiplier (25x18). So a direct
>>MHz-to-MHz comparison with the previous generation is not entirely fair.
Agreed. I am just noticing that Moore's law is not being followed with respect to DSP frequency, and I wanted to know the reason for this (is it a technological problem, or a strategic choice to satisfy the largest number of customers?).

Cheers,
Thom
Tom,

Moore's Law actually refers to transistor sizes, not clock frequencies.
Interconnect delay at these deep submicron sizes has significantly
reduced clock frequency scaling - just look at how Pentium frequencies
have stopped increasing.

Xilinx could significantly pipeline the multiplier to get a much higher
clock rate, but, as mentioned previously, keeping it fed with data from
the configurable fabric would be difficult at higher frequencies.
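
The "keeping it fed" argument can be sketched in a few lines of Python. This is only a back-of-the-envelope model under one stated assumption: the multiplier completes at most one operation per clock, and only when the fabric delivers a new operand pair. The function names and the 1000 MHz / 550 MHz figures are hypothetical.

```python
def effective_rate_mhz(mult_clock_mhz, feed_rate_mhz):
    """Useful multiplies per microsecond: limited by the slower of the two rates."""
    return min(mult_clock_mhz, feed_rate_mhz)


def multiplier_utilization(mult_clock_mhz, feed_rate_mhz):
    """Fraction of multiplier cycles that do useful work (0.0 to 1.0)."""
    return effective_rate_mhz(mult_clock_mhz, feed_rate_mhz) / mult_clock_mhz


if __name__ == "__main__":
    # Hypothetical case: a 1 GHz multiplier fed by fabric that sustains 550 MHz.
    rate = effective_rate_mhz(1000.0, 550.0)
    util = multiplier_utilization(1000.0, 550.0)
    print(f"effective rate = {rate:.0f} MHz, utilization = {util:.0%}")
```

Under this assumption a faster multiplier buys nothing unless the data path around it speeds up as well; the extra cycles are simply idle.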

MathStar has a field programmable array with 1GHz DSP performance, but
it's not bit-level configurable.

Stephen

Stephen Craven wrote:
> Tom,
>
> Moore's Law actually refers to transistor sizes, not clock frequencies.
> Interconnect delay at these deep submicron sizes has significantly
> reduced clock frequency scaling - just look at how Pentium frequencies
> have stopped increasing.
>
> Xilinx could significantly pipeline the multiplier to get a much higher
> clock rate, but, as mentioned previously, keeping it fed with data from
> the configurable fabric would be difficult at higher frequencies.
Guys, guys. Can you ever get enough? I guess not! :-(

First, clock frequency is not necessarily processing power.
Second, 550 MHz isn't a piece of cake, even if there are a few other fast(er) competitors.
Third, I think that all those fancy Pentium/Athlon/whatever CPUs are heavily pipelined.
Fourth, blablablablablablablablablablablablablabla

And I am not a follower of that stock-market philosophy of ever (exponentially) rising profit or, here in the electronics world, ever (exponentially) rising processing power/clock frequency. This "law" was surprisingly valid for a long time, and it can still be fulfilled, even with the great challenges of deep sub-micron technology. But there is an end to all things. Remember, the transfer function of a diode is also exponential over many decades, until it ends up in smoke.

Regards
Falk

P.S. Reminds me again of some old basic rules for programmers.
1st Your CPU is always too slow.
2nd You always have too little RAM.
3rd The compiler is lousy.
All,

A recent Intel presentation at an IEEE Workshop admitted that clock
frequency has maxed out, and now has to go down (not up) in order not to
create heat.

We have known that for years now.  So has AMD.

The only choice is "multi."

Intel proposes a future with more than 200 x86 cores on one die, with a 
"communications fabric" and many memories.  All on one die.  Small 
software problem to be solved by the need to have it solved....

One attendee of the conference (not me!) quipped, "sounds like you are 
describing a FPGA..."

Boy did the presenter get mad!  To be compared to a lowly FPGA!  He was
spitting venom back at the attendee.  "There is no comparison!  FPGAs
are fine grained, and this is not!"

Sounds like if that is the only difference, the FPGA wins.  Again.

Oh, and I can't wait for Intel to stub their toes on that 
"communications fabric" (left as an exercise for the student).  Or the 
software.

I think that we are all disappointed:  no high-k gate dielectric, so the
supply voltage can't scale anymore; worsening variation in threshold
voltages, because not only can you count the layers in the gate
dielectric on your fingers, but you can count the ions that got
implanted, too.  Not only does the source-drain leak when off, but gates
leak now (at 65nm and below).

A new fab costs $2B.  No clear path for lithography.

Good thing we make a standard product, and can afford to keep developing
it.  ASSP vendors will have to consolidate, and reduce their offerings.
Real tough times ahead for some business models.  Only place to get
the latest IP will be from FPGA vendors....

The future is ~500 MHz, more stuff, and voltages slowly decreasing to 
0.8 volts.

But we can still get twice as many transistors per unit area, all the
way down to 22nm.  And that increases throughput and processing power.
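
As a rough illustration of how more units at a flat clock still raises throughput, here is a minimal Python sketch. It assumes each unit completes one operation per clock and that all units can be kept busy, which is optimistic. The function name and the starting count of 64 units are hypothetical; the 500 MHz figure and the doubling per process node are taken from this post.

```python
def aggregate_ops_per_second(num_units, clock_mhz):
    """Peak operations per second, assuming one operation per unit per clock."""
    return num_units * clock_mhz * 1e6


if __name__ == "__main__":
    # One hypothetical 1 GHz unit versus two units at 500 MHz: same peak rate.
    print(aggregate_ops_per_second(1, 1000.0), aggregate_ops_per_second(2, 500.0))

    # Clock held at 500 MHz while the unit count doubles at each process node.
    units = 64  # hypothetical starting count
    for node in ("65nm", "45nm", "35nm", "22nm"):
        peak = aggregate_ops_per_second(units, 500.0)
        print(f"{node}: {units} units -> {peak:.2e} ops/s peak")
        units *= 2
```

The peak figure only holds when the workload parallelizes across all units, which is the "multi" problem described earlier in the post.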

45nm, 35nm, and then 22nm.  Life in the old horse yet.  2B 6-input LUTs
in V8?  100 Mb of BRAM?  2,000 DSP processors?  Crystal ball is getting
very hazy....Auntie Em!  Auntie Em!  She is holding her head! (apologies
to the Wizard of Oz movie).

After that, we really are looking for that disruptive technology with 
which we can make a new FPGA.

Now if I could only get those unobtainium wafers to yield....

Austin
>Guys, guys. Can you ever get enough?
I agree with the sentiment. Most of us aren't pushing the speed of our devices. We use FPGAs for other reasons. Of course, there are always "leading edge" applications that need the highest performance, but it's a small part of the market and a minority interest. Unfortunately, speed is the usual news from manufacturers: "This will run at 800MHz now and 950MHz by the end of next quarter...". Yawn! Maybe the speed freaks can form their own newsgroup ("FPGA overclockers"?) Stand by for photos of water-cooled chips, lit with blue LEDs.
Look at other maturing areas:

Commercial aviation has hardly gotten faster since the 747 arrived more
than 30 years ago.
Automobiles are hardly getting faster, except at the lunatic
fringe.
The 100 m dash has improved a few measly percent since Jesse Owens in 1936,
70 years ago!
Baseball records improve mainly through chemistry...

But, as Austin wrote, we are still trying.
It is now easier to make circuits smaller, make bigger chips, improve
yield, and thus lower cost, than it is to make circuits faster.

Peter Alfke

Austin Lesea <austin@xilinx.com> wrote:
>All,
>A recent Intel presentation at an IEEE Workshop admitted that clock
>frequency has maxed out, and now has to go down (not up) in order not to
>create heat.
>We have known that for years now. So has AMD.
>The only choice is "multi."
>Intel proposes a future with more than 200 x86 cores on one die, with a
>"communications fabric" and many memories. All on one die. Small
>software problem to be solved by the need to have it solved....
>One attendee of the conference (not me!) quipped, "sounds like you are
>describing a FPGA..."
>Boy did the presenter get mad! To be compared to a lowly FPGA! He was
>spitting venom back at the attendee. "There is no comparison! FPGAs
>are fine grained, and this is not!"
>Sounds like if that is the only difference, the FPGA wins. Again.
Maybe we'll see "Xilinx inside" within 20 years ;) Maybe machines with FPGAs interconnected in a giant "web of interconnects" will be the future, with parallel computing as the only way to harness that capability. One could even take processed silicon wafers and have them map out the faulty chips and interconnect the rest, to get that functionality in one go.
MikeShepherd564@btinternet.com wrote:

> Maybe the speed freaks can form their own newsgroup ("FPGA
> overclockers"?) Stand by for photos of water-cooled chips, lit with
> blue LEDs.
STRIKE! You made my day! (OK, it's evening now, but nevertheless.)

Regards
Falk