FPGARelated.com
Forums

PowerPC soft-core?

Started by Antti Lukats March 22, 2005
In XCELL issue 52, page 19 Xilinx claims that:

V4-PowerPC reduces power 10:1 compared to FPGA Fabric built version (of
PowerPC)

But does that mean Xilinx has a PowerPC soft-core IP internally?
If they don't, they could not have measured the power difference :)
I wonder why there is no information about the Xilinx soft-core PowerPC at
all?

Antti


Hi Antti,

> In XCELL issue 52, page 19 Xilinx claims that:
>
> V4-PowerPC reduces power 10:1 compared to FPGA Fabric built version (of
> PowerPC)
>
> but that means Xilinx has internally a PowerPC Soft Core IP?
> If they dont then could not measure the power difference :)
> I wonder why there is no information about the xilinx soft-core PowerPC at
> all?

I'm absolutely positive that IBM has produced a synthesizable PowerPC core that Xilinx has a license for. How else would they be able to verify the functionality of the V2Pro and V4 devices?

I think that if you contact your local IBM fab and exchange enough money, that you too can have your own shiny, glistening PowerPC softcore, filling half your huge FPGA and running at 32MHz.

Best regards,

Ben

Ben Twijnstra wrote:

>> but that means Xilinx has internally a PowerPC Soft Core IP?
>> If they dont then could not measure the power difference :)
>> I wonder why there is no information about the xilinx soft-core PowerPC at
>> all?
>
> I'm absolutely positive that IBM has produced a synthesizable PowerPC core
> that Xilinx has a license for.

I agree, but...

> How else would they be able to verify the
> functionality of the V2Pro and V4 devices?

Formal verification of a non-synthesizable core? (Equivalence check, model checking, ...)
Validation by simulation of a non-synthesizable core?

Kolja Sulimma

Hi Kolja Sulimma,

> Ben Twijnstra wrote:
>
>>> but that means Xilinx has internally a PowerPC Soft Core IP?
>>> If they dont then could not measure the power difference :)
>>> I wonder why there is no information about the xilinx soft-core PowerPC
>>> at all?
>
>> I'm absolutely positive that IBM has produced a synthesizable PowerPC
>> core that Xilinx has a license for.
>
> I agree, but...
>
>> How else would they be able to verify the
>> functionality of the V2Pro and V4 devices?
>
> Formal verification of a non-synthesizable core? (Equivalence check,
> model checking, ...)
> Validation by simulation of a non-synthesizable core?

True - but most ASIC houses I run into prefer to do all of these, especially with larger projects - and I think that a Virtex qualifies as a 'larger' project.

Let's wait what Austin says ;-)

Ben

There's a big difference between a synthesizable ASIC CPU model and an FPGA 
optimized CPU model. Back in 2000 I did a design study on implementing the 
PPC instruction set architecture. Depending upon specific hard-wired or 
software-emulated feature sets, and small-or-fast settings, an integer PPC 
subset requires between 1000-2000 LUTs and today would run at most of the 
speed of current FPGA optimized soft CPU cores.

Jan Gray


Antti Lukats wrote:
> In XCELL issue 52, page 19 Xilinx claims that:
>
> V4-PowerPC reduces power 10:1 compared to FPGA Fabric built version (of
> PowerPC)
>
> but that means Xilinx has internally a PowerPC Soft Core IP?
> If they dont then could not measure the power difference :)
> I wonder why there is no information about the xilinx soft-core PowerPC at
> all?
>
> Antti

Yes, we have a soft version of the PowerPC 405. We used this extensively in the development of the Virtex-II Pro family in order to create and verify IP blocks, develop system tools, port software and provide early-access system boards to external 3rd party developers. We were able to do this through our contract with IBM and a lot of work within Xilinx.

No, there are no plans to release this, as we do not have the rights to do so, and the size, speed and power would make it unattractive to nearly everyone.

The V-4 PowerPC 405 in comparison displaces only 672 slices, consumes 0.29mW/DMIP (0.44mW/MHz), runs up to 450 MHz and places and routes in less than a second. Just try to get a soft processor to match that. :)

Ed

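As a quick sanity check on the figures quoted above, here is a minimal sketch of what they imply when combined. It assumes that the 0.29 mW/DMIP and 0.44 mW/MHz ratings describe the same operating point and that the per-MHz figure holds up to the 450 MHz maximum clock; the post states neither. The function and variable names are illustrative only.

```python
# Figures quoted in the post above.
MW_PER_DMIPS = 0.29   # power per Dhrystone MIPS
MW_PER_MHZ = 0.44     # power per MHz of clock
F_MAX_MHZ = 450       # maximum quoted clock frequency


def implied_figures(mw_per_dmips, mw_per_mhz, f_mhz):
    """Derive DMIPS/MHz, total DMIPS and total power from the two ratings.

    Assumption: both power ratings refer to the same operating point.
    """
    dmips_per_mhz = mw_per_mhz / mw_per_dmips   # (mW/MHz) / (mW/DMIPS)
    dmips = dmips_per_mhz * f_mhz               # throughput at f_mhz
    power_mw = mw_per_mhz * f_mhz               # core power at f_mhz
    return dmips_per_mhz, dmips, power_mw


if __name__ == "__main__":
    ratio, dmips, power = implied_figures(MW_PER_DMIPS, MW_PER_MHZ, F_MAX_MHZ)
    print(f"{ratio:.2f} DMIPS/MHz, {dmips:.0f} DMIPS, {power:.0f} mW at {F_MAX_MHZ} MHz")
```

Under those assumptions the numbers work out to about 1.52 DMIPS/MHz, roughly 680 DMIPS, and just under 200 mW for the hard core at full speed.
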
"Ed McGettigan" <ed.mcgettigan@xilinx.com> wrote in message 
news:4240455E.1020602@xilinx.com...
> The V-4 PowerPC 405 in comparison displaces only 672 slices,
> consumes 0.29mW/DMIP (0.44mW/MHz), runs up to 450 MHz and places
> and routes in less then a second. Just try to get a soft processor to
> match that. :)

I think the PPC cores are a nice feature and are well executed. That said, 672 displaced slices are sufficient to hold two (or three austere) 32-bit pipelined RISC soft cores (requiring say 1 BRAM each), each running at ~1/3 of the PPC freq. So, for some applications (e.g. small memory footprint code and data 'controllers' that fit in a BRAM), the hard core is not a big (order of magnitude) win on MIPS/area. Can't 'speak to power' -- the hard processor core is surely much lower power. Properly RPM'd, a compact soft processor core will PAR in negligible time.

Certainly the PPC core(s) are vastly more attractive targets for COTS software tools and OSs and infrastructure (docs, developer expertise, ...).

See also [http://www.fpgacpu.org/log/feb01.html#010210]: "... this counterintuitive rule of thumb: one streamlined 32-bit soft CPU core optimized for programmable logic might need only half the silicon area of an elaborate 32-bit hard CPU core!"

Jan Gray

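The MIPS/area argument above can be written out as a small calculation. This is only a sketch: it assumes each soft core retires as many instructions per clock as the hard core (the `ipc_ratio` parameter, which the post does not quantify), and the function names are hypothetical.

```python
HARD_CORE_SLICES = 672   # slices displaced by the hard PowerPC 405 (from the thread)


def slices_per_soft_core(n_cores, area_slices=HARD_CORE_SLICES):
    """Slice budget per soft core if n_cores share the hard core's footprint."""
    return area_slices / n_cores


def relative_throughput(n_cores, freq_ratio, ipc_ratio=1.0):
    """Aggregate soft-core throughput relative to one hard core.

    freq_ratio: soft clock / hard clock (the post suggests about 1/3).
    ipc_ratio:  soft instructions-per-clock / hard IPC (assumed 1.0 here).
    """
    return n_cores * freq_ratio * ipc_ratio


if __name__ == "__main__":
    for n in (2, 3):
        print(f"{n} cores: {slices_per_soft_core(n):.0f} slices each, "
              f"{relative_throughput(n, 1 / 3):.2f}x hard-core throughput")
```

With two cores the aggregate comes to about two thirds of the hard core's throughput, and with three austere cores it reaches parity, which is why the hard core is not an order-of-magnitude win on MIPS/area for workloads that split cleanly across small controllers.
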
"Jan Gray" <jsgray@acm.org> schrieb im Newsbeitrag
news:%wY%d.1173$z.144@newsread2.news.atl.earthlink.net...
> "Ed McGettigan" <ed.mcgettigan@xilinx.com> wrote in message
> news:4240455E.1020602@xilinx.com...
> > The V-4 PowerPC 405 in comparison displaces only 672 slices,
> > consumes 0.29mW/DMIP (0.44mW/MHz), runs up to 450 MHz and places
> > and routes in less then a second. Just try to get a soft processor to
> > match that. :)
>
> I think the PPC cores are a nice feature and are well executed. That said,
> 672 displaced slices are sufficient to hold two (or three austere) 32-bit
> pipelined RISC soft cores (requiring say 1 BRAM each), each running at ~1/3
> of the PPC freq. So, for some applications (e.g. small memory footprint
> code and data 'controllers' that fit in a BRAM), the hard core is not a big
> (order of magnitude) win on MIPS/area. Can't 'speak to power' -- the hard
> processor core is surely much lower power. Properly RPM'd, a compact soft
> processor core will PAR in neglible time.
>
> Certainly the PPC core(s) are vastly more attractive targets for COTS
> software tools and OSs and infrastructure (docs, developer expertise, ...).
>
> See also [http://www.fpgacpu.org/log/feb01.html#010210]: "... this
> counterintuitive rule of thumb: one streamlined 32-bit soft CPU core
> optimized for programmable logic might need only half the silicon area of an
> elaborate 32-bit hard CPU core!"
>
> Jan Gray

LOL, mercy mercy, :)
MicroBlaze is definitely more than 672/2 slices!

But I think I agree that the rule of thumb is OK!

BTW Jan, I guess you are one of the few who could correctly answer the following FPGA-Quiz question:

How many slices are needed to implement a frequency divider by 2^37?

ANSWER:
Number of Slices:            3 out of 1408   0%
Number of Slice Flip Flops:  2 out of 2816   0%
Number of 4 input LUTs:      6 out of 2816   0%
Number of bonded IOBs:       1 out of  140   0%
Number of GCLKs:             1 out of   16   6%

The above is the synthesis report for divide by 2^n, n=21..37.
P&R shows 3 slices for V2Pro or 4 slices for S3.

Antti

Hi Antti,

> LOL, mercy mercy, :)
> MicroBlaze is is defenetly more than 672/2 slices !

You could use ERIC5... but it does not really compare to a PowerPC ;-)

> but I think I agree that the rule of thumb is OK!
>
> btw Jan I guess you are one of the few who could correctly
> answer the following FPGA-Quiz question:
>
> How many slices are needed to implement frequency divider by 2^37 ?
>
> ANSWER:
> Number of Slices:            3 out of 1408   0%
> Number of Slice Flip Flops:  2 out of 2816   0%
> Number of 4 input LUTs:      6 out of 2816   0%
> Number of bonded IOBs:       1 out of  140   0%
> Number of GCLKs:             1 out of   16   6%
>
> the above is synthesis report for divide by 2^n, n=21..37
> P&R shows 3 slices for V2Pro or 4 slices for S3

This makes me curious: Is there other stuff like BRAM involved? Otherwise you HAVE to tell us how you do that (I would simply claim that this is not possible...)

Thomas

www.entner-electronics.com

>> but I think I agree that the rule of thumb is OK!
>>
>> btw Jan I guess you are one of the few who could correctly
>> answer the following FPGA-Quiz question:
>>
>> How many slices are needed to implement frequency divider by 2^37 ?
>>
>> ANSWER:
>> Number of Slices:            3 out of 1408   0%
>> Number of Slice Flip Flops:  2 out of 2816   0%
>> Number of 4 input LUTs:      6 out of 2816   0%
>> Number of bonded IOBs:       1 out of  140   0%
>> Number of GCLKs:             1 out of   16   6%
>>
>> the above is synthesis report for divide by 2^n, n=21..37
>> P&R shows 3 slices for V2Pro or 4 slices for S3
>
> This makes me curious: Is there other stuff like BRAM
> involved? Otherwise you HAVE to tell us how you do that (I would simply
> claim that this is not possible...)
>
> Thomas
>
> www.entner-electronics.com

Hi Antti,

as a long time Altera-user I just remembered that the Xilinx-slices support distributed RAM and stuff... I suppose you take advantage of that. Still very impressive!

Thomas
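
Antti does not say in this thread how the divider is built, so the following is not his design. It is only a sketch of the resource arithmetic behind Thomas's reaction. It takes the 2 flip-flops and 2 LUTs per slice implied by the synthesis report (2816 of each in 1408 slices) and assumes that every LUT could be used in 16-bit shift-register mode (SRL16), which is one reading of the "distributed RAM and stuff" guess. The function names are hypothetical.

```python
import math

FFS_PER_SLICE = 2    # from the report: 2816 flip-flops in 1408 slices
LUTS_PER_SLICE = 2   # from the report: 2816 LUTs in 1408 slices
SRL16_BITS = 16      # assumption: a LUT in shift-register mode stores 16 bits


def slices_for_binary_counter(n_bits):
    """Lower bound on slices if every counter bit needs its own flip-flop."""
    return math.ceil(n_bits / FFS_PER_SLICE)


def state_bits_available(luts, ffs):
    """Upper bound on state bits if every LUT is used as a 16-bit shift register."""
    return luts * SRL16_BITS + ffs


if __name__ == "__main__":
    print("plain 37-bit counter:", slices_for_binary_counter(37), "slices at least")
    print("state in 6 LUTs + 2 FFs:", state_bits_available(6, 2), "bits at most")
```

A conventional 37-bit binary counter would need at least 19 slices for its flip-flops alone, while six LUTs in shift-register mode plus two flip-flops could hold up to 98 bits of state. That is why a 3-slice result looks impossible with flip-flops only but plausible once the LUTs themselves store state.
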