Hi all,

My goal was to port an assembler implementation that was running on an ARM chip to the PowerPC 405 architecture on a Xilinx Virtex-II Pro. It is working now, but for some reason the PowerPC implementation is more than 10 times slower than the ARM implementation. I can only think of 3 reasons:

1) In the ARM implementation I save 16 registers on the stack, whereas I save 32 registers when working with the PowerPC.

    mflr 0            // save register and set up the stack frame
    stw 0, 4(1)
    addi 1, 1, -124
    stmw 3, 8(1)
    // do some stuff
    lmw 3, 8(1)       // restore registers and destroy the stack frame
    addi 1, 1, 124
    lwz 0, 4(1)
    mtlr 0

This is how I set up and destroy the stack frame in every routine call. It shouldn't really have a big impact on performance, should it?

2) The multiplier in the PowerPC architecture. The ARM multiplier has a latency of 5 instructions, but if I am not completely wrong, the multiplier in the PowerPC 405 also has a latency of 5 clock cycles, so this should not be the issue?

3) Clock frequency: I used the EDK Base System Builder, where the external clock is 24 MHz, and I told the tool that I also want the PowerPC to run at this clock frequency. This seems to be the most likely source of the problem, which I will have to check now.

However, if anybody has another idea why the PowerPC implementation is so slow, I would be thankful for some hints.

Thanks!
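To separate reason 3 from reasons 1 and 2, it may help to compare the two implementations in clock cycles rather than in wall-clock time. The following is only a minimal sketch in C: the function names `cycles_from_time` and `clock_only_slowdown` are made up for illustration, and the measured run times and the ARM clock frequency have to be supplied by whoever runs the comparison, since they are not given in this post.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a measured run time into CPU clock cycles.
 * elapsed_us: wall-clock time of the routine in microseconds
 * clock_hz:   core clock frequency in Hz (24000000 for the PowerPC here)
 * Hypothetical helper, not part of EDK or any vendor library. */
uint64_t cycles_from_time(uint64_t elapsed_us, uint64_t clock_hz)
{
    assert(clock_hz > 0);
    return (elapsed_us * clock_hz) / 1000000u;
}

/* Slowdown that the clock ratio alone would explain.
 * fast_hz: clock of the faster platform (the ARM clock, an assumption)
 * slow_hz: clock of the slower platform (the PowerPC clock) */
double clock_only_slowdown(double fast_hz, double slow_hz)
{
    assert(slow_hz > 0.0);
    return fast_hz / slow_hz;
}
```

For example, `cycles_from_time(1000, 24000000)` gives 24000 cycles for a routine that takes 1 ms at 24 MHz. If the cycle counts on both chips turn out to be similar, the clock setting explains the gap; a purely hypothetical ARM clock of 240 MHz would make `clock_only_slowdown(240e6, 24e6)` equal to 10 on its own. If the PowerPC needs far more cycles, the stack frame handling or the multiplier deserves a closer look.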
PowerPC 405 Problem on Xilinx Virtex II FPGA
Started by ●February 16, 2009