FPGARelated.com
Forums

UltraController II + SystemAce

Started by Patrick Dubois August 25, 2006
Hello everyone,

First of all, big thanks to everyone who contributes to this newsgroup.
Most of the time, I can find a solution to my problem just googling
this newsgroup. Not this time however...

I know that this is a long post but if you can help with the
UltraController II + SystemAce or UC2 + xmd, please read on.

I'm currently trying to get a system with the UltraController II up and
running. To make a long story short (hopefully), I started with the
tools at v7.1, the UC2 reference design, and an eval board from Avnet.
I was able to successfully build a small system using the UC2 and a
Wishbone bus. I always ran the UC2 code using the debugger however (XMD
+ GDB).

Confident that the UC2 was a good solution for me (as I don't have
external ram), I spent a few months incorporating the UC2 in my real
design (3 FPGAs, two VP40 and one VP7). I used SystemC to simulate the
UC2 and have now reached the point of trying it out in hardware. That's
when the problems started...

First of all, xmd can't even detect the PowerPCs in the JTAG chain in
the big design. So I took a step back and went back to the Avnet eval
board to play around more and get to understand the flow better. Right
now I'd like to accomplish two things:
1- I'd like to be able to create a SystemAce file that loads both the
bit config file and the elf program file at boot-up (for the Avnet
board with a single VP7).
2- I'd like to understand how xmd can load and run my software without
the GDB debugger.

For the SystemAce, I used the modified genace.tcl script that comes
with the reference design but it doesn't work. To be more precise, I
can generate the ace file, the SystemAce loads everything in the FPGA
fine (done led goes up), but my code doesn't boot. The hardware part
seems okay because one of my debug leds turns on (attached directly to
VCC in the fpga). Since the code doesn't boot, I tried to use xmd to
connect to the PowerPC. However, xmd doesn't detect the PowerPC ("ERROR:
Unable to connect to PowerPC target. Invalid Processor Version No
0x00000000"). That's strange because if I then proceed to load the bit
file using iMPACT, then xmd can detect the PowerPC fine.

Now for xmd, I'd like to know how to load and run my code with it. I
followed the instructions in XAPP571 (to properly use the DEBUGHALT
controller), but it doesn't work. I tried a few things but I'm really
in the dark as I don't have a deep understanding of the whole EDK flow
(which is why I chose the UltraController in the first place).

I apologize for the long post but I wanted to give as much detail as
possible to whoever could help me... By the way, I opened several
webcases with Xilinx to help me along the way to where I am now, but I
thought I'd give this newsgroup a shot for any UC2 experts out there.

Thanks.

Patrick Dubois

Patrick,

I don't have answers to all your questions, but here is some info:

> For the SystemAce, I used the modified genace.tcl script that comes
> with the reference design but it doesn't work. To be more precise, I
> can generate the ace file, the SystemAce loads everything in the FPGA
> fine (done led goes up), but my code doesn't boot. The hardware part
> seems okay because one of my debug leds turns on (attached directly to
> VCC in the fpga). Since the code doesn't boot, I tried to use xmd to
> connect to the PowerPC. However, xmd doesn't detect the PowerPC ("ERROR:
> Unable to connect to PowerPC target. Invalid Processor Version No
> 0x00000000"). That's strange because if I then proceed to load the bit
> file using iMPACT, then xmd can detect the PowerPC fine.

I haven't worked with SystemACE, but as has been discussed here multiple times in the past, the DONE pin going high doesn't really mean the configuration has finished; rather, it says that it is about to finish. In other words, you might be missing a few more clock cycles to finalize the configuration.

> I tried a few things but I'm really
> in the dark as I don't have deep understanding of the whole EDK flow
> (which is why I choose the UltraController in the first place).

Ultracontroller is a neat idea, but it might be much easier to do a design that would use BRAMs instead of trying to fit everything in cache. Since PPC is a 32-bit processor, the code tends to grow pretty quickly, making it difficult to fit anything useful in cache alone. Also, if I understand correctly, the cache cannot be initialized from the bit file, which makes the initial boot cumbersome.

I have designed a board where I originally planned to use UC2, but then switched to a regular PPC design with PLB_BRAM.

/Mikhail

MM wrote:

> I haven't worked with SystemACE, but as was discussed here multiple times in
> the past, the DONE pin going high doesn't really mean the configuration has
> been finished, it rather says that it is about to be finished. In other
> words you might be missing a few more clock cycles to finalize the
> configuration.

Thanks for the info, I wasn't aware of that. I also have an LED that is driven to '1' in the FPGA fabric and this LED turns on. It's still no guarantee that the whole chip is configured, I guess...

> Ultracontroller is a neat idea, but it might be much easier to do a design
> that would use BRAMs instead of trying to fit everything in cache. Since PPC
> is a 32-bit processor, the code tends to grow pretty quickly making it
> difficult to fit anything useful in cache alone. Also, if I understand
> correctly, the cache cannot be initialized from the bit file, which makes
> the initial boot cumbersome. I have designed a board where I originally
> planned to use UC2, but then switched to a regular PPC design with PLB_BRAM.

Yes, I'm starting to realize that the UC2 maybe wasn't such a great idea after all. You are correct, the cache cannot be initialized from the bit file, which is the source of most of my problems (for example, you can't use impact to generate the SystemAce file, you need to use a tcl script).

I still wish that I could use the UC2, as I don't really want to deal with the OPB/PLB buses right now. Creating a full-blown PowerPC design in EDK doesn't seem like an easy task. 32 inputs and 32 outputs is just what I need (and my understanding is that there is no faster way to toggle such pins than with the UC2). As far as code size is concerned, I already have a pretty complete design and I'm using about 50% of the code cache (plenty of data cache left), with Size Optimization. I also would like to keep my BRAMs free, as I'm doing large FFTs which make heavy use of BRAMs.

Another point in favor of the UC2 is that I have a really simple SystemC model of the UC2 right now. I can do relatively fast co-simulations of my design. I heard that using the Swift models to simulate the PowerPC is really slow...

Thanks for your input MM.

Patrick

Patrick Dubois wrote:

> [...]
> You are correct, the cache cannot be initialized from
> the bit file, which is the source of most of my problems (for example,
> you can't use impact to generate the SystemAce file, you need to use a
> tcl script).
> [...]

Patrick,

PPC caches __can__ be initialized:
1) from the BIT file using the USR_ACCESS (and a bridge IP to a JTAG master)
2) from ACE using the PPC ICE registers directly

But sure, the documentation and tools to do this are not the very best, which means possibly months of time wasted to get it all working properly.

But it is possible. I looked at the USR_ACCESS to JTAG gateway IP core, and I also know enough about the undocumented PPC ICE registers that I am confident it's all doable.

Antti

Antti wrote:

> PPC caches __can__ be initialized
> 1) from BIT file using the USR_ACCESS (and bridge IP to JTAG master)
> 2) from ACE using PPC ICE registers directly
>
> but sure, the documentation and tools todo this
> are not the very best == means possible months
> of time wasted to get it all working properly.
>
> but it is possible. I looked at the USR_ACCESS
> to JTAG gateway ip core, and I also know enough
> about the undocumented PPC ICE registers that
> I am confident its all doable.

Undocumented registers, I sure hope that I won't have to dig that deep ;)

The trick that is used with the UC2 + SystemAce is the loading of cache through JTAG commands. I'm not sure of the implementation details though, as the UC2 comes in "prepackaged" ngc files. XAPP575 is the app note dealing with the UC2. It all seems real nice, but it doesn't work for me...

Thanks Antti.

Patrick

Patrick,

> I still wish that I could use the UC2, as I don't really want to deal
> with the OPB/PLB buses right now. Creating a full-blown PowerPC design
> in EDK doesn't seem like an easy task.

It is actually very easy to create a basic design using the wizard. It is certainly much easier than to deal with any non-standard configuration such as UC2.

> I also
> would like to keep my BRAMs free, as I'm doing large FFTs which make
> heavy use of BRAMs.

I can't argue with this, but if your code fits into 8K, your BRAM usage won't be huge...

> Another point in favor of the UC2 is that I have a really simple
> SystemC model of the UC2 right now. I can do relatively fast
> co-simulations of my design. I heard that using the Swift models to
> simulate the PowerPC is really slow...

Being able to run full simulation is cool, but do you really need to simulate it at all? After all nobody runs Pentium Verilog simulation to debug a Windows application...

Finally, you can always go back to UC2 when you know more...

/Mikhail

MM wrote:

> It is actually very easy to create a basic design using the wizard. It is
> certainly much easier than to deal with any non-standard configuration such
> as UC2.

It seemed easier to use the UC2 at the time (which is one of the reasons the UC2 was created in the first place, I think), but now I would probably agree with you that the standard flow is easier.

> I can't argue with this, but if your code fits into 8K, your BRAM usage
> won't be huge...

Agreed, maybe 8-10 BRAMs I guess.
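
The BRAM estimate above can be sanity-checked with a little arithmetic. The sketch below is only a rough count: it assumes the 18 Kbit block RAMs of Virtex-II Pro with 16 Kbit (2 KB) usable as data when parity bits are ignored, and the function name is made up for illustration. It does not account for bus width or how the BRAM controller groups blocks.

```python
import math

# Assumption: one block RAM holds 16 Kbit of data (18 Kbit including parity).
BRAM_DATA_BITS = 16 * 1024


def brams_needed(num_bytes, bram_data_bits=BRAM_DATA_BITS):
    """Smallest number of block RAMs that can hold num_bytes of memory."""
    return math.ceil(num_bytes * 8 / bram_data_bits)


code_only = brams_needed(8 * 1024)               # the 8K of code mentioned above
code_and_data = brams_needed(8 * 1024 + 8 * 1024)  # hypothetical extra 8K for data/stack

print(code_only)       # 4
print(code_and_data)   # 8
```

Under these assumptions, 8K of code alone needs 4 block RAMs; a figure of 8-10 is plausible once some data and stack space is budgeted as well.
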
> Being able to run full simulation is cool, but do you really need to
> simulate it at all? After all nobody runs Pentium Verilog simulation to
> debug a Windows application...

Well, most of my design is hardware. The UC2 is only kind of a big state machine. The simulation is more meant to validate the hardware design. I agree that there is usually no need to simulate the hardware to debug the software. Without the UC2 simulation though, it would be harder for me to debug the hardware design. I guess that I could build a testbench, generating Wishbone bus transactions for each module. But it's just easier to use SystemC to link everything together.

Thanks again Mikhail.

Patrick

Patrick,

What board are you targeting?

If you can send me your design I'll have a look to see what might be wrong.

It looks like you can download the bitstream with XMD and then connect 
to the PPC with XMD. You can download and run software to the PPC 
through XMD with

$ xmd
XMD% connect ppc hw -debugdevice icachestartadr <adr> dcachestartadr <adr>
XMD% dow <elf file>
XMD% run

You might have a look at the xmd.ini file to see what the correct 
"connect" parameters are.

You can also have a look at XAPP807. It is a small system that makes use 
of the UC2 and the TEMAC.


- Peter



Peter,

Thanks for offering your help. I'll send you my design by e-mail
shortly.

I'm targeting our own board but for the time being, I'm just trying to
make something work on the "Avnet Virtex-II Pro Evaluation Board".

I tried the instructions you provided for xmd, and everything works
except for the "run" part. I get something like this:

XMD% run
PC reset to 0xffff3ffc, Clearing MSR Register
Processor started. Type "stop" to stop processor
RUNNING>
XMD%
Processor stopped at PC: 0xffff0000

The processor stopped instantly. I then tried to reset and start again:

XMD% rst
Target reset successfully
XMD% run
PC reset to 0xffff3ffc, Clearing MSR Register
Processor started. Type "stop" to stop processor
RUNNING>

Now this 2nd time the processor seems to run, but I don't get the
normal "running" behavior that I get in GDB. The program is quite
simple, it mirrors the state of the 8 dip switches to the 8 leds on the
board (infinite loop). The leds all stay off with the above steps
(which indicates that the code is not running properly). I get the same
behavior when I try to load everything through SystemAce.

Is there a special step to do when compiling "release" code?

I'll get in touch with you by e-mail shortly. Thanks Peter.

Patrick

Patrick,

wrt the XMD commands I was a little bit sloppy. Please have a look at 
the xmd.ini that ships as part of XAPP575 to see what the correct 
commands are. If you have the xmd.ini in the directory where you start 
xmd it will automatically pick it up and connect to the UC2.

The correct short form of what is in xmd.ini is:
$ xmd
XMD% connect ppc hw -debugdevice icachestartadr 0xffff0000 dcachestartadr 0xffff8000
XMD% xwreg 0 msr 0x40000
XMD% xwreg 0 msr 0
XMD% con

The xwreg lines toggle the WE bit on the processor to deassert 
DEBUGHALT. I believe that's what was missing in your commands below, 
i.e. DEBUGHALT is still asserted and the processor stops right on the 
next instruction.
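
To see where the value 0x40000 in the xwreg lines comes from, here is a small sketch of the bit arithmetic. It assumes that WE (Wait State Enable) is bit 13 of the 32-bit MSR in IBM bit numbering, where bit 0 is the most significant bit; please check this against the PPC405 manual. The helper name is hypothetical.

```python
def ibm_bit_mask(bit, width=32):
    """Mask for one bit in IBM (big-endian) numbering, where bit 0 is the MSB."""
    return 1 << (width - 1 - bit)


MSR_WE_BIT = 13  # assumption: position of WE in the PPC405 MSR

we_mask = ibm_bit_mask(MSR_WE_BIT)
print(hex(we_mask))  # 0x40000

# The two xwreg writes amount to setting WE and then clearing it again.
msr = 0
msr |= we_mask    # corresponds to: xwreg 0 msr 0x40000
msr_after_set = msr
msr &= ~we_mask   # corresponds to: xwreg 0 msr 0
msr_after_clear = msr
print(hex(msr_after_set), hex(msr_after_clear))  # 0x40000 0x0
```

Writing the whole register as in the XMD commands also clears every other MSR bit, which is harmless here since the processor has just been connected and not yet started.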

I had a quick look at the design files. The custom board file you use 
does not map the caches to the correct addresses (it does not map them 
at all). It might be easier if you add your board to genace.tcl and use 
one of the existing boards in there as an example.
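
For reference, the address windows implied by the connect parameters above can be worked out as follows. This is only a sketch: the 16 KB cache size is an assumption (it should match the PPC405 instruction and data caches, but verify it for your device), the names are hypothetical, and this is not what genace.tcl itself does.

```python
CACHE_SIZE = 16 * 1024      # assumption: 16 KB per cache

ICACHE_START = 0xFFFF0000   # icachestartadr from the connect command
DCACHE_START = 0xFFFF8000   # dcachestartadr from the connect command


def window(start, size=CACHE_SIZE):
    """First and last byte address of a cache window."""
    return start, start + size - 1


def in_window(addr, start, size=CACHE_SIZE):
    """True if addr falls inside the window beginning at start."""
    return start <= addr < start + size


print([hex(a) for a in window(ICACHE_START)])  # ['0xffff0000', '0xffff3fff']
print([hex(a) for a in window(DCACHE_START)])  # ['0xffff8000', '0xffffbfff']

# Last 32-bit word of the instruction window.
last_word = ICACHE_START + CACHE_SIZE - 4
print(hex(last_word))                          # 0xffff3ffc
print(in_window(0xFFFF3FFC, ICACHE_START))     # True
```

Under that assumption, the "PC reset to 0xffff3ffc" message in the XMD log earlier in the thread is the last word of the instruction window, and "Processor stopped at PC: 0xffff0000" is its first address. Code or data linked outside these windows will not land in the caches.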

Unfortunately, I do not have one of the Avnet boards at hand at the 
moment and will hunt for one on Monday if your problems persist.

- Peter

