Reply by -jg January 9, 2008

Rgr wrote:
> "John_H" <newsgroup@johnhandwork.com> wrote in message
> news:lt6dncs0fYiIrx_anZ2dnUVZ_j-dnZ2d@comcast.com...
> > Rgr wrote:
> >> Hi.
> >>
> >> I would like to hear your opinion on the possibility of implementing a
> >> processor in a CPLD?
> >> The functionality does not have to be greater than the old 8051 CPU, but
> >> I would like the flexibility and the possibility of adding additional
> >> logic to my design.
> >>
> >> Has someone worked on this issue, or have an opinion on how to complete
> >> this task?
> >>
> >> Looking forward to your replies
> >> Best Regards
> >
> > The Xilinx PicoBlaze or the open-source Mico-8 from Lattice should both be
> > achievable in a CPLD but most CPLDs don't have memory.
> >
> > While it's more of a simple FPGA than an ASIC, the Altera Max-II series of
> > "CPLDs" has some user Flash memory available on-chip. Most CPLDs will
> > require external memory.
> >
> > - John_H
>
> Thank you both for your very useful replies.
> I can see the benefits in utilizing the Max-II series, but have they made a
> soft-core processor usable for these CPLD's? Like the PicoBlaze?
You can probably find an example of the Mico8 on the Lattice MachXO series.

However, if you want low power, a CPLD is not a good path, and in projects of any reasonable size you will have to add external code storage. That means more power, more cost, and more EMC problems. It is much better to choose a small uC that has code memory on-chip and is already power-optimised and EMC-minimised.

Use the FPGA when you have a problem you cannot solve with standard devices.

-jg
Reply by Rgr January 8, 2008
"Andreas Ehliar" <ehliar-nospam@isy.liu.se> wrote in message 
news:slrnfo6obf.luh.ehliar-nospam@sabor.isy.liu.se...
> On 2008-01-07, Rgr <rgrworking@hotmail.com> wrote:
>> I would like to hear your opinion on the possibility of implementing a
>> processor in a CPLD?
>
> We are actually doing this all the time in a course we are giving here
> called Digital project laboratory. The CPLDs we are using are the XC9572
> and XC95108.
> [...]
Thanks for the tips; there are some good ideas among them. And thank you all for the answers. I have searched the web sites and come up with some possible solutions.

Best Regards
Reply by Ben Jackson January 8, 2008
On 2008-01-08, Andreas Ehliar <ehliar-nospam@isy.liu.se> wrote:
> * Budget your design for the number of macroblocks. One macroblock ==
>   one flip-flop plus some logic in front of it.

Very important. And as you hint later, with only 36-108 flops, you can't make much of a clock divider. If you want to operate at human-visible speeds, give yourself a slow clock.
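To put rough numbers on the divider cost: a binary divider needs about log2(ratio) flip-flops, so getting from 4 MHz down to 1 Hz takes 22 of them (2^22 = 4,194,304), while a 32.768 kHz watch crystal needs only 15. The sketch below is only an illustration, not code from anyone in this thread; the module name, the DIVIDE parameter and the tick output are made up for the example, and it assumes DIVIDE is at least 2.

```verilog
// Hypothetical divider: one clk-wide pulse on "tick" every DIVIDE input clocks.
// With DIVIDE = 32768 this costs 15 counter flip-flops plus 1 for tick,
// i.e. at least 16 of the 72 macrocells in an XC9572.
module slow_tick #(
    parameter DIVIDE = 32768               // assumed clk cycles per tick (>= 2)
) (
    input  wire clk,
    output reg  tick
);
    localparam WIDTH = $clog2(DIVIDE);     // 15 for DIVIDE = 32768

    // Initial value is for simulation; real hardware would rely on the
    // device's power-up state or on an explicit reset.
    reg [WIDTH-1:0] count = {WIDTH{1'b0}};

    always @(posedge clk) begin
        tick <= (count == DIVIDE - 1);
        if (count == DIVIDE - 1)
            count <= {WIDTH{1'b0}};
        else
            count <= count + 1'b1;
    end
endmodule
```

Starting from a slower oscillator shrinks WIDTH directly, which is the point of the advice above.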
> * Avoid complex expressions. You will notice that especially adders can
>   be very expensive (or constructs that infer an adder like less than
>   or greater than). An example:

On the other hand, you can use much bigger combinatorial expressions in each macrocell of a CPLD than you can in the LUT of a typical FPGA. My experience is that you can fit a lot more logic and a lot less storage in a CPLD than you might initially expect.
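As a small illustration of that trade-off, here is a sketch of the kind of wide term that suits a macrocell: a single AND across many inputs, which would take several cascaded 4-input LUTs in a typical FPGA. The module, the signal names and the 0xFF00-0xFFFF address window are all assumptions made up for the example.

```verilog
// Hypothetical address decoder: active-low select for an assumed I/O page
// at 0xFF00-0xFFFF, qualified by an active-low read strobe.
module addr_decode (
    input  wire [15:0] addr,
    input  wire        rd_n,       // active-low read strobe
    output wire        io_sel_n    // active-low select
);
    // addr[15:8] == 8'hFF together with rd_n low is one 9-input AND,
    // i.e. a single product term with no flip-flop involved.
    assign io_sel_n = ~((addr[15:8] == 8'hFF) & ~rd_n);
endmodule
```

Whether the fitter really keeps this in one macrocell depends on the device and the tool settings, so treat it as a starting point and check the fitter report.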
> exploring) the limits of the hardware we are using.
You could look at my website for such an example. I fit a PCI target in an XC95108.

-- 
Ben Jackson AD7GD
<ben@ben.com>
http://www.ben.com/
Reply by Andreas Ehliar January 8, 2008
On 2008-01-07, Rgr <rgrworking@hotmail.com> wrote:
> Hi.
>
> I would like to hear your opinion on the possibility of implementing a
> processor in a CPLD?
> The functionality does not have to be greater than the old 8051 CPU, but I
> would like the flexibility and the possibility of adding additional logic to
> my design.
>
> Has someone worked on this issue, or have an opinion on how to complete this
> task?

We are actually doing this all the time in a course we are giving here called Digital project laboratory. The CPLDs we are using are the XC9572 and XC95108. (They are old, but they operate on 5V, which is a huge advantage for us since we have many other 5V components.)

Our students commonly use two XC95108s for a complete microcoded processor: one mostly used for microcode and one mostly used for ALU stuff. Some students use many more CPLDs to implement graphics and audio as well, but that is not so common.

A few students manage to fit a complete RISC-like processor into a single XC9572.

Some things to keep in mind when designing a processor (or any logic, for that matter) for an XC95xx CPLD:

* Budget your design for the number of macroblocks (macrocells). One macroblock == one flip-flop plus some logic in front of it.

* Avoid complex expressions. You will notice that adders in particular can be very expensive (as can constructs that infer an adder, like less than or greater than). An example:

    reg [7:0] counter;
    reg       outsignal;

    always @(posedge clk) begin
      counter <= counter + 1;
      if ((counter > 13) && (counter < 131)) begin
        outsignal <= 1;
      end else begin
        outsignal <= 0;
      end
    end

    // Refactor this as:
    // (outsignal now changes one clock earlier than in the version above)
    always @(posedge clk) begin
      counter <= counter + 1;
      if (counter == 13) begin
        outsignal <= 1;
      end
      if (counter == 130) begin
        outsignal <= 0;
      end
    end

  The latter version can sometimes reduce the number of macrocells, although there is no guarantee of it. (If we didn't use 13 and 131 but something like 63 and 192 instead, it is likely that the first version would be more optimal than the second, for example.)

* Reuse logic (macrocells) if possible. An example of commonly used logic in a simple (non-pipelined) processor:

    reg [15:0] pc;
    reg [15:0] address_reg;
    reg [15:0] external_addr;

    // Resets not shown for clarity
    always @(posedge clk) begin
      if (incpc)        pc <= pc + 1;
      else if (resetpc) pc <= 0;
      else if (jump)    pc <= jumpaddr;
    end

    always @(posedge clk) begin
      if (loadaddr) address_reg <= accumulator;
    end

    always @* begin
      if (fetch) external_addr = pc;
      else       external_addr = address_reg;
    end

  Assuming address_reg and pc are the same width: this code will use at least 16 macrocells for pc (actually more due to the adder), 16 macrocells for address_reg and 16 macrocells for the MUX in front of external_addr, which is going out to the memory. This code can be changed to something like this:

    reg [15:0] reg1;
    reg [15:0] reg2;

    always @* external_addr = reg1;

    always @(posedge clk) begin
      if (swap) begin
        reg1 <= reg2;
        reg2 <= reg1;
      end else begin
        if (inc1) reg1 <= reg1 + 1;

        if (load2_pc)        reg2 <= jumpaddr;
        else if (load2_addr) reg2 <= accumulator;
        else if (zero2)      reg2 <= 0;
      end
    end

  By reorganizing the code as above you will complicate your control path by quite a lot, but you will save the 16 macrocells caused by the MUX, since reg1 is directly connected to external memory. We also make sure that we don't complicate the macrocells used in the adder, by putting all other operations into the reg2 macrocells. As a bonus we get address_reg++ for free...

* Bit-serial arithmetic can be good in CPLDs in some situations. As an example, I have a small hobby project running where I have fitted an entire digital watch into a single XC9572 (although it requires a very non-standard clock frequency to work, as I didn't have enough space left to fit a clock divider). The watch can both show time (hours, minutes and seconds) and be used as a timer (minutes, seconds, tenths). There is also an alarm which functions on (hours, minutes). These units can be used independently (i.e., the watch still ticks in the background if I'm setting the alarm or using the timer, and the alarm will work even though I'm using the timer). (Actually, I haven't wired up any circuit yet, but it is working in simulation and the design fits into an XC9572.) The time is presented on a display driven by 9368 7-segment decoders, so my design doesn't need to include decimal to 7-segment decoders. (Although depending on this is a bit of a cheat...)

  I'm using bit-serial arithmetic in order to fit everything into the XC9572. I do not believe a more parallel solution would work in this very constrained space, although digit-serial might work as well.

  At some point I'm planning to write a small web page to detail this project; unfortunately I don't have time to describe it in more detail in this already long post.

It is rare that students do any of the above, though, but this is what I've learned while supervising students and thinking about (and sometimes exploring) the limits of the hardware we are using.

/Andreas
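For readers who have not met bit-serial arithmetic, here is a minimal sketch of the idea. It is not the watch design described above (that code was not posted); the module and signal names are invented for illustration. The operands arrive one bit per clock, least significant bit first, so the arithmetic itself needs only one full adder and one carry flip-flop no matter how wide the numbers are.

```verilog
// Hypothetical bit-serial adder, LSB first.
module serial_adder (
    input  wire clk,
    input  wire start,   // high while the LSBs are presented: ignore old carry
    input  wire a,       // operand A, one bit per clock
    input  wire b,       // operand B, one bit per clock
    output wire sum      // result bit, valid in the same clock cycle
);
    reg  carry;                              // carry out of the previous bit
    wire cin = start ? 1'b0 : carry;         // no carry into the LSB

    assign sum = a ^ b ^ cin;                // full-adder sum

    always @(posedge clk)
        carry <= (a & b) | (a & cin) | (b & cin);   // full-adder carry
endmodule
```

The price is time and control logic: an N-bit addition takes N clocks, and the operands still have to live somewhere, typically in shift registers that cost one flip-flop per bit. The saving is in the adder logic, which is where a parallel design tends to eat product terms.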
Reply by Herbert Kleebauer January 7, 2008
Rgr wrote:
 
> I would like to hear your opinion on the possibility of implementing a
> processor in a CPLD?
> The functionality does not have to be greater than the old 8051 CPU, but I
> would like the flexibility and the possibility of adding additional logic to
> my design.
>
> Has someone worked on this issue, or have an opinion on how to complete this
> task?
If you have at least 65 flip flops and a few hundred gates available on your CPLD you can try: ftp://137.193.64.130/pub/mproz/mproz3_e.pdf
Reply by Kris Vorwerk January 7, 2008
> I would like to hear your opinion on the possibility of implementing a > processor in a CPLD?
As an alternative to CPLDs, have you considered Actel's Igloo FPGAs? They're small-to-medium-sized FPGAs with a very low power footprint (and you can use an ARM Cortex-M1 processor on them).

http://www.actel.com/products/igloo/

K.
Reply by Jecel January 7, 2008
This one is specially designed for small CPLDs:

http://www.opencores.org/projects.cgi/web/mcpu/overview

You will need an external memory, however. And note that it is far
simpler and more limited than an 8051.

-- Jecel
Reply by John_H January 7, 2008
On Jan 7, 8:12 am, "Rgr" <rgrwork...@hotmail.com> wrote:
> Thank you both for your very useful replies.
> I can see the benefits in utilizing the Max-II series, but have they made a
> soft-core processor usable for these CPLD's? Like the PicoBlaze?
>
> Best Regards
The PicoBlaze is intended for a Xilinx target. The design uses Xilinx primitives that may not map over to an Altera design. The Mico-8 is open source and does not use vendor-specific primitives.

CPLDs are typically not well suited to processor instantiation, so you'll commonly only see soft processors in FPGAs. It's because the Max-II parts are more like early FPGAs that the fit might be reasonable.

Have you considered a tiny FPGA rather than a CPLD? Having embedded RAM can really help out. If you want to access more code than would conveniently fit in the on-board RAM, you can use the same SPI flash that programs the FPGA to hold additional user data, which you access through an SPI interface.

Perhaps the additional flexibility from an FPGA (versus CPLD) is worth considering.

- John_H
Reply by Rgr January 7, 2008
"John_H" <newsgroup@johnhandwork.com> wrote in message 
news:lt6dncs0fYiIrx_anZ2dnUVZ_j-dnZ2d@comcast.com...
> Rgr wrote:
>> Hi.
>>
>> I would like to hear your opinion on the possibility of implementing a
>> processor in a CPLD?
>> The functionality does not have to be greater than the old 8051 CPU, but
>> I would like the flexibility and the possibility of adding additional
>> logic to my design.
>>
>> Has someone worked on this issue, or have an opinion on how to complete
>> this task?
>>
>> Looking forward to your replies
>> Best Regards
>
> The Xilinx PicoBlaze or the open-source Mico-8 from Lattice should both be
> achievable in a CPLD but most CPLDs don't have memory.
>
> While it's more of a simple FPGA than an ASIC, the Altera Max-II series of
> "CPLDs" has some user Flash memory available on-chip. Most CPLDs will
> require external memory.
>
> - John_H

Thank you both for your very useful replies.
I can see the benefits in utilizing the Max-II series, but have they made a soft-core processor usable for these CPLD's? Like the PicoBlaze?

Best Regards
Reply by John_H January 7, 2008
Rgr wrote:
> Hi.
>
> I would like to hear your opinion on the possibility of implementing a
> processor in a CPLD?
> The functionality does not have to be greater than the old 8051 CPU, but I
> would like the flexibility and the possibility of adding additional logic to
> my design.
>
> Has someone worked on this issue, or have an opinion on how to complete this
> task?
>
> Looking forward to your replies
> Best Regards

The Xilinx PicoBlaze or the open-source Mico-8 from Lattice should both be achievable in a CPLD but most CPLDs don't have memory.

While it's more of a simple FPGA than an ASIC, the Altera Max-II series of "CPLDs" has some user Flash memory available on-chip. Most CPLDs will require external memory.

- John_H