I got the Spartan-3 Starter Kit yesterday from Xilinx. This board is a really good bargain: an XC3S200 and 1 MB SRAM for just $99. This board makes it hard for guys like Tony Burch or me to sell FPGA boards ;-( Only the Flash is a little bit small: not much space is left for application data.

However, the board and the documentation are fine. It took me only half a day to port JOP (a Java processor) from the Altera Cyclone to the Spartan (thanks to Ed Anuff, who did the hard part and wrote a memory generator for Xilinx). It needed just two Xilinx-specific files, for the top level and the memory interface. You can find a Xilinx ISE project for JOP on this board under xilinx/s3sk.

If you have such a board and want to try out JOP:

  Download the JOP sources from: http://www.jopdesign.com/download.jsp
  Compile the ISE project under ../xilinx/s3sk
  Download JOP to the FPGA
  Connect a serial cable from your PC to the board
  Open a command prompt in ../java/target
  Change the COM port in doit.bat
  Type: doit test test Clock

That's it, a small Java program should now run on the Spartan!

Martin
----------------------------------------------
JOP - a Java Processor core for FPGAs:
http://www.jopdesign.com/
JOP on Spartan-3 Starter Kit
Started by ●October 1, 2004
Reply by ●October 1, 2004
> However, the board and the documentation are fine. It took me only half a
> day to port JOP (a Java processor) from the Altera Cyclone to the Spartan
> (thanks to Ed Anuff who did the hard part and wrote a memory generator
> for Xilinx). Just two Xilinx specific files for the top-level and the
> memory interface. You can find a Xilinx ISE project under xilinx/s3sk for
> JOP on this board.

For those who are interested in a short comparison between Cyclone and Spartan-3:

  Cyclone EP1C6Q240C6: fmax: 98 MHz, 2066 LC/Es (34% out of 5980)
  Spartan-3 XC3S200-5: fmax: 82 MHz, 2015 LC/Es (52% out of 3840)

For the LC/E comparison I count a 4-input LUT with a register as one LC/E. The CLB or slice numbers are just confusing. We can see that JOP needs about the same resources in the A and X devices.

Both devices used are the fastest speed grade available. Is the Cyclone, although 'older', faster than the Spartan-3?

It's interesting when we compare the two device families with respect to LC/Es and memory. (For memory I count KBytes, not bits, and ignore the 9th parity bit... Why do I need a parity bit for the block RAM? Is there also a parity protection for the SRAM based configuration?)

  XC3S50:    1536 LC/Es,  4*2KB   =  8KB,    4 HW multipliers
  EP1C3:     2910 LC/Es, 13*0.5KB =  6.5KB
  XC3S200:   3840 LC/Es, 12*2KB   = 24KB,   12 HW multipliers
  EP1C4:     4000 LC/Es, 17*0.5KB =  8.5KB
  EP1C6:     5980 LC/Es, 20*0.5KB = 10KB
  XC3S400:   7168 LC/Es, 16*2KB   = 32KB,   16 HW multipliers
  EP1C12:   12060 LC/Es, 52*0.5KB = 26KB
  XC3S1000: 15360 LC/Es, 24*2KB   = 48KB,   24 HW multipliers
  EP1C20:   20060 LC/Es, 64*0.5KB = 32KB
  XC3S1500: 26624 LC/Es, 32*2KB   = 64KB,   32 HW multipliers

When we order the parts with respect to LC/E count they alternate in a nice way. Does that mean that our design complexity determines the choice? Not that easy. The X parts have more memory per LC and additional multipliers. However, I don't have prices, a very important 'feature', handy for all these devices :-)

Martin
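The "more memory per LC" remark can be checked directly from the numbers in the list above. The following is a minimal sketch in Python, not something from the original post: the figures are copied from the device list, while the helper name `kb_per_klc` and the metric "KB of block RAM per 1000 LC/Es" are assumptions chosen for illustration.

```python
# Device data copied from the list in the post:
# (name, LC/Es, number of block RAMs, KB per block, parity bits ignored).
devices = [
    ("XC3S50", 1536, 4, 2.0),
    ("EP1C3", 2910, 13, 0.5),
    ("XC3S200", 3840, 12, 2.0),
    ("EP1C4", 4000, 17, 0.5),
    ("EP1C6", 5980, 20, 0.5),
    ("XC3S400", 7168, 16, 2.0),
    ("EP1C12", 12060, 52, 0.5),
    ("XC3S1000", 15360, 24, 2.0),
    ("EP1C20", 20060, 64, 0.5),
    ("XC3S1500", 26624, 32, 2.0),
]


def kb_per_klc(lcs, blocks, kb_per_block):
    """Hypothetical metric: KB of block RAM per 1000 LC/Es."""
    return blocks * kb_per_block / (lcs / 1000.0)


# Ratio for every device, keyed by part name.
ratios = {name: kb_per_klc(lcs, blocks, kb) for name, lcs, blocks, kb in devices}

for name, lcs, blocks, kb in devices:
    print(f"{name:9s} {lcs:6d} LC/Es {blocks * kb:5.1f} KB "
          f"{ratios[name]:5.2f} KB per 1000 LC/Es")
```

With the figures as listed, every Spartan-3 entry comes out above every Cyclone entry on this ratio. The multipliers and device prices are not part of this sketch.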
Reply by ●October 1, 2004
Martin Schoeberl wrote:
> I got the Spartan-3 Starter Kit yesterday from Xilinx. This board is a
> really good bargain: A XC3S200 and 1MB SRAM for just $ 99,-. This board
> makes it hard for guys like Tony Burch or me to sell FPGA boards ;-(
> Only the Flash is a little bit small.... Not too much space left for
> application data. However, the board and the documentation are fine.

Good to know that people like it, because I'm also "seriously buying" it!

However, being a complete newbie to FPGAs, I would like to know what range of applications this "DO-SPAR3-DK with XC3S200 FT256 Xilinx Spartan-3 FPGA" (just to make sure that we are speaking of the same device!) is good for.

For example, at Xilinx's site there is a list of various (mainly third party) processor cores, ranging from the MC68000 to the Z80:
http://www.xilinx.com/xlnx/xebiz/search/searchresult.jsp?sGlobalNavPick=&sSecondaryNavPick=&category=-538081125&iLanguageID=1&_ResultsView=Standard&_IPSubcategory=Processor+Core&_IPCategory=Microprocessor+Controller+and+Peripheral&_IPProducts=Core

And for example, in CAST Inc.'s C68000 data sheet
http://www.xilinx.com/bvdocs/ipcenter/data_sheet/CAST_C68000.pdf
there is "Table 1: Example Implementation Statistics", where the lowest-end device listed is the Spartan-IIE XC2S400E-7.

Does this mean that it is impossible to fit C68000 into XC3S200 which has only half of the system gates of XC2S400E-7 ? (I don't know whether the gate counts between Spartan-IIE and Spartan-3 series compare linearly.)

Same problem with many other CAST's processor cores mentioned: 80C51, TMS32025 and "Z80 Compatible Microprocessor" CZ80CPU, the data sheets mention only Spartan-3 XC3S400-4 and Spartan-IIE XC2S300E-7 and some larger Virtex-II's as Example Devices on which to implement them.

Does this mean that XC3S200 has not enough logic to implement ANY of these or just that CAST Inc. didn't have XC3S200-device at hand, and thus haven't tested their designs on it?

Also, most of the games and platforms mentioned at:
http://www.fpgaarcade.com/
seem to be implemented on at least a 300K-gate device. So is this 200K-gate XC3S200 just a little bit too small for them? (Hmm... although the "Space Invaders" page:
http://home.freeuk.com/fpgaarcade/spc_main.htm
mentions: "As so few of the available logic elements are used, a much cheaper FPGA could be used along with external memory device(s)." So there is some hope.)

Also, one important question: What is the maximum speed this XC3S200 can be clocked with?

Yours,
Antti.
Reply by ●October 1, 2004
> Good to know that people like it, because I'm
> also "seriously buying" it!

Yes, the board is cool, it's just incredibly cheap...

> Does this mean that it is impossible to fit C68000
> into XC3S200 which has only half of the system gates
> of XC2S400E-7 ?
> (I don't know whether the gate counts between
> Spartan-IIE and Spartan-3 series compare linearly.)

The simplest way to check is to download the Xilinx ISE software (it's free) and compile your design. You will see how it fits and whether any resources are left.

> Same problem with many other CAST's processor cores
> mentioned: 80C51, TMS32025 and "Z80 Compatible Microprocessor"
> CZ80CPU, the data sheets mention only Spartan-3 XC3S400-4 and
> Spartan-IIE XC2S300E-7 and some larger Virtex-II's as Example
> Devices on which to implement them.
>
> Does this mean that XC3S200 has not enough logic
> to implement ANY of these or just that CAST Inc.
> didn't have XC3S200-device at hand, and thus
> haven't tested their designs on it?

I expect the XC3S200 should do it, since I can easily fit a 32-bit CPU in it.

> Also, one important question:
> What is the maximum speed this XC3S200 can
> be clocked with?

That really depends on your design. As above, run it through the (free) synthesizer and you will get the numbers.

Martin
Reply by ●October 1, 2004
"Martin Schoeberl" <martin.schoeberl@chello.at> wrote in message news:OCl7d.329123$vG5.190330@news.chello.at...
[snip]
> Both devices used are the fastest speed grade available. Is the Cyclone,
> although 'older', faster than the Spartan-3?

As a quick aside, Cyclone has three speed grades, Spartan-3 only two. In general, a speed grade represents about a 15% difference in performance.

Slowest vs. slowest speed grade would be interesting.

---------------------------------
Steven K. Knapp
Applications Manager, Xilinx Inc.
General Products Division
Spartan-3/II/IIE FPGAs
http://www.xilinx.com/spartan3
---------------------------------
Spartan-3: Make it Your ASIC
Reply by ●October 1, 2004
Hi Martin,

> Cyclone EP1C6Q240C6:
> fmax: 98 MHz, 2066 LC/Es (34% out of 5980)
> Spartan-3 XC3S200-5
> fmax: 82 MHz, 2015 LC/Es (52% out of 3840)

By turning on Minimize Area w/Chains under Fitter Settings/More Settings.../Auto Packed Registers - Cyclone you can cut the LE count to 1868 LEs (from 2066). Quartus doesn't try too hard to put registers & LUTs together unless it runs out of room (or you tell it to with this setting). In my compile, this didn't hurt Fmax (Fmax was 99 MHz). On average, aggressive packing can slightly hurt performance and cause an increase in wiring.

By turning on the "Area" mapping option in synthesis (instead of Balanced), this drops further to 1775 LEs. Fmax = 95 MHz.

Just pointing out that without even looking at the HDL, there are ways to tweak the LE/Fmax trade-off. I'm sure there are some such tricks for Xilinx too.

To automatically try-out the area optimization tricks in Quartus, run the Design Space Explorer tool, and select "Area Optimization" mode under the Advanced settings. It'll take a while, but this will find you the best settings (for area) for your design.

> Both devices used are the fastest speed grade available. Is the Cyclone,
> although 'older', faster than the Spartan-3?

Yes. This performance result is actually pretty poor as far as Cyclone vs. Spartan-3 goes. We see an average of 80% better performance -- yes, that's 1.8X Fmax -- when comparing the fastest speed grades of the two chips with default "push-button" results from Quartus & ISE over a suite of 49 designs. Another way of looking at it is that the slowest Cyclone speed-grade out-performs the fastest Spartan-3 speed-grade by a considerable margin. See
http://www.altera.com/products/devices/performance/lowcost_performance/per-lowcost_performance_fpga.html
for details.

In this particular case, your critical path appears to stretch from a RAM to a RAM (configured as a ROM) with little logic in-between. Logic + routing-rich paths tend to accentuate the speed differences between the two devices, while RAM-heavy paths show a smaller advantage.

> When we order the parts with respect to LC/E count they alternate in a
> nice way. Does that mean that our design complexity determines the
> choice?
> Not that easy. The X parts have more memory per LC and additional
> multipliers. However, I don't have prices, a very important 'feature',
> handy for all these devices :-)

And it also depends on which speed grade you need to buy to meet your performance -- can you get by with a slower speed-grade in Cyclone than you'd need in Spartan-3? Or maybe with the faster Cyclone chips you may be able to get away with a wider bus (less demultiplexing), resulting in fewer LEs but a higher clock speed. Picking a chip ain't easy... so just go with Altera ;-)

Regards,
Paul Leventis
Altera Corp.
Reply by ●October 2, 2004
Paul Leventis (at home) wrote:
> Hi Martin,
>
>>Cyclone EP1C6Q240C6:
>> fmax: 98 MHz, 2066 LC/Es (34% out of 5980)
>>Spartan-3 XC3S200-5
>> fmax: 82 MHz, 2015 LC/Es (52% out of 3840)
>
<snip>
> To automatically try-out the area optimization tricks in Quartus, run
> the Design Space Explorer tool, and select "Area Optimization" mode under
> the Advanced settings. It'll take a while, but this will find you the best
> settings (for area) for your design.

Can you try that, and report back the speed gain, and how long it took to find this? (IIRC you mentioned +37% in another post?)

<snip>
> In this particular case, your critical path appears to stretch from a RAM to
> a RAM (configured as a ROM) with little logic in-between. Logic +
> routing-rich paths tend to accentuate the speed differences between the two
> devices, while RAM-heavy paths show a smaller advantage.

Do you have tips for Martin on how to improve this for Cyclone specific cases? I.e., should the ROM change to logic-based, rather than RAM-based, or would a pipeline stage help?

-jg
Reply by ●October 2, 2004
> Why do I need a parity bit for the block RAM?

They are often useful for other things.

On FIFOs/buffers:
  End of packet flag.
  In-band vs out-of-band signaling.
  Used/free flags.

Just plain more bits (wider) for things like table driven state machines.

--
The suespammers.org mail server is located in California. So are all my other mailboxes. Please do not send unsolicited bulk e-mail or unsolicited commercial e-mail to my suespammers.org address or any of my other addresses. These are my opinions, not necessarily my employer's. I hate spam.
Reply by ●October 2, 2004
> > In this particular case, your critical path appears to stretch from a RAM to
> > a RAM (configured as a ROM) with little logic in-between. Logic +
> > routing-rich paths tend to accentuate the speed differences between the two
> > devices, while RAM-heavy paths show a smaller advantage.
>
> Do you have tips for Martin on how to improve this for Cyclone
> specific cases ? - ie should the ROM change to logic-based, rather than
> RAM based, or would a pipeline stage help ?

The critical path runs from the bytecode RAM (the instruction cache for the processor), which has a registered address but unregistered data, out through a 'large' table: a jump table that maps bytecode instructions to microcode addresses.

I was thinking of adding another pipeline stage in this path. However, then the bytecode branches take one more cycle. When I added a register in this stage, the critical path moved to the ALU and fmax was 106 MHz. Not a big win, and it showed that the pipeline is not so badly balanced.

If you have another good idea, I would be happy to make JOP faster :-)

Martin
--
----------------------------------------------
JOP - a Java Processor core for FPGAs:
http://www.jopdesign.com/
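The trade-off described in this post (higher fmax against one extra cycle per bytecode branch) can be put into a rough break-even estimate. This is only a sketch under stated assumptions: the 98 MHz baseline is taken from the earlier Cyclone result and is assumed to be the design the 106 MHz figure should be compared with, each taken bytecode branch is assumed to cost exactly one extra cycle, and nothing else in the pipeline is assumed to change. The function names are made up for this illustration.

```python
def break_even_branch_rate(f_old_mhz, f_new_mhz):
    """Extra branch cycles per baseline cycle at which both designs tie.

    Old run time: N / f_old.  New run time: (N + branches) / f_new.
    Setting them equal gives branches / N = f_new / f_old - 1.
    """
    return f_new_mhz / f_old_mhz - 1.0


def speedup(f_old_mhz, f_new_mhz, branches_per_cycle):
    """Run-time ratio old/new; a value above 1 means the deeper pipeline wins."""
    time_old = 1.0 / f_old_mhz
    time_new = (1.0 + branches_per_cycle) / f_new_mhz
    return time_old / time_new


# Numbers from the thread: 98 MHz without and 106 MHz with the extra register.
limit = break_even_branch_rate(98.0, 106.0)
print(f"break-even: {100.0 * limit:.1f} branch cycles per 100 baseline cycles")
print(f"speedup with no branches at all: {speedup(98.0, 106.0, 0.0):.3f}")
```

Under these assumptions the extra stage only pays off when fewer than about 8 bytecode branches occur per 100 baseline cycles, which fits the "not a big win" remark.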
Reply by ●October 2, 2004
> As a quick aside, Cyclone has three speed grades, Spartan-3 only two. In
> general, a speed grade represents about a 15% difference in performance.
>
> Slowest vs. slowest speed grade would be interesting.

Ok, here it is:

  Cyclone slowest (-8): 77.5 MHz
  Spartan slowest (-4): 77.8 MHz

Now it looks better for X...

And now let's throw in some price numbers. Prices are for single units from arrow.com and avnet.com, both devices in the same package (TQFP144):

  Cyclone:   EP1C6T144C6:     $41.60
  Cyclone:   EP1C6T144C8:     $27.70
  Spartan-3: XC3S200-4TQ144C: $19.93 (no price for the -5 speed grade)

And relate the price to density and speed in a 'funny' way, as price / 1000 LCs / MHz:

  EP1C6-6:   $41.60 / 5.980 kLC / 98 MHz   = 7.1 cent / kLC / MHz
  EP1C6-8:   $27.70 / 5.980 kLC / 77.5 MHz = 5.98 cent / kLC / MHz
  XC3S200-T: $19.93 / 3.840 kLC / 77.8 MHz = 6.7 cent / kLC / MHz

and now it looks better for A again...

I did not take into account the multipliers and larger memories in the Spartan, but also not the fact that the Cyclones have been available for a longer time (I got my first Cyclone samples 01/2003 and sold the first boards 02/2003 :-)

Martin
--
----------------------------------------------
JOP - a Java Processor core for FPGAs:
http://www.jopdesign.com/
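The 'funny' price metric from this post is easy to reproduce. Below is a small Python sketch using only the prices, LC counts, and JOP fmax values quoted in the thread; the dictionary layout and the function name `cents_per_klc_mhz` are assumptions made for illustration, and the Spartan-3 entry uses the -4 speed grade numbers because no -5 price was given.

```python
# (price in USD, LC count in thousands, JOP fmax in MHz), all from the thread.
parts = {
    "EP1C6-6": (41.60, 5.980, 98.0),
    "EP1C6-8": (27.70, 5.980, 77.5),
    "XC3S200-4": (19.93, 3.840, 77.8),
}


def cents_per_klc_mhz(price_usd, klc, fmax_mhz):
    """Price in cents, divided by thousands of LCs and by fmax in MHz."""
    return 100.0 * price_usd / klc / fmax_mhz


# Metric for each part, keyed by part name.
metric = {name: cents_per_klc_mhz(*values) for name, values in parts.items()}

# Print the parts from cheapest to most expensive on this metric.
for name, value in sorted(metric.items(), key=lambda item: item[1]):
    print(f"{name:10s} {value:5.2f} cent / kLC / MHz")
```

The ranking depends entirely on the fmax that one particular design (JOP) reaches, so a different design could reorder the parts.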






