FPGARelated.com
Forums

Why 8 clock trees in Xilinx Spartan-3 device?

Started by fp July 22, 2006
Synchronous design, in which all FFs are synchronized by a single
clock, is perhaps the single most important methodology.  In a Xilinx
Spartan-3 FPGA, there are 8 clock trees, even for the smallest device.
What is the purpose of these clocks?
- Could someone provide me a few meaningful examples (not sloppy design)
that use multiple clocks?
- Can anyone from Xilinx explain the rationale behind these many clock
trees? (to accommodate bad habits?)

To my understanding, multiple clock domains are needed primarily for
1. extremely large designs with large clock skew.
2. power-aware design.
Clearly clock skew is not an issue for Spartan-3.  Are these clock
trees used for power purposes?

Thank you for your help.

S. C.

"fp" <fpga002006@yahoo.com> writes:
> - Could someone provide me a few meaningful examples (not sloppy design)
> that use multiple clocks?

I have a design with a processor core (one clock domain), two MII interfaces for Ethernet (four clock domains, one for each transmit and receive interface), and a VGA interface (one clock domain). That's a total of six clock domains. Everything gets synchronized to the processor core clock domain by FIFOs, but there is various logic on the interface side of the FIFOs which must be in the interface's clock domain.
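
A FIFO that bridges two clock domains needs its read and write pointers to be sampled safely from the other side. The post does not say how its FIFOs are built; one common approach (assumed here purely for illustration) is to Gray-code the pointers so that only one bit changes per increment. The sketch below models that property in Python; `ADDR_BITS` is a hypothetical pointer width.

```python
def bin_to_gray(n: int) -> int:
    """Convert a non-negative binary count to its Gray-code equivalent."""
    return n ^ (n >> 1)


def gray_to_bin(g: int) -> int:
    """Recover the binary count from a Gray-coded value."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def bits_changed(a: int, b: int) -> int:
    """Count how many bit positions differ between two values."""
    return bin(a ^ b).count("1")


ADDR_BITS = 4           # hypothetical pointer width (16-entry FIFO)
DEPTH = 1 << ADDR_BITS

# Every increment, including the wrap from DEPTH-1 back to 0,
# flips exactly one bit of the Gray-coded pointer.
for i in range(DEPTH):
    nxt = (i + 1) % DEPTH
    assert bits_changed(bin_to_gray(i), bin_to_gray(nxt)) == 1
    assert gray_to_bin(bin_to_gray(i)) == i
```

Because only one bit is in flight at any time, a pointer sampled in the other clock domain is either the old or the new value, never an unrelated one. That is the property a dual-clock FIFO of this style depends on.
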
> What is the purpose of these clocks?
> - Could someone provide me a few meaningful examples (not sloppy design)
> that use multiple clocks?

Well, sure, there are plenty ... Even if your main core is fully synchronous, you have to "talk" to the external world, and that's often where clocks are needed.

Let's imagine a simple application: a PCI video card with some hw acceleration (like decompression, or some drawing primitive or filtering ...). You'll have:
- The PCI clock
- The pixel clock
- The core clock, which may be different because you want the core to run faster than 33 MHz and not have a dynamic frequency like the pixel clock. That's also going to be the main memory clock.
- Probably a 90° shifted clock and a dynamically shifted clock for the DDR logic and data capture.

That's already 5 clocks ... and for quite a simple design. It only has one input (the PCI) and one output (the VGA or DVI). And here the core is fully synchronous.

Sometimes, even in "nice" (i.e. not sloppy ;) designs, you will have multiple clocks. Imagine a design where part of the process involves a bit-by-bit operation (like an entropy coder); you might want to be able to push that part to a higher frequency ...

Sylvain

"fp" <fpga002006@yahoo.com> wrote:

> Synchronous design, in which all FFs are synchronized by a single
> clock, is perhaps the single most important methodology. In a Xilinx
> Spartan-3 FPGA, there are 8 clock trees, even for the smallest device.
> What is the purpose of these clocks?
> - Could someone provide me a few meaningful examples (not sloppy design)
> that use multiple clocks?
> - Can anyone from Xilinx explain the rationale behind these many clock
> trees? (to accommodate bad habits?)
>
> To my understanding, multiple clock domains are needed primarily for
> 1. extremely large designs with large clock skew.
> 2. power-aware design.
> Clearly clock skew is not an issue for Spartan-3. Are these clock
> trees used for power purposes?

Higher clock speeds lead to less efficient use of FPGA resources. The design I'm currently working on uses 4 different clock speeds: 6MHz, 25MHz, 50MHz, 100MHz. The 6MHz clock is used for uC configuration stuff (many registers, but no hurry), the 25MHz clock to run a PicoBlaze, 50MHz for generic logic and 100MHz for anything that needs to go really fast (like a DDR memory interface and some digital filters).

In my past experience with Spartan 2 and Virtex devices, I've noticed that choosing the lowest clock frequency leads to being able to pack a lot more logic into an FPGA, because the router can use the least efficient leftovers for the slow stuff. The downside is that you'll need to cross clock boundaries every now and then. However, routing time is shortened considerably. A design which uses 64% (equivalent gate count: 285000) of a Spartan3-200k device routes in under 3 minutes on a P4 1.6GHz.

-- 
Reply to nico@nctdevpuntnl (punt=.)
Businesses and shops can be found at www.adresboekje.nl

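
To put numbers on the "no hurry" argument above, the sketch below converts the four quoted clock speeds into clock periods. It assumes the simplest model, in which the period is the upper bound on register-to-register delay and setup time, clock uncertainty and similar margins are ignored. The domain labels paraphrase the post; the names `CLOCKS_MHZ` and `period_ns` are illustrative.

```python
# Clock speeds quoted in the post, in MHz, keyed by what each clock drives.
CLOCKS_MHZ = {
    "uC configuration registers": 6,
    "PicoBlaze": 25,
    "generic logic": 50,
    "DDR interface and digital filters": 100,
}


def period_ns(freq_mhz: float) -> float:
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / freq_mhz


# Simplified per-domain timing budget: one full clock period.
budgets = {name: period_ns(f) for name, f in CLOCKS_MHZ.items()}

fastest = min(budgets.values())
for name, budget in budgets.items():
    # Show how much more path delay each domain tolerates than the fastest one.
    print(f"{name:35s} {budget:7.2f} ns  ({budget / fastest:4.1f}x the fastest budget)")
```

Under this simplified model the 6MHz logic tolerates paths nearly 17 times slower than the 100MHz logic, which fits the observation that the router can give it the leftover routing.
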
S. C.

One use I was given is that there is more than one design in the chip. 
Sort of like an apartment building, with different tenants.

Another use is for standard interfaces for which you do not get to choose 
the frequency:  SDRAM at 266 MHz, PCI at 33 MHz, and a SONET core at 78 
MHz, all working together to perform some overall function.

Austin

8 clock trees implemented because you could often use 9. Many thanks to 
Xilinx for not implementing 16 trees or Murphy would dictate 17 and any 
number > 1 is potential trouble. 


Thanks, everyone, for the replies.  The info is helpful.

There is one follow-up question for Austin.  In the Spartan-3 family, most
hardware resources increase as the device becomes larger.  If we
compare the 3S50 and 3S5000:
 - system gates: 50K vs 5M
 - block RAM bits: 72K vs 1872K
 - multipliers: 4 vs 104
 - user I/O: 124 vs 784

However, the number of clock trees remains the same (8) for all devices.
Is there a reason for this?
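
The comparison above can be expressed as scaling factors between the two devices. This small sketch uses only the figures quoted in the post (block RAM in Kbit as written there); the dictionary name `RESOURCES` and the helper `scaling_factor` are illustrative.

```python
# (3S50, 3S5000) figures exactly as quoted in the post.
RESOURCES = {
    "system gates": (50_000, 5_000_000),
    "block RAM (Kbit)": (72, 1872),
    "multipliers": (4, 104),
    "user I/O": (124, 784),
    "clock trees": (8, 8),
}


def scaling_factor(small: float, large: float) -> float:
    """How many times larger the big device's resource count is."""
    return large / small


scaling = {name: scaling_factor(*pair) for name, pair in RESOURCES.items()}

for name, factor in scaling.items():
    print(f"{name:18s} x{factor:.1f}")
```

Every resource except the clock trees grows by roughly 6x to 100x across the family, which is what makes the constant count of 8 stand out.
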

Thanks in advance.

S. C.


Yes,

It is called "simplicity."

If the clock tree were to vary, it would make the assembly of each 
family member part even more of a hassle (than it already is), and the 
software would also have to be "different" for each member of the family 
(more than it already is).

Thus, to make the resource the same for the smallest or largest part may 
be seen as a waste, yet it is the only practical way to manufacture a 
family of devices, and support them.

The same is true for the routing resources:  a small part does not 
"need" all we provide, and a "large" part has perhaps too little (for 
some designs).

Austin

"fp" <fpga002006@yahoo.com> writes:

> Synchronous design, in which all FFs are synchronized by a single
> clock, is perhaps the single most important methodology. In a Xilinx
> Spartan-3 FPGA, there are 8 clock trees, even for the smallest device.
> What is the purpose of these clocks?
> - Could someone provide me a few meaningful examples (not sloppy design)
> that use multiple clocks?

How about one of our image processing designs:
- 3x async camera inputs (yes, it would be great to have synchronous cameras, but it doesn't happen) @ 30MHz
- 1x DSP memory-mapped IO interface @ 100MHz
- 1x VGA frame-buffer @ 100MHz - synchronous to the DSP clock - yay!
- 1x VGA output @ (800x600x75)=36MHz

I had to move the cameras into the DSP clock domain, process data, pass on to the DSP for further processing, then get "pixels to plot" back from the DSP to put into the VGA frame-buffer. This data was then streamed out through a FIFO to the VGA RAMDAC.

I don't think I could have got away with fewer domains, but with careful planning of the interfaces, it all worked.

Cheers,
Martin

-- 
martin.j.thompson@trw.com
TRW Conekt - Consultancy in Engineering, Knowledge and Technology
http://www.trw.com/conekt

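
The 36MHz figure quoted for the VGA output is the product of the visible resolution and the refresh rate. The sketch below reproduces that arithmetic only; it counts active pixels and deliberately ignores horizontal and vertical blanking, so a real pixel clock for the same mode would be higher. The function name `active_pixel_rate_hz` is illustrative.

```python
def active_pixel_rate_hz(width: int, height: int, refresh_hz: int) -> int:
    """Visible pixels per second: width x height x refresh rate.

    Blanking intervals are not included, so this is a lower bound
    on the actual pixel clock rather than the clock itself.
    """
    return width * height * refresh_hz


rate = active_pixel_rate_hz(800, 600, 75)
print(f"{rate / 1e6:.0f} MHz of active pixel data")
```
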
Thanks for your time, Austin.

Now I see why the 3S50 has 8 clock trees.

S. C.

