Hi all, I have a question about Xilinx Virtex-II FPGAs. In general, is there an easy way to estimate the power increase from using clock enables vs. generating multiple internal clocks? Has anyone had any experience with coding a design both ways and looking at the power increase? I assume that there must be some power increase because the clock is now driving the input stage of all the flops, even though internally it is gated by the enable. Thanks, Jon
Clock Enables and Power
Started by ●April 19, 2004
Reply by ●April 19, 2004
Hi Jon,

You could try using the Xilinx Power Estimator tool. I'm not sure that you're right that several separate clocks are better than a clock-enabled design. Here's my reasoning.

A major part of the power consumption is the energy used when a flip-flop changes state, and this is the same for both designs. So the only difference between the two methods is the power to charge and discharge the global clock networks in the former case versus the power to charge and discharge the clock enable signals in the latter case. I doubt that there's much difference.

Why not try the power estimator and report back?

cheers, Syms.
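The "charge and discharge" argument above can be made a little more concrete with the usual first-order model of CMOS switching power. This is only a sketch; the symbols are not from the thread and are introduced here for illustration:

$$P_{\text{dyn}} \approx \alpha \, C \, V^{2} \, f$$

Here $C$ is the capacitance of the net being switched, $V$ is the supply voltage, $f$ is the clock frequency, and $\alpha$ is the average number of full charge/discharge cycles the net makes per clock period. A clock net has $\alpha = 1$, while a clock enable or data net that changes only occasionally has a much smaller $\alpha$. Under this model, the comparison comes down to how much capacitance each scheme switches and how often.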
Reply by ●April 19, 2004
Symon wrote:
> So, the only difference between the two methods is the difference in power to charge and discharge the global clock networks for the former case, and the power to charge and discharge the clock enable signals in the latter case. I doubt that there's much difference.

Actually, it may be the other way around. Driving the global nets is likely to take more power than driving the inputs of the FFs. With multiple clocks, multiple sets of clock lines have to be driven, and that has to be weighed against the extra power of driving the FF clock inputs at the fast rate. I guess it may depend on the relative speeds of the clocks.

Which will take more power,

$$a \cdot N \cdot F_1 \quad \text{or} \quad (b + a \cdot N) \cdot F_2$$

where $a$ is the power coefficient for a FF input only, $N$ is the number of FFs enabled at the lower speed, $F_1$ is the high-speed clock frequency, $b$ is the power coefficient for driving a clock line, and $F_2$ is the low-speed clock frequency? Another way to express this is

$$\frac{a}{b} \cdot N \;\overset{?}{\lessgtr}\; \frac{F_2}{F_1 - F_2}$$

Or the breakeven point would be

$$N = \frac{b \cdot F_2}{a \cdot (F_1 - F_2)}$$

If N is greater than this, the enabled FFs will use more power. If N is less than this, the enabled FFs will use less power.

Obviously it is not a simple choice, but depends on several aspects of your design and the FPGA. The design will even affect a and b somewhat, since FFs in more columns will require more column lines to be driven. I am not sure the calculator will consider all these effects. But the timing analyzer will. Too bad they can't combine the timing analysis with power estimation. I think there is a lot more design info available in the timing analyzer that could be used to calculate power consumption.

--
Rick "rickman" Collins
rick.collins@XYarius.com
Ignore the reply address. To email me use the above address with the XY removed.
Arius - A Signal Processing Solutions Company
Specializing in DSP and FPGA design
URL http://www.arius.com
4 King Ave, Frederick, MD 21701-3110
301-682-7772 Voice
301-682-7666 FAX
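As a minimal sketch, rickman's comparison and breakeven point can be written out in Python. The function names are hypothetical, and the coefficient values in the example are placeholders chosen only to exercise the arithmetic; they are not measured Virtex-II figures.

```python
def enabled_power(a, n_ffs, f_fast):
    """Relative power of n_ffs flip-flops left on the fast clock and gated by a CE.

    Model from the post: a * N * F1 (only the FF clock inputs are counted).
    """
    return a * n_ffs * f_fast


def separate_clock_power(a, b, n_ffs, f_slow):
    """Relative power of n_ffs flip-flops moved to their own slow clock.

    Model from the post: (b + a * N) * F2, where b is the cost of the extra clock line.
    """
    return (b + a * n_ffs) * f_slow


def breakeven_ffs(a, b, f_fast, f_slow):
    """Number of flip-flops at which both schemes cost the same in this model."""
    if not f_fast > f_slow > 0:
        raise ValueError("expected f_fast > f_slow > 0")
    return (b * f_slow) / (a * (f_fast - f_slow))


if __name__ == "__main__":
    # Placeholder coefficients (assumptions, not device data).
    a, b = 1.0, 1000.0
    f_fast, f_slow = 100e6, 25e6
    n_star = breakeven_ffs(a, b, f_fast, f_slow)
    print(f"breakeven at about {n_star:.1f} flip-flops")
```

Above the breakeven count the clock-enabled scheme costs more in this model; below it, the separate slow clock costs more. The model deliberately ignores the power of the CE net itself and the column-line effects mentioned in the post.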
Reply by ●April 19, 2004
rickman wrote:
> Too bad they can't combine the timing analysis with power estimation. I think there is a lot more design info available in the timing analyzer that could be used to calculate power consumption.

Expanding on this, there was data posted here not long ago about the relative power of a 'true clock net' vs. a signal used as a clock.

ISTR someone from Altera also mentioned a tool/floorplan approach that tries to pack logic onto physical clock branches/stubs, and so avoids driving unused clock lines. It would suit a stable design, and one where the saving was worth the effort.

It is also good to see IC vendors starting to quote clock power figures for enabled and disabled counters; that gives a feel for the ratios of .CLK and .Q power capacitances.

They could easily add this to the power estimator/post-route analyser. Probably just needs customer demand.... :)

-jg
Reply by ●April 19, 2004
Clock Enables vs multiple clocks is a trade-off.

If you are not concerned about power, then a single low-skew global clock and a "sloppier" network of CEs requires the least amount of thinking. Multiple derived clocks mean that you have to think about clock transfer from one clock domain to the next, and you may have to use multiple PLL/DLL/DCMs.

In the extreme case, using separate clocks instead of CEs will always save power. Think of a design with 10 flip-flops clocked at 200 MHz and the remaining 500 flip-flops clocked at 1 MHz. It sure would reduce power when the fast clock is only routed to the 10 flip-flops and the remaining 500 get a slow clock (vs. 200 MHz all over the chip, plus a 1 MHz CE signal to most flip-flops).

Peter Alfke
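Peter's numbers can be plugged into the rough model rickman gave earlier in the thread, where enabled flip-flops on the fast clock cost $a \cdot N \cdot F_1$ and flip-flops on their own slow clock cost $(b + a \cdot N) \cdot F_2$. This is only a sketch under that model's assumptions: $a$ and $b$ are unknown, device-dependent coefficients, and the power of the CE net is ignored. For the $N = 500$ slow flip-flops with $F_1 = 200$ MHz and $F_2 = 1$ MHz:

$$a \cdot 500 \cdot 200 \quad \text{vs.} \quad (b + a \cdot 500) \cdot 1$$

$$100{,}000\,a \quad \text{vs.} \quad b + 500\,a$$

So in this model the clock-enabled version draws more power whenever $b < 99{,}500\,a$, that is, unless driving the extra slow clock network costs roughly 100,000 times as much as driving a single flip-flop clock input.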
Reply by ●April 19, 2004
"Peter Alfke" <peter@xilinx.com> wrote in message news:BCA9A92A.5F50%peter@xilinx.com...> Clock Enables vs multiple clocks is a trade-off. > If you are not concerned about power, then a single low0skew global clock > and a "sloppier" network of CEs requires the least amount of thinking. > Multiple derived clocks mean that you have to think about clock transfer > from one clock domain to the next, you may have to use multiple > PLL/DLL/DCMs. >Indeed, your customer/boss/well-being is almost always far better served by the 1) reduced time to market 2) improved design accuracy and stability 3) design portability 4) simplicity of having a single clock with CEs for the slower stuff. It's hard to imagine a situation where a multiple clock system would be worth the hassle. (Maybe the use of legacy stuff would be one reason?) IMO, Syms.
Reply by ●April 19, 2004
> 4) simplicity
> of having a single clock with CEs for the slower stuff. It's hard to imagine a situation where a multiple clock system would be worth the hassle. (Maybe the use of legacy stuff would be one reason?)

How about battery operation where power consumption is critical? (It turns into battery life.)

On the high speed/performance end, it might save a bit of heat. Going from X watts to X-2 might be critical. (Look at the games modern CPUs are playing to balance heat and performance.)

--
The suespammers.org mail server is located in California. So are all my other mailboxes. Please do not send unsolicited bulk e-mail or unsolicited commercial e-mail to my suespammers.org address or any of my other addresses. These are my opinions, not necessarily my employer's. I hate spam.
Reply by ●April 20, 2004
Peter Alfke wrote:
> Clock Enables vs multiple clocks is a trade-off. If you are not concerned about power, then a single low-skew global clock and a "sloppier" network of CEs requires the least amount of thinking. Multiple derived clocks mean that you have to think about clock transfer from one clock domain to the next, and you may have to use multiple PLL/DLL/DCMs.

That reminds me of another issue. When you use a single clock with clock enables, you have to produce timing constraints to allow the enabled parts of the circuit to be routed with looser constraints. But this is one part of the design process that is prone to error and has no method of verification that I am aware of. Of course some would say that you only need to "pay careful attention" to the timing constraints, but you can make that argument about *any* part of the design process.

The point is that if you use separate clocks, the timing constraints are very simple and much harder to mess up. With a single clock and multiple clock enables, the timing constraints are not so simple and it is very easy to make mistakes.

What is really needed is a method of verification of timing constraints, just like we have verification of other aspects of the design process.

--
Rick "rickman" Collins
Reply by ●April 20, 2004
"rickman" <spamgoeshere4@yahoo.com> wrote in message news:408551E5.132C60F3@yahoo.com...> The point is that if you use separate clocks, the timing constraints are > very simple and much harder to mess up. With a single clock and > multiple clock enables, the timing constraints are not so simple and > very easy to make mistakes.Hey Rick, Very true! The only exception could be that bit in the multi-clock design where the signals travel from one clock domain to another. If you go the 'enabled' route you need only worry about getting that enable correct. The multi-clock route could have a lot of places where signals cross domains, each needing attention in the timing constraints, both delay and skew. So, I'm a convert to the 'enabled' way. I use that circuit you stuck on here a few months back (ta very much!) to generate my enables, and I have a Perl script to work out the MAXDELAYs from the two clock rates, net delays and Tckos, Ticks. I've found that the NET "CLK_EN" TNM=FFS "CLK_EN_FFS"; is pretty reliable these days, especially if you use the 'direct_enable' directive in Synplify. It would be even nicer if you could group stuff into timing groups in the source code, but that's not really there yet. Which makes me think, why isn't easy timing constraints part of the RTL HDL languages? cheers, Syms.
Reply by ●April 20, 2004
"Symon" <symon_brewer@hotmail.com> wrote in message news:<c63qg2$7ovs9$1@ID-212844.news.uni-berlin.de>...> "rickman" <spamgoeshere4@yahoo.com> wrote in message > news:408551E5.132C60F3@yahoo.com... > > The point is that if you use separate clocks, the timing constraints are > > very simple and much harder to mess up. With a single clock and > > multiple clock enables, the timing constraints are not so simple and > > very easy to make mistakes. > Hey Rick, > Very true! The only exception could be that bit in the multi-clock design > where the signals travel from one clock domain to another. If you go the > 'enabled' route you need only worry about getting that enable correct. The > multi-clock route could have a lot of places where signals cross domains, > each needing attention in the timing constraints, both delay and skew. > So, I'm a convert to the 'enabled' way. I use that circuit you stuck on here > a few months back (ta very much!) to generate my enables, and I have a Perl > script to work out the MAXDELAYs from the two clock rates, net delays and > Tckos, Ticks. I've found that the > NET "CLK_EN" TNM=FFS "CLK_EN_FFS"; > is pretty reliable these days, especially if you use the 'direct_enable' > directive in Synplify. It would be even nicer if you could group stuff into > timing groups in the source code, but that's not really there yet. Which > makes me think, why isn't easy timing constraints part of the RTL HDL > languages?Good question. Recently I've been planing how to incorporate timing constraints in Confluence. Because Confluence has implicit clock enables, I think there is opportunity for semi-automated timing constraints. For example, designers would only have to place constraints on multi-cycle enables; the tools would then automatically determine which paths are multi-cycle. And since clock domains are also implicit, it should be farily straight forward to issue warnings on unconstrained false-paths. Regards, Tom -- Launchbird Design System, Inc. 
http:www.launchbird.com






