Forums

New(?) fast binary counter for FPGAs without carry logic (e.g. Actel) -- Request For Comment

Started by robotron September 11, 2012
Dear colleagues,

I am sending you a proposal of binary counter, designed to minimize logic
path length in between flip-flops to one gate (MUX) only, at the expense of
not so straightforward binary counting. The reason for this design has
emerged while using Actel (MicroSemi) ProASIC/IGLOO architecture, lacking
any hardwired support for fast carry.

I have placed VHDL code, schematics, testbench and sample C code to OpenCores:

	http://opencores.org/project,pcounter

for further review. If you have GHDL, you can run the test easily by issuing
"make testrun" or "make testvcd" to examine traces.

Background:
During our work on Actel FPGAs (basically, 3-LUT & DFF only), we were aware
of following types of faster counters:
- LFSR counter
- Johnson counter
- "RLA counter" (as tailored using Actel's SmartGen core generator)

Johnson due to its O(2^n) (n as number of bits) can not be used for longer
counts; LFSR's are hard to invert (table lookup seems to be only known
method), therefore also impractical for wider counters. RLA counter is still
too slow and complex for wider counters and moderate speeds (e.g. 
> 24bits @ >100MHz).
As a consequence, the proposed counter uses synchronous divide-by-two blocks, each using 1-bit pipeline and carry by single-clock pulse. Design is simple and fast, preliminary results from Synplify and Actel Designer shows 32bits @200MHz feasible. However, output bit lines are non-proportionaly delayed by discrete number of clock periods. Therefore, to obtain linear bit word, an inversion formula needs to be applied. Fortunately, the inversion is simple (unlike LFSR's), in C (pcount.c): for (k = 1; k < n; k++) if ((y & ((1<<k)-1)) < k) y = y ^ (1<<k); -- it may be implemented in VHDL core, or within CPU as shown, depending on application requirements. I am attaching design files & C language decoder/encoder of counter bit words. If you have GHDL, you can run the test easily by issuing "make testrun" or "make testvcd" to examine traces. ** My questions are: ** - does this design exists, is it being used, and if so, what is its name? - if not, do you find the design useful? Best regards, Marek Peca
On Tue, 11 Sep 2012 11:47:01 -0700, robotron wrote:

> Dear colleagues, > > I am sending you a proposal of binary counter, designed to minimize > logic path length in between flip-flops to one gate (MUX) only, at the > expense of not so straightforward binary counting. The reason for this > design has emerged while using Actel (MicroSemi) ProASIC/IGLOO > architecture, lacking any hardwired support for fast carry. > > I have placed VHDL code, schematics, testbench and sample C code to > OpenCores: > > http://opencores.org/project,pcounter > > for further review. If you have GHDL, you can run the test easily by > issuing "make testrun" or "make testvcd" to examine traces. > > Background: > During our work on Actel FPGAs (basically, 3-LUT & DFF only), we were > aware of following types of faster counters: - LFSR counter > - Johnson counter > - "RLA counter" (as tailored using Actel's SmartGen core generator) > > Johnson due to its O(2^n) (n as number of bits) can not be used for > longer counts; LFSR's are hard to invert (table lookup seems to be only > known method), therefore also impractical for wider counters. RLA > counter is still too slow and complex for wider counters and moderate > speeds (e.g. >> 24bits @ >100MHz). > > As a consequence, the proposed counter uses synchronous divide-by-two > blocks, each using 1-bit pipeline and carry by single-clock pulse. > Design is simple and fast, preliminary results from Synplify and Actel > Designer shows 32bits @200MHz feasible. > > However, output bit lines are non-proportionaly delayed by discrete > number of clock periods. Therefore, to obtain linear bit word, an > inversion formula needs to be applied. Fortunately, the inversion is > simple (unlike LFSR's), in C (pcount.c): > > for (k = 1; k < n; k++) > if ((y & ((1<<k)-1)) < k) > y = y ^ (1<<k); > > -- it may be implemented in VHDL core, or within CPU as shown, depending > on application requirements. > > I am attaching design files & C language decoder/encoder of counter bit > words. If you have GHDL, you can run the test easily by issuing "make > testrun" or "make testvcd" to examine traces. > > ** My questions are: ** > - does this design exists, is it being used, and if so, what is its > name? - if not, do you find the design useful?
So, you've reinvented the ripple counter, only with deterministic carry? I'm not sure what you'd call it, but a quick Google on "ripple counter", possibly with "synchronous" tossed in there, may find you some prior art. -- My liberal friends think I'm a conservative kook. My conservative friends think I'm a liberal kook. Why am I not happy that they have found common ground? Tim Wescott, Communications, Control, Circuits & Software http://www.wescottdesign.com
Hello,

On Tuesday, September 11, 2012 10:20:24 PM UTC+2, Tim Wescott wrote:
> > So, you've reinvented the ripple counter, only with deterministic carry?
yes, nothing more. (Only still unsure about that REinvention by means of easily accessible, documented prior art -- this is why I am bothering the newsgroup.)
> I'm not sure what you'd call it, but a quick Google on "ripple counter", > possibly with "synchronous" tossed in there, may find you some prior art.
Can you be more specific? I have found only asynchronous ripple counters. For synchronous counter searches, I constantly get, what also vendor tools offer: counters without pipeline, i.e. with combinatorial network, spanning over all of the outputs. THIS is what I meant to avoid within the design. Example: http://en.wikipedia.org/wiki/File:4-bit-jk-flip-flop_V1.1.svg Here you can see AND gate spanning three output bits. For wider counter, that means n-1 input wide AND. Of course, may be solved by faster logic. However, do you know about any working solution on *slow* architectures, at least as good as my proposal? I know only about Johnson/one-hot counter (exponential floorplan consumption) and LFSR (exponential lookup-table consumption). Am I missing something? Thank you for your response, Marek
robotron wrote:


> I know only about Johnson/one-hot counter (exponential floorplan > consumption) and LFSR (exponential lookup-table consumption). Am I missing > something?
How about gray code counters? Maybe their only advantage is when sampling them, not the carry propagation, though. Jon
On Tue, 11 Sep 2012 16:30:42 -0500, Jon Elson wrote:

> robotron wrote: > > >> I know only about Johnson/one-hot counter (exponential floorplan >> consumption) and LFSR (exponential lookup-table consumption). Am I >> missing something? > How about gray code counters? Maybe their only advantage is when > sampling them, not the carry propagation, though. > > Jon
Gray code counters have the same carry issues as regular binary counters. I think the idea of a synchronous ripple counter is a clever one. I'd be surprised if it hasn't been thought of before, but it's still clever. -- My liberal friends think I'm a conservative kook. My conservative friends think I'm a liberal kook. Why am I not happy that they have found common ground? Tim Wescott, Communications, Control, Circuits & Software http://www.wescottdesign.com
Tim Wescott <tim@seemywebsite.com> wrote:
> On Tue, 11 Sep 2012 11:47:01 -0700, robotron wrote:
(snip)
>> I am sending you a proposal of binary counter, designed to minimize >> logic path length in between flip-flops to one gate (MUX) only, at the >> expense of not so straightforward binary counting. The reason for this >> design has emerged while using Actel (MicroSemi) ProASIC/IGLOO >> architecture, lacking any hardwired support for fast carry.
(snip)
>> http://opencores.org/project,pcounter
I think it is pretty neat. I wouldn't be surprised if it had been invented in the early days of computing, and then forgotten. My first thought would have been a Gray code counter, and probably also my second thought. For now, I will call it carry anticipation. In generating modulon N counters where N isn't a power of two, it often takes more than one cycle to do the comparison against N-1 to generate a reset. One possible solution is to compare against N-2, then delay the reset. Well, more specifically, pipeline the comparator such that the reset comes out at the right time. That requires a comparison against a smaller number such that the result is right. Also reminds me of Peter Alfke's divide by 2.5 circuit used to build really fast dividers. (snip)
> So, you've reinvented the ripple counter, only with deterministic carry?
The usual ripple counter is asynchronous, but I almost see what you mean.
> I'm not sure what you'd call it, but a quick Google on "ripple counter", > possibly with "synchronous" tossed in there, may find you some prior art.
Hmm. OK, my description above isn't quite right. If it did do carry anticipation, then it would be an ordinary binary counter. If instead you delay the carry at each stage by one cycle, does that describe it? I suppose I still think that it is possible to build a Gray code counter with similar delays in it. -- glen
On Tue, 11 Sep 2012 13:40:40 -0700, robotron wrote:

> Hello, > > On Tuesday, September 11, 2012 10:20:24 PM UTC+2, Tim Wescott wrote: >> >> So, you've reinvented the ripple counter, only with deterministic >> carry? > > yes, nothing more. (Only still unsure about that REinvention by means of > easily accessible, documented prior art -- this is why I am bothering > the newsgroup.) > >> I'm not sure what you'd call it, but a quick Google on "ripple >> counter", possibly with "synchronous" tossed in there, may find you >> some prior art. > > Can you be more specific? I have found only asynchronous ripple > counters. For synchronous counter searches, I constantly get, what also > vendor tools offer: counters without pipeline, i.e. with combinatorial > network, spanning over all of the outputs. THIS is what I meant to avoid > within the design. > > Example: http://en.wikipedia.org/wiki/File:4-bit-jk-flip-flop_V1.1.svg > > Here you can see AND gate spanning three output bits. For wider counter, > that means n-1 input wide AND. Of course, may be solved by faster logic. > However, do you know about any working solution on *slow* architectures, > at least as good as my proposal? > > I know only about Johnson/one-hot counter (exponential floorplan > consumption) and LFSR (exponential lookup-table consumption). Am I > missing something?
Well, your scheme does have the drawback that while the counter is fast, and the sampling thereof is fast, the algorithm to get from the sampled value to a binary one takes some clock cycles. Still, it'd work well as a capture for a microprocessor, or as a compare (for something like a PWM). It'd be better yet if you could make it an up-down counter, but I bet that's harder. If you have access to a university library, do a literature search -- I'd be astonished if this hasn't shown up before in something like the IEEE Circuit Society journal, or in a patent someplace. -- My liberal friends think I'm a conservative kook. My conservative friends think I'm a liberal kook. Why am I not happy that they have found common ground? Tim Wescott, Communications, Control, Circuits & Software http://www.wescottdesign.com
Tim Wescott <tim@seemywebsite.com> wrote:
> On Tue, 11 Sep 2012 16:30:42 -0500, Jon Elson wrote:
(snip on fast counters)
>> How about gray code counters? Maybe their only advantage is when >> sampling them, not the carry propagation, though.
(snip)
> Gray code counters have the same carry issues as regular binary counters.
Are you sure? Having not designed one recently, I wouldn't be sure, but I thoght that they didn't. You do have to consider when each FF will change, though, and so synchronous logic tools might not be able to do the timing.
> I think the idea of a synchronous ripple counter is a clever one. I'd be > surprised if it hasn't been thought of before, but it's still clever.
Yes, many strange things were invented in the early days. One that seems to be forgotten is the Earle latch, which as best I can describe it, combines one level of logic with latch circuitry. But that was before edge triggered logic. -- glen
On Tue, 11 Sep 2012 21:49:42 +0000, glen herrmannsfeldt wrote:

> Tim Wescott <tim@seemywebsite.com> wrote: >> On Tue, 11 Sep 2012 16:30:42 -0500, Jon Elson wrote: > > (snip on fast counters) > >>> How about gray code counters? Maybe their only advantage is when >>> sampling them, not the carry propagation, though. > > (snip) >> Gray code counters have the same carry issues as regular binary >> counters. > > Are you sure? Having not designed one recently, I wouldn't be sure, but > I thoght that they didn't. You do have to consider when each FF will > change, though, and so synchronous logic tools might not be able to do > the timing.
Well, yes an no. Now that you mention it, it strikes me that the "only one thing changes at a time" may make it _easier_ to anticipate carries -- but _when_ any bit in a gray code counter changes is still dependent on _all_ the other bits; if that has to propagate through a bunch of logic, then you're back to slows-ville. -- My liberal friends think I'm a conservative kook. My conservative friends think I'm a liberal kook. Why am I not happy that they have found common ground? Tim Wescott, Communications, Control, Circuits & Software http://www.wescottdesign.com
Tim Wescott <tim@seemywebsite.com> wrote:

(snip)
> If you have access to a university library, do a literature search -- I'd > be astonished if this hasn't shown up before in something like the IEEE > Circuit Society journal, or in a patent someplace.
If you don't have such access, and are interested in the such, I recommend Earl Swartzlander's "Computer Arithmetic, vol. 1" (and then vol. 2). They are IEEE published books with reprints of various papers that were important in the history of computer arithmetic. Volume 2 has more recent papers, many related to fast floating point. I don't see a counter like this, so far, though. -- glen