FPGARelated.com
Forums

how do I minimize the logic in this function?

Started by Brannon January 13, 2006
I have an async function with six bits in and eight bits out (listed
below). I need to minimize the logic usage (in a Virtex2) for this
function. It appears that most K-map tools support six bits in but
only one bit out. Does anyone have a tool or method they would
recommend to help with my problem?

I currently do it with two three-bit subtracters (in LUTs, not the
carry chain) and a 6-bit x 16-element mux (using MUXF prims) running as
a lookup table. The output of one adder drives the mux S input. It just
takes too much time and space.

In      Out
0       0
1       66
2       128
3       0
4       64
5       65
6       66
7       64
8       130
9       128
10      136
11      130
12      0
13      66
14      128
15      0
16      68
17      69
18      65
19      68
20      80
21      81
22      69
23      80
24      64
25      65
26      66
27      64
28      68
29      69
30      65
31      68
32      138
33      136
34      160
35      138
36      130
37      128
38      136
39      130
40      162
41      160
42      168
43      162
44      138
45      136
46      160
47      138
48      0
49      66
50      128
51      0
52      64
53      65
54      66
55      64
56      130
57      128
58      136
59      130
60      0
61      66
62      128
63      0

Thanks for your time.
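
Before minimizing, it can be worth checking the table for structure. The following is a quick Python sketch, not part of the original post. It assumes, purely as a hypothesis, that the six input bits can be viewed as three 2-bit fields, and checks whether a field value of 3 always gives the same output as a field value of 0. The names `TABLE`, `canonical` and `aliases_three_to_zero` are made up for illustration.

```python
# Truth table copied from the post: index = 6-bit input, value = 8-bit output.
TABLE = [
    0, 66, 128, 0, 64, 65, 66, 64, 130, 128, 136, 130, 0, 66, 128, 0,
    68, 69, 65, 68, 80, 81, 69, 80, 64, 65, 66, 64, 68, 69, 65, 68,
    138, 136, 160, 138, 130, 128, 136, 130, 162, 160, 168, 162, 138, 136, 160, 138,
    0, 66, 128, 0, 64, 65, 66, 64, 130, 128, 136, 130, 0, 66, 128, 0,
]


def canonical(value):
    """Map each 2-bit field of a 6-bit input from 3 to 0 (assumed field split)."""
    result = 0
    for shift in (0, 2, 4):
        field = (value >> shift) & 0b11
        if field == 3:
            field = 0
        result |= field << shift
    return result


def aliases_three_to_zero(table):
    """True if every input gives the same output as its canonical form."""
    return all(table[i] == table[canonical(i)] for i in range(len(table)))


if __name__ == "__main__":
    print("field value 3 behaves like 0:", aliases_three_to_zero(TABLE))
    print("distinct canonical inputs:", len({canonical(i) for i in range(64)}))
```

If the check holds, only 27 of the 64 input codes are effectively distinct, which is the kind of redundancy a minimizer or synthesis tool can exploit.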

Brannon,

How about a BRAM?  Simple ROM look up table, 6 bits in, 8 bits out.

Austin

That's what BRAMs are good for. Unless you are using all of them for
something else, it's a no-brainer.

Okay: let's suppose I use eight ROM64x1 prims. Is that really fewer
resources and/or comparable in speed to a four-bit subtracter plugged
into the switch 6b x 16el mux with constants on all its data? What
resources does a ROM64x1 use?

Brannon,

I am proposing you load up a BRAM with the values you need for the 
addresses you have.

A simple large table lookup.  Done in one cycle.  Uses one BRAM block, 
but do you use them anyway?  Do you have a spare one?

Sure it is more real estate than just about every other method, but it 
is fast, and simple.  And if you have any unused BRAMs lying about, it 
is done.

Austin

Brannon wrote:

> Okay: let's suppose I use eight ROM64x1 prims. Is that really fewer
> resources and/or comparable in speed to a four-bit subtracter plugged
> into the switch 6b x 16el mux with constants on all its data? What
> resources does a ROM64x1 use?

"one cycle" is the whole issue. I don't have any spare cycles. This has
to be done asynchronously.

"Brannon" <brannonking@yahoo.com> writes:
> I have an async function with six bits in and eight bits out (listed
> below). It appears that most kmap tools will support six bits in but
> only one bit out.

So split it into eight functions that each take six bits in and produce
one bit out. Then minimize each one separately.

Better yet, just write it in an HDL and let the synthesis tools take
care of it. For all but the most timing-critical cases, you'll wind up
with perfectly acceptable results.

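The split into eight single-output functions can be sketched in a few lines of Python. This is only an illustration, not a tool anyone in the thread used; `TABLE` and `split_outputs` are hypothetical names. Each resulting minterm list is what a one-output K-map or minimizer tool would take as input.

```python
# Truth table copied from the original post: index = input, value = output.
TABLE = [
    0, 66, 128, 0, 64, 65, 66, 64, 130, 128, 136, 130, 0, 66, 128, 0,
    68, 69, 65, 68, 80, 81, 69, 80, 64, 65, 66, 64, 68, 69, 65, 68,
    138, 136, 160, 138, 130, 128, 136, 130, 162, 160, 168, 162, 138, 136, 160, 138,
    0, 66, 128, 0, 64, 65, 66, 64, 130, 128, 136, 130, 0, 66, 128, 0,
]


def split_outputs(table, width=8):
    """Return, for each output bit, the list of inputs for which that bit is 1."""
    return [
        [i for i, value in enumerate(table) if (value >> bit) & 1]
        for bit in range(width)
    ]


if __name__ == "__main__":
    for bit, minterms in enumerate(split_outputs(TABLE)):
        print(f"out[{bit}]: {len(minterms)} minterms -> {minterms}")
```

Some outputs turn out to be very sparse (out[4], for instance, is 1 for only three input codes), so the per-bit functions differ a lot in cost.
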
"Brannon" <brannonking@yahoo.com> wrote in message 
news:1137194278.231562.68890@f14g2000cwb.googlegroups.com...
> Okay: let's suppose I use eight ROM64x1 prims. Is that really fewer
> resources and/or comparable in speed to a four-bit subtracter plugged
> into the switch 6b x 16el mux with constants on all its data? What
> resources does a ROM64x1 use?

Each ROM64x1 uses four 16-bit LUTs, two MUXF5s and a MUXF6. You would
have 6 bits of address with fanouts of 6-24, with 1 LUT through MUXF5
and MUXF6 as your delay times. The total resources for eight of them:
16 slices.

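For anyone who wants to try the eight-ROM64x1 route, the INIT contents can be derived from the table with a short script. This sketch assumes that bit `addr` of each 64-bit INIT word is the ROM output at address `addr`, and that address bits A5..A0 map straight onto the six input bits; both should be checked against the libraries guide. It also redoes the slice arithmetic, assuming two LUTs per Virtex-II slice. `rom64x1_inits` and `slices_for_roms` are illustrative names.

```python
# Truth table copied from the original post: index = input, value = output.
TABLE = [
    0, 66, 128, 0, 64, 65, 66, 64, 130, 128, 136, 130, 0, 66, 128, 0,
    68, 69, 65, 68, 80, 81, 69, 80, 64, 65, 66, 64, 68, 69, 65, 68,
    138, 136, 160, 138, 130, 128, 136, 130, 162, 160, 168, 162, 138, 136, 160, 138,
    0, 66, 128, 0, 64, 65, 66, 64, 130, 128, 136, 130, 0, 66, 128, 0,
]

LUTS_PER_ROM64X1 = 4  # figure quoted in the post above
LUTS_PER_SLICE = 2    # assumption: Virtex-II slice with two 4-input LUTs


def rom64x1_inits(table, width=8):
    """Build one 64-bit INIT word per output bit (bit addr = output at addr)."""
    inits = []
    for bit in range(width):
        word = 0
        for addr, value in enumerate(table):
            word |= ((value >> bit) & 1) << addr
        inits.append(word)
    return inits


def slices_for_roms(rom_count):
    """Slices needed if each ROM64x1 costs four LUTs and nothing is shared."""
    return rom_count * LUTS_PER_ROM64X1 // LUTS_PER_SLICE


if __name__ == "__main__":
    for bit, word in enumerate(rom64x1_inits(TABLE)):
        print(f"out[{bit}]: INIT = 64'h{word:016X}")
    print("slices for eight ROMs:", slices_for_roms(8))
```
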
Brannon,

Ah.  I see.  No clock.

Seems strange that somewhere in this whole design there is no clock that 
tells you when the data is valid, but then, it isn't something I am 
working on.

Even a strobe that tells you the address is valid could be used to clock 
the BRAM....

But, if you are doing something totally asynchronous, I will bow out 
immediately.

Austin

Brannon wrote:
> "one cycle" is the whole issue. I don't have any spare cycles. This has
> to be done asynchronously.

Hi - 

It's easy to try out.  Here's an inelegantly-written Verilog module:

module comb_function (
   // Outputs
   out_val,
   // Inputs
   in_val
   );

//-----FPGA I/O

  output [7:0] out_val;
  input  [5:0] in_val;

  reg    [7:0] out_val;

  always @(in_val)
    case(in_val)
       0:  out_val =  0  ; 
       1:  out_val =  66 ;
       2:  out_val =  128;
       3:  out_val =  0  ;
       4:  out_val =  64 ;
       5:  out_val =  65 ;
       6:  out_val =  66 ;
       7:  out_val =  64 ;
       8:  out_val =  130;
       9:  out_val =  128;
      10:  out_val = 136 ;
      11:  out_val = 130 ;
      12:  out_val = 0   ;
      13:  out_val = 66  ;
      14:  out_val = 128 ;
      15:  out_val = 0   ;
      16:  out_val = 68  ;
      17:  out_val = 69  ;
      18:  out_val = 65  ;
      19:  out_val = 68  ;
      20:  out_val = 80  ;
      21:  out_val = 81  ;
      22:  out_val = 69  ;
      23:  out_val = 80  ;
      24:  out_val = 64  ;
      25:  out_val = 65  ;
      26:  out_val = 66  ;
      27:  out_val = 64  ;
      28:  out_val = 68  ;
      29:  out_val = 69  ;
      30:  out_val = 65  ;
      31:  out_val = 68  ;
      32:  out_val = 138 ;
      33:  out_val = 136 ;
      34:  out_val = 160 ;
      35:  out_val = 138 ;
      36:  out_val = 130 ;
      37:  out_val = 128 ;
      38:  out_val = 136 ;
      39:  out_val = 130 ;
      40:  out_val = 162 ;
      41:  out_val = 160 ;
      42:  out_val = 168 ;
      43:  out_val = 162 ;
      44:  out_val = 138 ;
      45:  out_val = 136 ;
      46:  out_val = 160 ;
      47:  out_val = 138 ;
      48:  out_val = 0   ;
      49:  out_val = 66  ;
      50:  out_val = 128 ;
      51:  out_val = 0   ;
      52:  out_val = 64  ;
      53:  out_val = 65  ;
      54:  out_val = 66  ;
      55:  out_val = 64  ;
      56:  out_val = 130 ;
      57:  out_val = 128 ;
      58:  out_val = 136 ;
      59:  out_val = 130 ;
      60:  out_val = 0   ;
      61:  out_val = 66  ;
      62:  out_val = 128 ;
      63:  out_val = 0   ;
    endcase

endmodule
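
A 64-entry case statement is easy to mistype. As a sketch only (this is not how the module above was written), the case arms could be generated from the table with a few lines of Python. `case_arms` and the sized-literal formatting are illustrative choices.

```python
# Truth table copied from the original post: index = input, value = output.
TABLE = [
    0, 66, 128, 0, 64, 65, 66, 64, 130, 128, 136, 130, 0, 66, 128, 0,
    68, 69, 65, 68, 80, 81, 69, 80, 64, 65, 66, 64, 68, 69, 65, 68,
    138, 136, 160, 138, 130, 128, 136, 130, 162, 160, 168, 162, 138, 136, 160, 138,
    0, 66, 128, 0, 64, 65, 66, 64, 130, 128, 136, 130, 0, 66, 128, 0,
]


def case_arms(table, signal="out_val"):
    """Return one Verilog case arm per table entry, using sized decimal literals."""
    return [f"      6'd{i}: {signal} = 8'd{v};" for i, v in enumerate(table)]


if __name__ == "__main__":
    # Paste the output between `case(in_val)` and `endcase`.
    print("\n".join(case_arms(TABLE)))
```
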

The resource usage is:

Mapping to part: xc2v40fg256-4
LUT2            6 uses
LUT3            3 uses
LUT4            11 uses

Synplify estimates an in-to-out delay of 2.865ns, NOT including I/O
buffers.  Note, too, that I used -4, which should be the lowest speed
grade.

Seems pretty fast and cheap.

Bob Perlman
Cambrian Design Works

