FPGARelated.com
Forums

FSM state minimization with ISE?

Started by backhus June 13, 2006
Hi everybody,
I just tried to minimize a simple statemachine with ISE8.1 and wondered 
that there is no effect, while the machine itself is an ideal target for 
state minimization. (It's a lecture example fur just that purpose)

I tried several synthesis properties with no effect.

Anyone knows how to do state minimizing with ISE, or what the reasons 
are for not minimizing the given FSM? It works with Design Compiler!

Best regards
   Eilert

-- example FSM source
-- analyzes a serial bitstream and
-- detects non-BCD values in 4-bit segments (tetrades)



LIBRARY ieee;
USE ieee.std_logic_1164.ALL;

ENTITY pseudotetradedetector IS
   PORT (
     X     : IN  STD_LOGIC;
     Reset : IN  STD_LOGIC;
     CLK   : IN  STD_LOGIC;
     Y     : OUT STD_LOGIC
     );
END pseudotetradedetector;

ARCHITECTURE ptd_2p OF pseudotetradedetector IS

   TYPE state IS (S0, S1, S2, S3, S4, S5, S6, S7,
                  S8, S9, S10, S11, S12, S13, S14);
   SIGNAL cs,ns : state;

BEGIN

  PROCESS (Reset, CLK)
   BEGIN
     IF Reset = '1' THEN
       cs <= S0;
     ELSIF rising_edge(CLK) THEN
       cs<= ns;
     END IF;
   END PROCESS;

   process(cs,x)
   BEGIN
     CASE cs IS
         WHEN S0 => IF (X = '0') THEN
			Y <= '0';
			ns<= S1;
                    ELSE
			Y <= '0';
			ns<= S2;
                    END IF;
         WHEN S1 => IF (X = '0') THEN
			Y <= '0';
			ns<= S3;
                    ELSE
			Y <= '0';
			ns<= S4;
                    END IF;
         WHEN S2 => IF (X = '0') THEN
			Y <= '0';
			ns<= S5;
                    ELSE
			Y <= '0';
			ns<= S6;
                    END IF;
         WHEN S3 => IF (X = '0') THEN
			Y <= '0';
			ns<= S7;
                    ELSE
			Y <= '0';
			ns<= S8;
                    END IF;
         WHEN S4 => IF (X = '0') THEN
			Y <= '0';
			ns<= S9;
                    ELSE
			Y <= '0';
			ns<= S10;
                    END IF;
         WHEN S5 => IF (X = '0') THEN
			Y <= '0';
			ns<= S11;
                    ELSE
			Y <= '0';
			ns<= S12;
                    END IF;
         WHEN S6 => IF (X = '0') THEN
			Y <= '0';
			ns<= S13;
                    ELSE
			Y <= '0';
			ns<= S14;
                    END IF;
         WHEN S7 => IF (X = '0') THEN
			Y <= '0';
                    ELSE
			Y <= '0';
                    END IF;
			ns <= S0;
         WHEN S8 => IF (X = '0') THEN
			Y <= '0';
                    ELSE
			Y <= '1';
                    END IF;
			ns <= S0;
         WHEN S9 => IF (X = '0') THEN
			Y <= '0';
                    ELSE
			Y <= '1';
                    END IF;
			ns <= S0;
         WHEN S10 => IF (X = '0') THEN
			Y <= '0';
                     ELSE
			Y <= '1';
                     END IF;
			ns <= S0;
         WHEN S11 => IF (X = '0') THEN
				Y <= '0';
                     ELSE
			Y <= '0';
                     END IF;
			  ns <= S0;
         WHEN S12 => IF (X = '0') THEN
				Y <= '0';
                     ELSE
			Y <= '1';
                     END IF;
			  ns <= S0;
         WHEN S13 => IF (X = '0') THEN
   		      Y <= '0';
                     ELSE
		      Y <= '1';
                     END IF;
	              ns <= S0;
         WHEN S14 => IF (X = '0') THEN
		      Y <= '0';
                     ELSE
    		      Y <= '1';
                     END IF;
		      ns <= S0;
         WHEN OTHERS => Y <= '0';
		       ns <= S0;
       END CASE;
   END PROCESS;
END ptd_2p;

backhus wrote:

> I just tried to minimize a simple statemachine with ISE8.1 and wondered > that there is no effect, while the machine itself is an ideal target for > state minimization. (It's a lecture example fur just that purpose)
If state encoding is set to AUTO you are getting a one-hot encoding which is not optimum in this case. Might be a good subject for a lecture :) Quartus says 13 flops and 17 luts for one-hot 4 flops and 13 luts for binary What did you get? -- Mike Treseler
Mike Treseler wrote:
> backhus wrote: > > > I just tried to minimize a simple statemachine with ISE8.1 and wondered > > that there is no effect, while the machine itself is an ideal target for > > state minimization. (It's a lecture example fur just that purpose) > > If state encoding is set to AUTO > you are getting a one-hot encoding > which is not optimum in this case. > Might be a good subject for a lecture :) > > Quartus says > 13 flops and 17 luts for one-hot > 4 flops and 13 luts for binary > > What did you get? > > -- Mike Treseler
(Interesting post. Eilert asks about ISE 8.1, Mike [who usually gives good answers] responds with wrong info on AUTO setting, and odd info on Quartus...it's way past April 1st) Eilert, I agree with Mike that you should give your results. If you can't get any improvement, maybe you already have the best you can get. For your posted circuit, XST in ISE 8.1 with global setting "-fsm_encoding auto" gives: Number of Slice Flip Flops: 4 Number of 4 input LUTs: 10 when compiled to a virtex2 (Assigned states are binary sequential). Given 15 states, you can't do any better than 4 FFs! Asking for "one-hot" gives: Number of Slice Flip Flops: 15 Number of 4 input LUTs: 19 Eilert, RTFM (Read The Fine Manual). In the XST manual, you can find out how to individually specify the state encoding, with a selection of: "auto, one-hot, compact, sequential, gray, johnson, speed1, and user". Mike, Does Quartus really make a one-hot w/ 13 FFs for a 15 state machine?
JustJohn wrote:
> (Interesting post. Eilert asks about ISE 8.1, Mike [who usually gives > good answers] responds with wrong info on AUTO setting, and odd info on > Quartus...it's way past April 1st)
...
> Given 15 states, you can't do any better than 4 FFs!
... > Does Quartus really make a one-hot w/ 13 FFs for a 15 state machine? I could be wrong, but I think you missed the point. The OP's FSM is redundant (which is the whole point) and can be reduced to a smaller FSM with the same behaviour. This minimization is similar to how you derive a DFSM from an NFSM (widely used in regexp). Without having gone through the whole exercise, the Quartus results suggests that you can merge at least three state of the original machine. Mike post was relevant and informative. Tommy
Tommy Thorn wrote:
> I could be wrong, but I think you missed the point. The OP's FSM is > redundant (which is the whole point) and can be reduced to a smaller FSM > with the same behaviour. This minimization is similar to how you derive > a DFSM from an NFSM (widely used in regexp). > > Without having gone through the whole exercise, the Quartus results > suggests that you can merge at least three state of the original machine. > > Mike post was relevant and informative. > > Tommy
Thanks Tommy! I stand corrected... But note that the state minimization increases LUT usage...I suppose it might decrease decoding H/W elsewhere... but how is does one specify exactly what sort of optimization is wanted? Personally, I wouldn't want states eliminated unless I explicitly said to do it.
JustJohn wrote:
> Thanks Tommy! I stand corrected... > But note that the state minimization increases LUT usage...I suppose it > might decrease decoding H/W elsewhere... but how is does one specify > exactly what sort of optimization is wanted? Personally, I wouldn't > want states eliminated unless I explicitly said to do it.
Where did you get the LUT usage increase from? As we don't have the number of LUTs for the original FSM w/o minimization, I don't follow the conclusion. AFAICT, state minimization should always result in fewer or same number of LUTs, assuming the same encoding strategy. IMO, redundant FSM are unlikely when written by hand, but much more likely when autogenerated. Tommy
backhus wrote:
> Hi everybody, > I just tried to minimize a simple statemachine with ISE8.1 and wondered > that there is no effect, while the machine itself is an ideal target for > state minimization. (It's a lecture example fur just that purpose)
I guess that ISE and Quartus just doesn't do state minimization (probably because real designs aren't redundant). I was waiting for a compile, so I tried it by hand. S8, S9, S10, S12, S13, and S14 are clearly identical so they can all be replaced by S8, which leads to new identical state etc. My final machine had only six states: TYPE state IS (S0, S1, S3, S4, S7, S8); .... WHEN S0 => Y <= '0'; ns<= S1; WHEN S1 => Y <= '0'; IF (X = '0') THEN ns<= S3; ELSE ns<= S4; END IF; WHEN S3 => Y <= '0'; IF (X = '0') THEN ns<= S7; ELSE ns<= S8; END IF; WHEN S4 => Y <= '0'; ns <= S8; WHEN S7 => Y <= '0'; ns <= S0; WHEN S8 => IF (X = '0') THEN Y <= '0'; ELSE Y <= '1'; END IF; ns <= S0; Is this correct and was this the kind of optimization you had expected ISE to perform? Tommy
Hi Tommy,
while your solution is not really correct(in s0 you have to branch into 
two states according to X), at least you got the point.
In fact States 7 to 10 and 11 to 14 are equivalent and each group can be 
reduced to on state. Furtermore states 3,4  and 5,6  are equivalent too.

After all it results in 7 States.

ISE Generates
15 FFs in OneHot and
4FFs (with 15states encoded) in the binary encodings.

Quartus seems to eliminate two states, possibly the pairs 3,4 and 5,6.
(Yes Mike, it is a subject for a lecture. Our students have to do the 
minimization. By pen&paper in the 3rd semester and with DesignCompiler 
in the higher semesters)


So, what I expected from ISE is that it performs at least as good as a 
3rd semester student. :-)

With OneHot the FFs should be reduced to 7 which saves over 50%. And the 
combinatonal Increase, well for this example I dobt that there is any 
significant.

For binary encoding only one FF is saved, but the encoding will fit into 
one LUT (3 Statebits + X) instead of two LUTs (4 Statebits + X) in the 
non-minimized solution.

(John, I mentioned in my first posting that dc does a minimization, so a 
saving in FFs could be expected. Furthermore I read the manual before 
and found a clear statement, that "Xilinx&#4294967295; recommends that you use
enumerated type encoding to specify the states and use the Finite State 
Machine (FSM) extraction commands to extract and encode the state 
machine as well as to perform state minimization and optimization 
algorithms." [ 8.1 synthesis and Simulation Design Guide, p.128])

The mentioned part about "extraction commands" seems to come from some 
older ABEL Manual for CPLDs. Everything else there describes HDL 
designs. Is it a copy&paste error of the manual editor?

So, the original Question still needs an answer. Is ISE capable of 
performing State minimization? The above cited manual says so, but the 
experiance holds against it.

Best regards
  Eilert

Tommy Thorn schrieb:
> backhus wrote: >> Hi everybody, >> I just tried to minimize a simple statemachine with ISE8.1 and >> wondered that there is no effect, while the machine itself is an ideal >> target for state minimization. (It's a lecture example fur just that >> purpose) > > I guess that ISE and Quartus just doesn't do state minimization > (probably because real designs aren't redundant). > > I was waiting for a compile, so I tried it by hand. S8, S9, S10, S12, > S13, and S14 are clearly identical so they can all be replaced by S8, > which leads to new identical state etc. My final machine had only six > states: > > TYPE state IS (S0, S1, S3, S4, S7, S8); > .... > WHEN S0 => Y <= '0'; > ns<= S1; > WHEN S1 => Y <= '0'; > IF (X = '0') THEN > ns<= S3; > ELSE > ns<= S4; > END IF; > WHEN S3 => Y <= '0'; > IF (X = '0') THEN > ns<= S7; > ELSE > ns<= S8; > END IF; > WHEN S4 => Y <= '0'; ns <= S8; > WHEN S7 => Y <= '0'; ns <= S0; > WHEN S8 => IF (X = '0') THEN > Y <= '0'; > ELSE > Y <= '1'; > END IF; > ns <= S0; > > Is this correct and was this the kind of optimization you had expected > ISE to perform? > > Tommy
Hi Tommy, John and Mike
I posted some results discussion as a reply to Tommys direct reply for 
my OP.
Mike, interesting result from Quartus. At least it seems to do some 
optimization. Is there anything in the synthesis report that gives 
information about what exactly was done to the FSM?


John, if a synthesis Tool optimizes away equivalent states you only will 
have to worry about it when you try to compare your timing sim with the 
behavioral sim. It causes some nasty work though, but from minimization 
one should expect area savings or speed increase, which might be useful 
if you are designing on the edge of performance.


Tommy, I mostly agree that handwritten/designed FSMs seldom have 
equivalent states. Because they are often (not always) quite small. But 
when one has to design a large and more complicated algorithm and want 
to keep the models source readable or for an easier analysis of the 
simulation results, then equivalent states may appear intended or 
accidental.
Autogeneration of FSMs is surely a source of such things.

Best regards
   Eilert


Tommy Thorn schrieb:
> JustJohn wrote: >> Thanks Tommy! I stand corrected... >> But note that the state minimization increases LUT usage...I suppose it >> might decrease decoding H/W elsewhere... but how is does one specify >> exactly what sort of optimization is wanted? Personally, I wouldn't >> want states eliminated unless I explicitly said to do it. > > Where did you get the LUT usage increase from? As we don't have the > number of LUTs for the original FSM w/o minimization, I don't follow the > conclusion. > > AFAICT, state minimization should always result in fewer or same number > of LUTs, assuming the same encoding strategy. > > IMO, redundant FSM are unlikely when written by hand, but much more > likely when autogenerated. > > Tommy
backhus wrote:

> Mike, interesting result from Quartus. At least it seems to do some > optimization.
It may be of academic interest, but binary encoding is the clear winner here, as it usually is in small cases. Even in large cases the difference is mostly one of speed, not area.
> Is there anything in the synthesis report that gives > information about what exactly was done to the FSM?
looks like it pitched 7 and 11: http://home.comcast.net/~mike_treseler/pseudo.pdf http://home.comcast.net/~mike_treseler/pseudo_states.pdf -- Mike Treseler