bizarre state machine behavior

Started by Jon Elson May 19, 2008
Hello,

I ran into a bizarre state machine problem last week.  I had a fairly 
simple state machine written in VHDL, with an enumerated type and 5 
states.  The code is of the form :

if clock'event and clock='1' then
    . . .
   if state = a then
     if inputa = '1' then
       state <= b;
       outputa <= '1';
     end if;
   end if;

   if state = b then
   . . .

The whole thing is synchronous, running at 40 MHz on a Spartan 2E, 
except a couple external inputs such as the "inputa" above.
What I was seeing was the state machine locking up, so I added a process 
to decode the valid states and send them to LEDs.  I could see that the 
lockups left signal state in some non-valid condition, i.e. NOT one of 
the enumerated values of the type.

I theorized it was encoding this as "one hot" as the software was set.
So, I forced the enumeration to have specific binary-coded values, and
enumerated all 8 possible codes, providing an if for the unused states 
to go back to the reset state.  This fixed the problem as far as I can 
tell.  I think this is telling me that in the "one hot" mode, it was 
somehow getting more than one bit set true at a time.

Well, I was careful to set up the if's for each specific state to have 
nested ifs in such a way so that any combination of input conditions 
could only satisfy one of the lowest nested ifs.  So, I just can't 
figure out how it could possibly set more than one bit high at a time.
There was a state where it could assign two different state values to 
state, depending on the ifs.  But, if all the ifs were nested, it could 
never try to assign both values at the same time on the same clock.
But, that seems to be what was happening.  The software is ISE 4.2
(I am still supporting some 5V Spartan stuff, and I'm also cheap).

Any hints about what was happening would be greatly appreciated!

Jon

Jon Elson wrote:

> <snip>
> the whole thing is synchronous, running at 40 MHz on a Spartan 2E,
> except a couple external inputs such as the "inputa" above.
> <snip>

That right there could be your problem. If those inputs aren't synchronous then you could get into some trouble if they change just before a clock edge happens. Some of your state machine flops get the new message, some get the old one, and you've magically got an illegal state.

Can you register those signals for a clock before you use them?

-- 
Rob Gaddi, Highland Technology
Email address is currently out of order

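
A minimal sketch of what registering an asynchronous input could look like, assuming a single clock and one external signal. The entity and signal names (input_sync, async_in, sync_out, meta, stable) are hypothetical and not from the original design. It uses two flip-flops in series rather than the single register suggested above, which is a common variation: only the first flop ever sees the asynchronous edge, so every state flop downstream sees the same registered value.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical two-flop synchronizer for one asynchronous input.
entity input_sync is
  port (
    clk      : in  std_logic;
    async_in : in  std_logic;  -- e.g. the external "inputa" pin
    sync_out : out std_logic   -- version intended for use by the state machine
  );
end entity input_sync;

architecture rtl of input_sync is
  signal meta   : std_logic := '0';  -- first stage, may sample mid-transition
  signal stable : std_logic := '0';  -- second stage, one clock later
begin
  process (clk)
  begin
    if rising_edge(clk) then
      meta   <= async_in;  -- the only flop that samples the async signal
      stable <= meta;      -- gives the first flop a full clock period to settle
    end if;
  end process;

  sync_out <= stable;
end architecture rtl;
```

The state machine would then test sync_out instead of the raw pin, at the cost of two clocks of added latency.
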
Rob Gaddi wrote:
> Jon Elson wrote:
>
>> <snip>
>> the whole thing is synchronous, running at 40 MHz on a Spartan 2E,
>> except a couple external inputs such as the "inputa" above.
>> <snip>
>
> That right there could be your problem. If those inputs aren't
> synchronous then you could get into some trouble if they change just
> before a clock edge happens. Some of your state machine flops get the
> new message, some get the old one, and you've magically got an illegal
> state.
>
> Can you register those signals for a clock before you use them?

In addition to registering all inputs, you also should make sure that the state machine is initialized with a synchronous reset after your DLLs have locked.

-Jeff

"Jon Elson" <elson@wustl.edu> wrote in message 
news:483200F4.3090002@wustl.edu...
> the whole thing is synchronous, running at 40 MHz on a Spartan 2E, except
> a couple external inputs such as the "inputa" above.

Hi Jon,

That's your problem! You have a one hot state machine with five states. This is implemented as five flip-flops (FFs). Your external inputs are asynchronous, so if their transitions happen to be close to the clock transition, you will have a race condition where the signal can get to one or some of the FFs, but not the others. Try this link:

http://en.wikipedia.org/wiki/Race_condition

Cheers, Syms.

p.s. For completeness, I should mention there is a difference between race hazards and metastability. Your circuit can suffer from both, but in your case the race condition is many orders of magnitude more observable than the 'm' word! See CAF passim!

Jeff Cunningham wrote:
> Rob Gaddi wrote:
>
>> Jon Elson wrote:
>>
>>> <snip>
>>> the whole thing is synchronous, running at 40 MHz on a Spartan 2E,
>>> except a couple external inputs such as the "inputa" above.
>>> <snip>
>>
>> That right there could be your problem. If those inputs aren't
>> synchronous then you could get into some trouble if they change just
>> before a clock edge happens. Some of your state machine flops get the
>> new message, some get the old one, and you've magically got an illegal
>> state.
>>
>> Can you register those signals for a clock before you use them?
>
> In addition to registering all inputs, you also should make sure that
> the state machine is initialized with a synchronous reset after your
> DLLs have locked.

No DLLs, just a plain single clock. The state machine and all other hardware does initialize perfectly.

As for registering the inputs, that DOES seem to be the right thing to do, but the binary-coded state version works fine without. Also, the clock rates on this are so low, it seems that this malfunction is happening too frequently. I hadn't thought about the possibility of there being multiple gating paths from the syntax

    if state = x then
      if inputa = '1' then
        state <= y;

to the actual flip-flops of signal "state", but I can see how that would synthesize to such a condition. A pretty narrow window for this to happen, but certainly conceivable.

Thanks, I will do the extra registering of the asynch inputs on the next rev of this!

Jon

Jon Elson wrote:

> As for registering the inputs,
> that DOES seem to be the right thing to do,

It is.

> but the binary-coded state version works fine without.

So far, but wait a while. The temperature may change.

> Also, the
> clock rates on this are so low, it seems that this malfunction is
> happening too frequently.

It's the frequency *difference* that sweeps the setup times and throws the race the wrong way.

-- Mike Treseler

Jon Elson wrote:
> No DLLs, just a plain single clock. The state machine and all other
> hardware does initialize perfectly.
>
> As for registering the inputs, that DOES seem to be the right thing to
> do, but the binary-coded state version works fine without.

That may have more to do with the implicit ELSE handling, i.e. one state engine locks solid, the other will recover in a few clocks (which means you may not notice, or have not yet noticed, the effects!).

Even with input registering, you should cover ALL states (including the 'illegal' ones) in your state code.

> Also, the
> clock rates on this are so low, it seems that this malfunction is
> happening too frequently.

Can you clarify 'too frequently'?
With a 25ns clock, a couple of IPs and 5 choices, let's take a nice round 100ns IP sample rate (10MHz).
An aperture effect of 1ns would be hit 1:100, or on average every 10us.
A more likely 100ps aperture would hit 1:1000, or on average every 100us, or 10,000 times a second (assumes random hits).
Take your true IP sample rate, and reported timing skews, and get a more accurate prediction.

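
The estimate above can be written out step by step, under the post's own assumptions (one input event every 100 ns, landing at a uniformly random phase relative to the clock). The symbol names are introduced here for illustration only:

```latex
\[
p = \frac{t_{\text{aperture}}}{T_{\text{input}}}, \qquad
\bar{t}_{\text{hit}} = \frac{T_{\text{input}}}{p}
\]
\[
t_{\text{aperture}} = 1\,\text{ns}: \quad
p = \frac{1\,\text{ns}}{100\,\text{ns}} = \frac{1}{100}, \quad
\bar{t}_{\text{hit}} = 100 \times 100\,\text{ns} = 10\,\mu\text{s}
\]
\[
t_{\text{aperture}} = 100\,\text{ps}: \quad
p = \frac{0.1\,\text{ns}}{100\,\text{ns}} = \frac{1}{1000}, \quad
\bar{t}_{\text{hit}} = 1000 \times 100\,\text{ns} = 100\,\mu\text{s}
\;\Rightarrow\; 10^{4}\ \text{hits per second}
\]
```
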
> I hadn't thought about the possibility of
> there being multiple gating paths from the syntax
>
> if state = x then
> if inputa = '1' then
> state <= y;
>
> to the actual flip-flops of signal "state", but I can see how that would
> synthesize to such a condition. A pretty narrow window for this to
> happen, but certainly conceivable.
>
> Thanks, I will do the extra registering of the asynch inputs on the next
> rev of this!

There could be a case for the tools to
a) Warn on async state conditions
b) Warn that illegal/ELSE options are not covered

-jg

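
Covering the illegal/ELSE case might look roughly like the following. This is a minimal sketch, assuming explicit 3-bit binary state codes of the kind Jon describes. The entity name, the constants ST_A to ST_E, and every transition other than a to b are hypothetical placeholders, and inputa is assumed to have been registered already.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity fsm_sketch is
  port (
    clk     : in  std_logic;
    inputa  : in  std_logic;         -- assumed already synchronized to clk
    outputa : out std_logic := '0'
  );
end entity fsm_sketch;

architecture rtl of fsm_sketch is
  -- Five used codes out of eight; "101", "110" and "111" are unused.
  constant ST_A : std_logic_vector(2 downto 0) := "000";  -- reset state
  constant ST_B : std_logic_vector(2 downto 0) := "001";
  constant ST_C : std_logic_vector(2 downto 0) := "010";
  constant ST_D : std_logic_vector(2 downto 0) := "011";
  constant ST_E : std_logic_vector(2 downto 0) := "100";
  signal state : std_logic_vector(2 downto 0) := ST_A;
begin
  process (clk)
  begin
    if rising_edge(clk) then
      case state is
        when ST_A =>
          if inputa = '1' then
            state   <= ST_B;
            outputa <= '1';
          end if;
        when ST_B => state <= ST_C;  -- placeholder transitions
        when ST_C => state <= ST_D;
        when ST_D => state <= ST_E;
        when ST_E =>
          state   <= ST_A;
          outputa <= '0';
        when others =>
          -- Any unused code falls back to the reset state instead of
          -- locking up.
          state   <= ST_A;
          outputa <= '0';
      end case;
    end if;
  end process;
end architecture rtl;
```

With this structure a corrupted state costs one clock in the others branch rather than a permanent lockup, which also means the underlying race can go unnoticed.
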
"Jon Elson" <elson@wustl.edu> wrote in message 
news:483324F1.4030005@wustl.edu...
> As for registering the inputs, that DOES seem to be the right thing to do,
> but the binary-coded state version works fine without.

Hi Jon,

You do realise that every build can have different timing? If you're saying DOES because of your P&R results, you MAY BE mistaken.

HTH., Syms.
Jon Elson wrote:

>> In addition to registering all inputs, you also should make sure that
>> the state machine is initialized with a synchronous reset after your
>> DLLs have locked.
>
> No DLLs, just a plain single clock. The state machine and all other
> hardware does initialize perfectly.

Just be aware that even without DLLs to worry about, the internal reset is asynchronous and can cause problems if the state bits see it go away on different clock pulses, i.e. it is another asynchronous input to your state machine. I generally always do something like this:

    -- shift register that fills with '1's after configuration
    signal resetv : std_logic_vector(2 downto 0) := "000";

    process(clk)
    begin
      if rising_edge(clk) then
        resetv <= resetv(1 downto 0) & '1';
      end if;
    end process;

    -- held high until resetv(2) becomes '1', then released synchronously
    state_machine_reset <= not resetv(2);

-Jeff

Jim Granville wrote:
> That may have more to do with the implicit ELSE handling.
> ie One State engine locks solid, the other will recover
> in a few clocks (which means you may not notice, or have not
> yet noticed the effects!)
>
> Even with input registering, you should cover ALL states,
> (including the 'illegal' ones) in your state code.

That's pretty easy to do with binary coded states, but with one-hot, and enumerating the type, how do you even SPECIFY the illegal states, as those, by definition, would be the ones with two or more bits "hot"?

> Can you clarify 'too frequently' ?
> With a 25ns clock, a couple of IPs and 5 choices, lets
> take a nice round 100ns IP sample rate. (10MHz)

The external signals, all two of them, are from a mechanical system, and change slowly.

Thanks,

Jon
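
One possible way to make the illegal one-hot codes expressible in source is to drop the enumerated type and declare the one-hot vector by hand, so that every pattern with zero or several bits set falls into a when others branch. This is a sketch under assumptions, not something proposed in the thread: the entity name, constants, illegal flag and placeholder transitions are all hypothetical. Whether a particular synthesis tool keeps this recovery logic depends on its FSM extraction and encoding settings, which the sketch does not address.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity onehot_sketch is
  port (
    clk     : in  std_logic;
    inputa  : in  std_logic;        -- assumed already synchronized to clk
    illegal : out std_logic := '0'  -- pulses when a bad code was caught
  );
end entity onehot_sketch;

architecture rtl of onehot_sketch is
  subtype state_t is std_logic_vector(4 downto 0);
  -- Hand-written one-hot codes, one bit per state.
  constant ST_A : state_t := "00001";  -- reset state
  constant ST_B : state_t := "00010";
  constant ST_C : state_t := "00100";
  constant ST_D : state_t := "01000";
  constant ST_E : state_t := "10000";
  signal state : state_t := ST_A;
begin
  process (clk)
  begin
    if rising_edge(clk) then
      illegal <= '0';
      case state is
        when ST_A =>
          if inputa = '1' then
            state <= ST_B;
          end if;
        when ST_B => state <= ST_C;  -- placeholder transitions
        when ST_C => state <= ST_D;
        when ST_D => state <= ST_E;
        when ST_E => state <= ST_A;
        when others =>
          -- Reached for any code that is not exactly one of the five
          -- constants: no bit set, or two or more bits set.
          state   <= ST_A;
          illegal <= '1';
      end case;
    end if;
  end process;
end architecture rtl;
```

Routing illegal to an LED or a counter would also show how often the race actually occurs, rather than only recovering from it.
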