Hi all! I'd like to once again bring up the subject of state machines running into illegal states (illegal in the sense that the state vector does not correspond to any of the states defined in the VHDL code). Despite having spent half a day googling and reading related threads, I'm still left with a couple of questions:

1. Most discussions cover how to recover from illegal states, but few cover how it actually happens. What are the (I presume) electrical reasons that a state machine runs into an illegal state in the first place? Is there anything one can do to reduce the risk? Assume all FSM inputs connected to I/O pins are synchronized with one FF each, and the whole design is synchronous. Does anyone know of a good tutorial on this issue? I could add that in my case, the transition into an illegal state, if it happens, almost always happens immediately upon startup of the system.

2. How can I force Xilinx XST (6.2 SP3) to produce a safe FSM that recovers from an illegal state? A "when others => state <= IDLE;" clause doesn't seem to help (which I think is stupid; isn't this problem so well known that they should make XST recognize it instead of optimizing it away?). I realize that changing the coding style to "Binary" will reduce the number of illegal states and thus the risk of entering one, but it's not completely safe unless the number of states is a power of two. What's more, the binary coding style seems to increase slice utilization for the whole design by up to 10%.

Thanks in advance!

/Jerker
FSM in illegal state
Started by ●July 7, 2004
Reply by ●July 7, 2004
<jerkerNO@SPAMdst.se> wrote:
> 1. Most discussions cover how to recover from illegal states, but few cover how it actually happens. What are the (I presume) electrical reasons that a state machine runs into an illegal state in the first place? Is there anything one can do to reduce the risk? Assume all FSM inputs connected to I/O pins are synchronized with one FF each, and the whole design is synchronous. Does anyone know of a good tutorial on this issue? I could add that in my case, the transition into an illegal state almost always happens immediately upon startup of the system, if it happens.

How is the reset signal handled? If the FFs are asynchronously reset, then the end of reset can happen at different times to different FFs, leading to an illegal state.

--
Phil Hays
Phil-hays at posting domain should work for email
Reply by ●July 7, 2004
Phil Hays <Spampostmaster@comcast.net> writes:
> <jerkerNO@SPAMdst.se> wrote:
>
>> 1. Most discussions cover how to recover from illegal states, but few cover how it actually happens. What are the (I presume) electrical reasons that a state machine runs into an illegal state in the first place? Is there anything one can do to reduce the risk? Assume all FSM inputs connected to I/O pins are synchronized with one FF each, and the whole design is synchronous. Does anyone know of a good tutorial on this issue? I could add that in my case, the transition into an illegal state almost always happens immediately upon startup of the system, if it happens.
>
> How is the reset signal handled? If the FFs are asynchronously reset, then the end of reset can happen at different times to different FFs, leading to an illegal state.

Internal noise coupling in the chip (crosstalk), power drops, alpha particles, not properly double-sync'ing an async signal before using it in two different places (BTDT, seen it in a real chip), ... the list goes on!

--Kai
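The double-sync point above can be sketched as a two-flip-flop synchronizer. This is only an illustration: the entity name `sync_2ff` and its port names are made up here and do not come from anyone's design in this thread. The idea is that every consumer in the clock domain reads the single `sync_out` signal, so no two places can sample the raw asynchronous input and disagree about its value.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical two-stage synchronizer (names are illustrative only).
entity sync_2ff is
  port (
    clk      : in  std_logic;
    async_in : in  std_logic;   -- signal from outside the clk domain
    sync_out : out std_logic    -- the only copy the rest of the design should use
  );
end entity sync_2ff;

architecture rtl of sync_2ff is
  signal meta : std_logic := '0';  -- first stage, may go metastable
  signal sync : std_logic := '0';  -- second stage, gives the first a full cycle to settle
begin
  process (clk)
  begin
    if rising_edge(clk) then
      meta <= async_in;
      sync <= meta;
    end if;
  end process;

  sync_out <= sync;
end architecture rtl;
```

If the raw input instead feeds the next-state logic of two different state bits directly, each flip-flop may resolve the same edge differently, which is one way a one-hot register can end up with zero or two bits set.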
Reply by ●July 7, 2004
"Jerker Hammarberg (DST)" wrote:
>
> Hi all! I'd like to once again bring up the subject of state machines running into illegal states (illegal in the sense that the state vector does not correspond to any of the states defined in the VHDL code), because despite having spent half a day googling and reading related threads, I'm still left with a couple of questions:
>
> 1. Most discussions cover how to recover from illegal states, but few cover how it actually happens. What are the (I presume) electrical reasons that a state machine runs into an illegal state in the first place? Is there anything one can do to reduce the risk? Assume all FSM inputs connected to I/O pins are synchronized with one FF each, and the whole design is synchronous. Does anyone know of a good tutorial on this issue? I could add that in my case, the transition into an illegal state almost always happens immediately upon startup of the system, if it happens.

I have never had a design with a state machine which got into illegal states. The only two reasons that I can think of for this happening are 1) electrical noise, which would also upset *other* FFs in the system and cause other symptoms, and 2) timing issues with the FSM. The latter can come either from async inputs (metastability) or from failing to meet setup time on a register input. If you have done your static timing analysis correctly, then it must be a metastability issue.

The fact that it happens on startup says to me it is a timing issue. If you can chase the problem away by slowing your clock, then it is a static timing issue. If it persists, then you most likely have a metastability issue. Figure out what is wrong and deal with the cause of the problem.

> 2. How can I force Xilinx XST (6.2 SP3) to produce a safe FSM that recovers from an illegal state? A "when others => state <= IDLE;" clause doesn't seem to help (which I think is stupid, isn't this problem so well known that they should make XST recognize it instead of optimizing it away?). I realize that changing coding style to "Binary" will reduce the number of illegal states and thus the risk for it to happen, but it's not completely safe, unless the number of states is a power of two. What's more, binary coding style seems to increase slice utilization for the whole design by up to 10%.

I am not a fan of dealing with this type of problem by illegal state recognition. If the FSM gets into an illegal state, it has most likely already caused a malfunction in the rest of the circuit. Getting back to a known state is only useful in that it can resume normal operation. But it is not a "fix".

I am unclear as to why the others clause would not result in recovery from an illegal state. That could very easily add a lot of extra logic and even slow the max speed of the FSM. But it should not be optimized away since it is a specified part of the machine. I assume the illegal state detection works in simulation, no? If so, it should work in operation.

--
Rick "rickman" Collins
rick.collins@XYarius.com
Ignore the reply address. To email me use the above address with the XY removed.

Arius - A Signal Processing Solutions Company
Specializing in DSP and FPGA design
URL http://www.arius.com
4 King Ave                301-682-7772 Voice
Frederick, MD 21701-3110  301-682-7666 FAX
Reply by ●July 7, 2004
> How is the reset signal handled? If the FFs are asynchronously reset, then the end of reset can happen at different times to different FFs, leading to an illegal state.

Hi Phil! I use no reset signal at all; instead I specify initial values for all signals by the declarations, which is supposed to work fine with XST. But your point is still interesting in case I would need to introduce an asynchronous reset some day. Does that mean one should avoid them if illegal states are a concern?

/Jerker
Reply by ●July 7, 2004
> Internal noise coupling in the chip (crosstalk), power drops, alpha particles, not properly double-sync'ing an async signal before using it in two different places (BTDT, seen it in a real chip), ... the list goes on!

OK, probably I would need the complete list with full descriptions! Do you know of any books or tutorials on this subject? I'm not really an electrical engineer but I have to deal with this, so any pointers would be appreciated.

/Jerker
Reply by ●July 7, 2004
rickman wrote:
>
> I am unclear as to why the others clause would not result in recovery from an illegal state. That could very easily add a lot of extra logic and even slow the max speed of the FSM. But it should not be optimized away since it is a specified part of the machine. I assume the illegal state detection works in simulation, no? If so, it should work in operation.
>

The synthesis tools are "smart" enough to recognize that there is no logical way to reach the "others" state. Therefore it is optimized out. Many synthesis tools do this.

--
My real email is akamail.com@dclark (or something like that).
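To make the "no logical way to reach others" point concrete: when the state signal is an enumerated type and every literal has its own `when` branch, the `others` choice covers no values at all, so neither simulation nor synthesis has anything to exercise. The sketch below is one hedged alternative, with the state held in an explicitly encoded `std_logic_vector` so that the unused code "11" is a real value the `others` branch handles. All names (`fsm_explicit`, `go`, `busy`, `WORK`, `DONE`) are invented for illustration, and whether a given synthesis tool keeps the branch still depends on its FSM extraction and re-encoding settings, which are not covered here.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical three-state FSM with an explicit binary encoding.
entity fsm_explicit is
  port (
    clk  : in  std_logic;
    go   : in  std_logic;
    busy : out std_logic
  );
end entity fsm_explicit;

architecture rtl of fsm_explicit is
  subtype state_t is std_logic_vector(1 downto 0);
  constant IDLE : state_t := "00";
  constant WORK : state_t := "01";
  constant DONE : state_t := "10";
  -- "11" is deliberately unused; it is the illegal code.
  signal state : state_t := IDLE;
begin
  process (clk)
  begin
    if rising_edge(clk) then
      case state is
        when IDLE =>
          if go = '1' then
            state <= WORK;
          end if;
        when WORK =>
          state <= DONE;
        when DONE =>
          state <= IDLE;
        when others =>
          -- Covers "11" (and metavalues in simulation): fall back to IDLE.
          state <= IDLE;
      end case;
    end if;
  end process;

  busy <= '0' when state = IDLE else '1';
end architecture rtl;
```

In simulation the recovery path can be checked by forcing `state` to "11" and confirming it returns to IDLE on the next clock edge. Checking the synthesized netlist, as suggested elsewhere in this thread, is still the only way to know what the tool actually built.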
Reply by ●July 7, 2004
Hi Jerker,

Well, you _do_ have a reset. It's just a little hidden. At some point the storage elements stop being held at the initial values. This is the equivalent of your reset being released. If the elements are being clocked asynchronously to the release signal at this time, you could be in trouble.

Cheers, Syms.

"Jerker Hammarberg (DST)" <jerkerNO@SPAMdst.se> wrote in message news:hzZGc.4458$dx3.36217@newsb.telia.net...
> > How is the reset signal handled? If the FFs are asynchronously reset, then the end of reset can happen at different times to different FFs, leading to an illegal state.
>
> Hi Phil! I use no reset signal at all; instead I specify initial values for all signals by the declarations, which is supposed to work fine with XST. But your point is still interesting in case I would need to introduce an asynchronous reset some day. Does that mean one should avoid them if illegal states are a concern?
>
> /Jerker
Reply by ●July 7, 2004
"Jerker Hammarberg (DST)" <jerkerNO@SPAMdst.se> wrote:
> > How is the reset signal handled? If the FFs are asynchronously reset, then the end of reset can happen at different times to different FFs, leading to an illegal state.
>
> Hi Phil! I use no reset signal at all; instead I specify initial values for all signals by the declarations, which is supposed to work fine with XST.

You do have an asynchronous reset, you just didn't know that you did. When a Xilinx FPGA finishes the program download, it has all initial values held until an internal signal is released. This release is asynchronous to your clock.

To avoid problems with this add a counter that is reset to all zeros. Until that counter counts to 15, keep the state machine in the initial state. (Note: Startup is a messy subject. This is a simplified version.)

There is another common issue with DCMs or DLLs that you might also be having a problem with. Are you using a DCM or a DLL?

> But your point is still interesting in case I would need to introduce an asynchronous reset some day. Does that mean one should avoid them if illegal states are a concern?

Yes. Suppose the initial state is "100" and the desired next state is "010". This would be a three state one-hot machine.

If the first bit is held until just after the first edge of the clock and the second bit is held until just before the first edge of the clock, then the next state will be illegal, "110".

If the first bit is held until just before the first edge of clock and the second bit is held until just after the first edge of the clock, then the next state will be illegal, "000".

Does that make it clear?

--
Phil Hays
Phil-hays at posting domain should work for email
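The startup counter described above might look roughly like the following. This is a simplified sketch under stated assumptions, not a complete startup scheme: the entity `fsm_startup_hold`, the signal `hold_cnt`, and the three-state one-hot sequence are all invented for illustration, and the declaration initial values are assumed to be honored by the tool, as reported for XST earlier in the thread.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical one-hot FSM held in its initial state for the first 15 clocks.
entity fsm_startup_hold is
  port (
    clk    : in  std_logic;
    onehot : out std_logic_vector(2 downto 0)
  );
end entity fsm_startup_hold;

architecture rtl of fsm_startup_hold is
  signal hold_cnt : unsigned(3 downto 0) := (others => '0');
  signal state    : std_logic_vector(2 downto 0) := "100";
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if hold_cnt /= 15 then
        -- Still starting up: count, and keep reloading the initial state.
        hold_cnt <= hold_cnt + 1;
        state    <= "100";
      else
        -- Normal operation: 100 -> 010 -> 001 -> 100.
        case state is
          when "100"  => state <= "010";
          when "010"  => state <= "001";
          when "001"  => state <= "100";
          when others => state <= "100";
        end case;
      end if;
    end if;
  end process;

  onehot <= state;
end architecture rtl;
```

The reasoning behind the design: while `hold_cnt` is below 15 the state register is loaded with the value it already holds, so it does not matter which state bit is released first. The counter starts at zero and only its least significant bit toggles on the first count, so release skew between its bits should at worst delay the start by a cycle rather than produce a wrong count.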
Reply by ●July 7, 2004
Duane Clark wrote:
> rickman wrote:
>
>> I am unclear as to why the others clause would not result in recovery from an illegal state. That could very easily add a lot of extra logic and even slow the max speed of the FSM. But it should not be optimized away since it is a specified part of the machine. I assume the illegal state detection works in simulation, no? If so, it should work in operation.
>
> The synthesis tools are "smart" enough to recognize that there is no logical way to reach the "others" state. Therefore it is optimized out. Many synthesis tools do this.

Wow, isn't software clever... and it probably does not tell you it did this either... but never mind, the other states have no logical pathways, so everything will be OK.

Back to the real engineering world: Do LOOK at the resultant output of your tools, and HOW it actually built the FSM. It can use .D or .T registers, with .D the most common.

Implicit in most .D coding is that state 00000 is the go-to state from any illegal one. Thus for many reasons (hopefully very rare) you MIGHT go to an illegal state, but the one after that will be 00000. This should be a cornerstone state of your legal state list, either the POR state or the safe-idle state.

Choosing gray-code-related states can reduce the pathways to illegal states, but in complex FSMs this is not always possible.

You should not rely on this recovery pathway in regular system operation; it should be a safety net. During tests, you could INC a counter when passing through 00000.

.T register state engines can be smaller, but they also can literally stick at an illegal state.

-jg
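The test-time counter mentioned above could be sketched as a small observer beside the FSM. Everything here is hypothetical: the entity `recovery_counter`, the 5-bit `state` port, and the 8-bit saturating count are chosen only for illustration. It also assumes 00000 is entered only at power-on or as the recovery state; if 00000 is a regular idle state in your machine, this would count ordinary returns to idle as well.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical debug observer: counts entries into state 00000.
entity recovery_counter is
  port (
    clk        : in  std_logic;
    state      : in  std_logic_vector(4 downto 0);  -- FSM state register, observed only
    recoveries : out std_logic_vector(7 downto 0)
  );
end entity recovery_counter;

architecture rtl of recovery_counter is
  constant SAFE : std_logic_vector(4 downto 0) := "00000";
  signal prev  : std_logic_vector(4 downto 0) := SAFE;
  signal count : unsigned(7 downto 0) := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      prev <= state;
      -- Count a transition from any other code into SAFE; saturate at 255.
      if state = SAFE and prev /= SAFE and count /= 255 then
        count <= count + 1;
      end if;
    end if;
  end process;

  recoveries <= std_logic_vector(count);
end architecture rtl;
```

A nonzero count after a test run would then suggest the safety net was used, which is a cue to go looking for the root cause rather than a fix in itself.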





