Hi group,
I have a clock coming into my FPGA driving a synchronous state machine.
My investigation would lead me to believe that there is a
spurious edge which my SM reacts to, thus causing erroneous data to be
latched.
I want to sample the clock and verify that the pulse I'm seeing is a
true pulse. I'm thinking about something simple like making sure the
incoming pulse is high for 3 or 4 sample clock periods and if so just
pass the clock through. My implementation would shift in the sampled
clock and the result of that shift register would gate the incoming clock.
Any thoughts?
Thank you,
Rob
Sampling a clock
Started by ●December 9, 2008
Reply by ●December 9, 2008
On Dec 9, 8:46 pm, Rob <buz...@leavemealone.com> wrote:
> Hi group,
>
> I have a clock coming into my FPGA driving a synchronous state machine.
> My investigation would lead me to believe that there is a
> spurious edge which my SM reacts to, thus causing erroneous data to be
> latched.
>
> I want to sample the clock and verify that the pulse I'm seeing is a
> true pulse. I'm thinking about something simple like making sure the
> incoming pulse is high for 3 or 4 sample clock periods and if so just
> pass the clock through. My implementation would shift in the sampled
> clock and the result of that shift register would gate the incoming clock.
>
> Any thoughts?
>

That would be a waste of time. The clock is coming into the FPGA so you should verify that it is good using an oscilloscope. No matter what you found out if you were to implement some sort of filter as you described you would still need to validate that the clock *really* is doing something odd. Once you've done that the proper solution is to fix the glitch in the clock, not try to filter it.

The other thing is to simply re-check your state machine logic. The most common cause for a timing failure is setup time violations. Every state machine input needs to be driven by logic that is clocked by that same clock signal. Any input that is not will eventually cause a state machine to do something unusual with nearly 100% certainty.

What symptoms were you seeing that led you to believe that a spurious clock was the culprit and what is your rationale for ruling out setup or hold time violations or crossing clock domains?

Kevin Jennings
Reply by ●December 10, 2008
KJ wrote:
> On Dec 9, 8:46 pm, Rob <buz...@leavemealone.com> wrote:
>> Hi group,
>>
>> I have a clock coming into my FPGA driving a synchronous state machine.
>> My investigation would lead me to believe that there is a
>> spurious edge which my SM reacts to, thus causing erroneous data to be
>> latched.
>>
>> I want to sample the clock and verify that the pulse I'm seeing is a
>> true pulse. I'm thinking about something simple like making sure the
>> incoming pulse is high for 3 or 4 sample clock periods and if so just
>> pass the clock through. My implementation would shift in the sampled
>> clock and the result of that shift register would gate the incoming clock.
>>
>> Any thoughts?
>>
>
> That would be a waste of time. The clock is coming into the FPGA so
> you should verify that it is good using an oscilloscope. No matter
> what you found out if you were to implement some sort of filter as you
> described you would still need to validate that the clock *really* is
> doing something odd. Once you've done that the proper solution is to
> fix the glitch in the clock, not try to filter it.
>
> The other thing is to simply re-check your state machine logic. The
> most common cause for a timing failure is setup time violations.
> Every state machine input needs to be driven by logic that is clocked
> by that same clock signal. Any input that is not will eventually
> cause a state machine to do something unusual with nearly 100%
> certainty.
>
> What symptoms were you seeing that led you to believe that a spurious
> clock was the culprit and what is your rationale for ruling out setup
> or hold time violations or crossing clock domains?
>
> Kevin Jennings

I'm trying to prove my suspicion at this point. I have no access to board clock trace (other than a via) so doing some filtering on there is out of the question. Of course the proper solution would be to fix the glitch in the clock; and if that is the case we will re-spin the board.
But at this point I need to ID the root cause.

This is an extremely slow interface so I am ruling out setup time violations. The clock is under 6kHz and drives into the global clock net on the FPGA. There are 25 clock pulses per transmission. The incoming clock drives a counter and the SM. The SM transitions based on the current state and the value of the counter. And yes, the whole SM process is clocked by the same clock. This is receiving data from a 3 wire serial bus: clock, enable, and data. The FPGA is a slave in this scenario.

Have I seen evidence of glitching on the board net--no. I have a fairly decent scope, too: 5GS / 500MHz bandwidth; so if something was there one would think I would see it. So again I'm just trying to rule things out at this time. Most of our boards work but we have a few that exhibit this problem; which leads me to believe it is not the logic but an environmental cause.

I appreciate your time,
Rob
Reply by ●December 10, 2008
In comp.arch.fpga, Rob <buzoff@leavemealone.com> wrote:
>
> This is an extremely slow interface so I am ruling out setup time
> violations. The clock is under 6kHz and drives into the global clock
> net on the FPGA. There are 25 clock pulses per transmission. The
> incoming clock drives a counter and the SM. The SM transitions based on
> the current state and the value of the counter. And yes, the whole SM
> process is clocked by the same clock. This is receiving data from a 3
> wire serial bus: clock , enable, and data. The FPGA is a slave in this
> scenario.
>
> Have I seen evidence of glitching on the board net--no. I have a fairly
> decent scope, too: 5GS / 500MHz bandwidth; so if something was there one
> would think I would see it. So again I'm just trying to rule things out
> at this time. Most of our boards work but we have a few that exhibit
> this problem; which leads me to believe it is not the logic but an
> environmental cause.

Have you looked at the rise and fall times of your signals? I know there are maximum rise and fall times in the Xilinx Spartan datasheets and suspect this to be the case for other FPGAs as well.

--
Stef (remove caps, dashes and .invalid from e-mail address to reply by mail)

New Hampshire law forbids you to tap your feet, nod your head, or in any way keep time to the music in a tavern, restaurant, or cafe.
Reply by ●December 10, 2008
On 2008-12-10, Mike Treseler <mtreseler@gmail.com> wrote:
> Rob wrote:
>
>> The clock is under 6kHz and drives into the global clock
>> net on the FPGA. There are 25 clock pulses per transmission.
>
> Then this is an input.
> The synchronous clock should be constant.
>
> -- Mike Treseler

So if I understand this correctly you have one internal FPGA clock running in the MHz range (I guess anyway). Then you have an input clock that you try to use to drive some internal synchronous logic. This is a sure recipe for disaster.

First of all remember to synchronize all asynchronous inputs.

Async --> [ FF driven with internal clock ] --> To logic.

If you need to check for a clock edge on the incoming clock, do as follows.

module(input wire clk_i,
       //At least double the speed of the slow clock.
       input wire in_slow_clk, ...);

logic sync_slow_clk;
logic old_in_clk;
logic slow_rising_edge;

always_ff@(posedge clk_i) begin
   sync_slow_clk <= in_slow_clk;   //Sample the asynchronous input
   old_in_clk <= sync_slow_clk;    //Previous sample
end

//High for one clk_i cycle when the sampled slow clock goes 0 -> 1
assign slow_rising_edge = sync_slow_clk & ~old_in_clk;

Then write all of your logic depending on the rising edge of the slow clock like this:

always_ff@(posedge clk_i) begin
  if(rst_i) begin //If needed
     //Do reset stuff
  end else
  if(slow_rising_edge) begin
    //Do synchronous logic depending on slow clock
  end
end
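For readers who want a self-contained version of the idea above, here is a minimal sketch, not the poster's original code. It assumes the fast clock is many times faster than the slow one, and it adds a second synchronizer flop so that the edge compare only ever sees values that have had a full cycle to settle. The module name `slow_clk_edge_detect` and the `sync` register are made up for illustration.

```systemverilog
// Sketch only: module and signal names are hypothetical.
module slow_clk_edge_detect (
    input  logic clk_i,            // fast internal clock (assumed much faster than in_slow_clk)
    input  logic in_slow_clk,      // asynchronous slow clock arriving from the pin
    output logic slow_rising_edge  // one clk_i-cycle pulse per rising edge of in_slow_clk
);
    // sync[0]: first sample, may go metastable
    // sync[1]: settled, synchronized value
    // sync[2]: synchronized value from one cycle earlier
    logic [2:0] sync;

    always_ff @(posedge clk_i) begin
        sync <= {sync[1:0], in_slow_clk};
    end

    // Rising edge: settled value is 1 now and was 0 one cycle ago
    assign slow_rising_edge = sync[1] & ~sync[2];
endmodule
```

The extra flop costs one more `clk_i` cycle of latency, which is negligible next to a clock in the kHz range.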
Reply by ●December 10, 2008
On Dec 9, 11:27 pm, Rob <buz...@leavemealone.com> wrote:
>
> I'm trying to prove my suspicion at this point. I have no access to
> board clock trace (other than a via) so doing some filtering on there is
> out of the question. Of course the proper solution would be to fix the
> glitch in the clock; and if that is the case we will re-spin the board.
> But at this point I need to ID the root cause.

I agree, and a scope is the only thing that will be the smoking gun that proves the clock has a problem that needs fixing. If the scope shows...
- The transitions are all getting to valid logic levels
- Rising and falling edges are monotonic
- No glitches or runt pulses
- Rise and fall times are neither too fast nor too slow for the receiving part

Then you would best spend your time looking elsewhere. Each of these things can be specifically triggered on without much difficulty, so if you can cause your FPGA's state machine to get into the bad state and none of those conditions occurred, the clock likely isn't the cause.

>
> This is an extremely slow interface so I am ruling out setup time
> violations. The clock is under 6kHz and drives into the global clock
> net on the FPGA.

Have you verified with a scope that the actual system is meeting whatever setup and hold time requirements are reported in the FPGA's timing report? The clock speed being slow does not imply that inputs are arriving at the proper time.

> There are 25 clock pulses per transmission. The
> incoming clock drives a counter and the SM. The SM transitions based on
> the current state and the value of the counter. And yes, the whole SM
> process is clocked by the same clock. This is receiving data from a 3
> wire serial bus: clock , enable, and data. The FPGA is a slave in this
> scenario.
>

My question in the first post was regarding inputs to the state machine and whether or not they are clocked from the same clock as the state machine.

As you've described things, it sounds like you have at least two clocks in your design since you said in the OP...

"I want to sample the clock and verify that the pulse I'm seeing is a true pulse. I'm thinking about something simple like making sure the incoming pulse is high for 3 or 4 sample clock periods"

If the input clock is the only clock in the FPGA there could be no concept of sampling it to see if it was high for 3 or 4 clock periods since by definition it would only be high for roughly 1/2 of a clock cycle (assuming 50% duty cycle).

So if there are two clock domains in the FPGA design, then are there *any* signals that are input signals to the state machine that come from the other clock domain? If so, they are the culprits regardless of how infrequently those signals may or may not occur.

> Have I seen evidence of glitching on the board net--no. I have a fairly
> decent scope, too: 5GS / 500MHz bandwidth; so if something was there one
> would think I would see it. So again I'm just trying to rule things out
> at this time.

The scope would rule it out by checking for each of the conditions mentioned earlier. The other thing to try would be to modify the FPGA design (if necessary) to bring out some signal that clearly indicates when the state machine has gone into the bad state. Then use that signal as a trigger and look at the clock on the scope. If there is no evidence of anything unusual on the clock prior to the trigger then you can definitively rule out the clock, since the scope will be showing you precisely what the clock looked like at the time when things went bad.

> Most of our boards work but we have a few that exhibit
> this problem; which leads me to believe it is not the logic but an
> environmental cause.
>

When things work 'most of the time' (i.e. not 'always') or 'some' boards work (but not 'all') the root cause is darn near always timing violations.

The other root cause is power delivery. Although this is less frequent it is worth verifying that power to the parts is within the +/- 5% spec.

If heating or cooling some part on the circuit boards causes *any* change (i.e. breaks a working board or fixes a broken one for some period of time) then the cause is a timing violation.

Kevin Jennings
Reply by ●December 10, 2008
Rob wrote:
> The clock is under 6kHz and drives into the global clock
> net on the FPGA. There are 25 clock pulses per transmission.

Then this is an input.
The synchronous clock should be constant.

-- Mike Treseler
Reply by ●December 10, 2008
On Dec 10, 7:48 am, Per <perk-nos...@isy.liu.se> wrote:
> On 2008-12-10, Mike Treseler <mtrese...@gmail.com> wrote:
>
> > Rob wrote:
>
> >> The clock is under 6kHz and drives into the global clock
> >> net on the FPGA. There are 25 clock pulses per transmission.
>
> > Then this is an input.
> > The synchronous clock should be constant.
>
> > -- Mike Treseler
>
> So if I understand this correctly you have one internal FPGA clock
> running in the MHz range (I guess anyway). Then you have an input
> clock that you try to use to drive some internal synchronous logic.
> This is a sure recipe for disaster.
>
> First of all remember to synchronize all asynchronous inputs.
>
> Async --> [ FF driven with internal clock ] --> To logic.
>
> If you need to check for a clock edge on the incoming clock, do as
> follows.
>
> module(input wire clk_i,
>        //At least double the speed of the slow clock.
>        input wire in_slow_clk, ...);
>
> logic sync_slow_clk;
> logic old_in_clk;
> logic slow_rising_edge;
>
> always_ff@(posedge clk_i) begin
>    sync_slow_clk <= in_slow_clk;
>    old_in_clk <= sync_slow_clk;
> end
>
> assign slow_rising_edge = sync_slow_clk & ~old_in_clk;
>
> Then write all of your logic depending on the rising edge of the slow
> clock like this:
>
> always_ff@(posedge clk_i) begin
>   if(rst_i) begin //If needed
>      //Do reset stuff
>   end else
>   if(slow_rising_edge) begin
>     //Do synchronous logic depending on slow clock
>   end
> end

There are no asynchronous inputs. All the inputs are sync'd to the incoming slow 6kHz clock. The other fast internal FPGA MHz clock is for something completely unrelated and has nothing to do with the SM.

Rob
Reply by ●December 10, 2008
On Dec 10, 11:43 am, Rob <robns...@frontiernet.net> wrote:
>
> There are no asynchronous inputs. All the inputs are sync'd to the
> incoming slow 6kHz clock. The other fast internal FPGA MHz clock is
> for something completely unrelated and has nothing to do with the SM.
>
> Rob
>

Maybe add a free running counter that is clocked by the 6kHz clock, counts from 0 to 24 and then goes back to 0, and is reset only during powerup. Bring the counter out to debug pins. If your original hypothesis is correct that "...there is a spurious edge which my SM reacts to, thus causing erroneous data to be latched." then this counter would at some point end up at a non-zero count between transmissions (I'm assuming here that the transmissions are not continuous and there are identifiable gaps in time during which the state machine is waiting for the next 25 clock transmission to occur). With that setup you can also look for times when the counter changes states quicker than the expected 6 kHz rate. It can be a tad difficult depending on what equipment you can bring to bear and how many debug pins you have in your FPGA, but at least it gives you a direct view of flops inside the device that are being clocked and that will change state with every clock.

Building somewhat further on that approach, if the state machine already has a signal that identifies the 'waiting for a new transmission to occur' time, then you could gate that signal with the counter not being 0 and simply output a '1' when the protocol has been broken. That gives you one specific signal to monitor to trigger the scope.

Either of those approaches though is only meant to give you a known reliable trigger mechanism for a scope or logic analyzer so you can investigate further (if it triggers) or reject your hypothesis (if it does not).

It's also not really clear whether you have such a trigger condition or if you're relying on perhaps some higher level system observation instead. Your original thought to build a pulse monitor was somewhat along these lines; the problem with that approach though is that you're jumping to a solution that has holes in it. Even if your hypothesis is correct you haven't actually verified the root cause problem.

Kevin Jennings
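The debug counter and the single "protocol broken" trigger suggested above could be sketched roughly as follows. This is an illustration under assumptions, not anyone's actual design: the module name `proto_monitor`, the power-up reset `por_n`, and the `sm_idle` signal (taken to be the state machine's "waiting for a new transmission" indication) are all hypothetical.

```systemverilog
// Sketch only: all names are hypothetical.
module proto_monitor (
    input  logic       slow_clk,   // the incoming ~6 kHz clock
    input  logic       por_n,      // power-up reset, active low (assumed available)
    input  logic       sm_idle,    // assumed: high while the SM waits for a new transmission
    output logic [4:0] dbg_count,  // route to debug pins
    output logic       proto_err   // scope / logic analyzer trigger
);
    // Free-running modulo-25 counter, reset only at power-up
    always_ff @(posedge slow_clk or negedge por_n) begin
        if (!por_n)
            dbg_count <= 5'd0;
        else if (dbg_count == 5'd24)
            dbg_count <= 5'd0;
        else
            dbg_count <= dbg_count + 5'd1;
    end

    // After every complete 25-clock transmission the count is back at 0.
    // A non-zero count while idle means the number of clock edges seen
    // so far is not a multiple of 25.
    assign proto_err = sm_idle && (dbg_count != 5'd0);
endmodule
```

Because `proto_err` is combinational it may show short decode glitches when the counter and `sm_idle` change together, so a pulse-width qualified trigger on the scope (or registering the flag) is probably a sensible precaution.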
Reply by ●December 10, 2008
On Dec 10, 12:14 pm, KJ <kkjenni...@sbcglobal.net> wrote:
> On Dec 10, 11:43 am, Rob <robns...@frontiernet.net> wrote:
>
> > There are no asynchronous inputs. All the inputs are sync'd to the
> > incoming slow 6kHz clock. The other fast internal FPGA MHz clock is
> > for something completely unrelated and has nothing to do with the SM.
>
> > Rob
>
> Maybe add a free running counter that is clocked by the 6kHz clock
> that counts from 0 to 24 and then goes back to 0 resetting it only
> during powerup. Bring the counter out to debug pins. If your
> original hypothesis is correct that "...there is a spurious edge which
> my SM reacts to, thus causing erroneous data to be latched." then this
> counter would at some point end up at a non-zero count between
> transmissions (I'm assuming here that the transmissions are not
> continuous and there are identifiable gaps in time during which the
> state machine is waiting for the next 25 clock transmission to
> occur). With that setup you can also look for times when the counter
> changes states quicker than the expected 6 kHz rate. It can be a tad
> difficult depending on what equipment you can bring to bear and how
> many debug pins you have in your FPGA, but at least it gives you a
> direct view of flops inside the device that are being clocked and that
> will change state with every clock.

I already have a counter (that counts from 0 to 24) tied to the incoming 6kHz clock. It is this counter that determines where I am in the serial stream, and thus determines which flop within my data register gets clocked with the incoming data. But yes, I have already thought about doing this.

But I don't need to--read below.

>
> Building somewhat further on that approach, if the state machine
> already has a signal that identifies the 'waiting for a new
> transmission to occur' time, then you could gate that signal with the
> counter being 0 and simply output a '1' when the protocol has been
> broken. That gives you one specific signal to monitor to trigger the
> scope.
>
> Either of those approaches though is only meant to give you a known
> reliable trigger mechanism for a scope or logic analyzer so you can
> investigate further (if it triggers) or reject your hypothesis (if it
> does not).
>
> It's also not really clear whether you have such a trigger condition
> or if you're relying on perhaps some higher level system observation
> instead. Your original thought to build a pulse monitor was somewhat
> along these lines, the problem with that approach though is that
> you're jumping to a solution that has holes to it. Even if your
> hypothesis is correct you haven't actually verified the root cause
> problem.
>

I did build the pulse monitor and the problem got worse. This of course told me that I must be fighting timing that is on the hairy edge, as there is a slight delay through my "filter". So I started looking at the path with much more scrutiny and found a section of the placement that didn't have a lot of margin. I tightened this up and the problems have disappeared.

This obviously makes more sense since I never saw any glitches/runts on the scope. And it also explains why some boards worked and some didn't.

Again, I appreciate the dialogue....
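The thread never shows the pulse monitor itself, so the following is only a guess at the general shape of such a filter, written to illustrate where the "slight delay" comes from. It assumes a sampling clock many times faster than the incoming clock; the module name `pulse_qualifier`, its ports, and the parameter `N` are invented for this sketch.

```systemverilog
// Sketch only: not the poster's actual filter; all names are hypothetical.
module pulse_qualifier #(
    parameter int N = 4            // consecutive agreeing samples required (N >= 2)
) (
    input  logic clk_fast,         // sampling clock, assumed many times faster than clk_in
    input  logic clk_in,           // incoming slow clock from the pin
    output logic clk_qualified     // changes only after N agreeing samples
);
    logic [1:0]   sync;            // two-flop synchronizer
    logic [N-1:0] hist;            // last N synchronized samples

    always_ff @(posedge clk_fast) begin
        sync <= {sync[0], clk_in};
        hist <= {hist[N-2:0], sync[1]};
        if (&hist)
            clk_qualified <= 1'b1; // N highs in a row
        else if (~|hist)
            clk_qualified <= 1'b0; // N lows in a row
        // otherwise hold the previous value (mixed samples are ignored)
    end
endmodule
```

With this structure the output lags the pin by roughly N+3 cycles of `clk_fast`, while the data and enable lines are not delayed at all, which is one plausible way a filter like this could eat into an already thin timing margin. Using `clk_qualified` as a clock enable in the `clk_fast` domain, rather than as a gated clock, would usually be the more conventional choice in an FPGA.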






