FPGARelated.com

Strange Spartan2 behaviour

Started by Nico Coesel September 3, 2008
I'm working on a Spartan2 design which runs on a PCI card or a PCI
Express card (through a PCI Express to PCI bridge chip). Apart from the
PCI Express or PCI interface, the design is almost the same for both
boards.

In some specific types of PCs (mostly server chassis) the design quits
working right away or after a while. Some of the internal
state machines quit even when the option 'Safe implementation' is
turned on. In most PCs (say 99.9%) the design works okay though.

The timing is not very tight. The place&route report shows 8ns timing
margin on the 33MHz PCI clock, which also clocks the internal logic. So
I don't suspect any problems there.

On the PCI board the Spartan2 is powered by an LM1086 1.5A linear
regulator; on the PCI Express board the Spartan2 is powered by the
PC's 3.3V directly. Core power (2.5V) is provided by another LM1086
regulator. Bypassing consists of four 4.7uF tantalum caps (two for 2.5V
and two for 3.3V) and about ten 100nF ceramic caps (0805 or 0603), again
half for 3.3V and the other half for 2.5V. The PCB is a 4-layer board
with power and ground planes.

Someone already implemented some safeguards around the state machines
as a quick hack, which seems to work. This more or less rules out
corruption of the configuration bits.

Any ideas where to look?

-- 
Programming in Almere?
E-mail to nico@nctdevpuntnl (punt=.)
Nico Coesel wrote:

> In some specific type of PCs (mostly server chassis) the design quits
> working right away or after a while.
> Any ideas where to look for?

unregistered inputs
latch warnings
missing external timing constraints

-- Mike Treseler
On Wed, 03 Sep 2008 22:33:14 GMT, nico@puntnl.niks (Nico Coesel) wrote:

>I'm working on a Spartan2 design which runs on a PCI card or a PCI
>Express card (through a PCI-express to PCI bridge chip). Besides the
>PCI express or PCI interface the design is almost the same for both
>boards.
>
>In some specific type of PCs (mostly server chassis) the design quits
>working right away or after a while. Some of the internal
>statemachines quit even when the option 'Safe implementation' is
>turned on. In most PCs (say 99.9%) the design works okay though.
>
>The timing is not very tight. The place&route report shows 8ns timing
>margin on the 33MHz PCI clock which also clocks the internal logic. So
>I suspect no problems there.

Have you confirmed that the PCI clock is actually 33MHz? On some
machines (mostly server motherboards) PCI can be switched to 66MHz
(or 100 or 133 for PCI-X). I have observed a machine booting at 33MHz
and switching to 66MHz during the BIOS startup (it was controllable
via a BIOS setting).

- Brian
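
The clock question can be checked against the numbers in the original post with a little arithmetic. The sketch below is only an estimate: it assumes the 8ns margin was reported against a nominal 33.33MHz constraint (30ns period), it ignores clock uncertainty and I/O timing, and the function names are made up for illustration.

```python
def period_ns(freq_mhz: float) -> float:
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / freq_mhz


def slack_ns(critical_path_ns: float, freq_mhz: float) -> float:
    """Setup slack at a given clock frequency (positive means timing is met)."""
    return period_ns(freq_mhz) - critical_path_ns


# Assumption: the place&route margin of 8ns is relative to a 33.33MHz clock.
CONSTRAINT_MHZ = 100.0 / 3.0      # nominal PCI clock, 30ns period
REPORTED_MARGIN_NS = 8.0          # figure quoted in the original post

# Longest register-to-register path implied by the reported margin.
critical_path_ns = period_ns(CONSTRAINT_MHZ) - REPORTED_MARGIN_NS

# Highest clock frequency at which that path still fits in one period.
max_freq_mhz = 1000.0 / critical_path_ns

for freq in (100.0 / 3.0, 200.0 / 3.0):
    print(f"{freq:6.2f} MHz: slack {slack_ns(critical_path_ns, freq):+.1f} ns")
```

Under these assumptions the critical path is about 22ns, so the logic would stop meeting timing somewhere above roughly 45MHz and would miss by about 7ns on a 66MHz bus. A design with comfortable margin at 33MHz could therefore still fail in a slot that runs faster.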
We had an issue with a "ground bounce" (Spartan3 design) in the bank
containing the PCI-interface, which 'triggered' the asynchronous reset
of the PCI-core...

Do you use async resets?

/Jochen
On Wed, 03 Sep 2008 22:33:14 GMT, nico@puntnl.niks (Nico Coesel)
wrote:

>In some specific type of PCs (mostly server chassis) the design quits
>working right away or after a while. Some of the internal
>statemachines quit even when the option 'Safe implementation' is
>turned on. In most PCs (say 99.9%) the design works okay though.

Random data point from the mediaeval era: I had almost identical
symptoms with a video processing card, about 15 years ago, in
ISA bus. I eventually tracked it down to glitches on the bus
reset line, so narrow that all the old-style ISA cards missed
it, but the fast-ish FPGAs on my board would sometimes see it.

I don't suppose it's the same issue for you. But it's weird
how this kind of thing rears its head from time to time.
-- 
Jonathan Bromley, Consultant

DOULOS - Developing Design Know-how
VHDL * Verilog * SystemC * e * Perl * Tcl/Tk * Project Services

Doulos Ltd., 22 Market Place, Ringwood, BH24 1AW, UK
jonathan.bromley@MYCOMPANY.com
http://www.MYCOMPANY.com

The contents of this message may contain personal views which
are not the views of Doulos Ltd., unless specifically stated.
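
One common way to make logic ignore very narrow reset pulses is to accept the reset only after it has been sampled active on several consecutive clock edges. Nobody in this thread says this was done; the following is just a behavioural Python model of that idea, assuming the reset is sampled synchronously rather than wired to asynchronous clear pins. The class name and the depth of 3 are arbitrary choices for illustration.

```python
class ResetGlitchFilter:
    """Behavioural model: reset is accepted only after `depth`
    consecutive clock edges on which the raw reset was sampled active."""

    def __init__(self, depth: int = 3):
        self.depth = depth
        self.count = 0  # consecutive edges with the raw reset active

    def clock(self, raw_reset_active: bool) -> bool:
        """Call once per clock edge; returns the filtered reset."""
        if raw_reset_active:
            # Saturate so a long reset does not grow the counter forever.
            self.count = min(self.count + 1, self.depth)
        else:
            # Any inactive sample restarts the count.
            self.count = 0
        return self.count >= self.depth


def run(samples, depth=3):
    """Feed a list of per-clock samples through a fresh filter."""
    glitch_filter = ResetGlitchFilter(depth)
    return [glitch_filter.clock(sample) for sample in samples]


# A one- or two-cycle glitch never reaches the filtered output...
glitch = [False, True, False, False, True, True, False]
# ...while a reset held for several cycles does, after a short delay.
real_reset = [False, True, True, True, True, False]

print(run(glitch))
print(run(real_reset))
```

The price of this approach is latency: a genuine reset takes `depth` clocks to be recognised, and the filter needs a running clock, which is not guaranteed while a bus reset is active.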
Jochen <JFrensch@harmanbecker.com> wrote:

>We had an issue with a "ground bounce" (Spartan3 design) in the bank
>containing the PCI-interface, which 'triggered' the asynchronous reset
>of the PCI-core...
>
>Do you use async resets ?

Yes. The design uses async resets. But it doesn't seem to affect the
PCI core. The problem is always in the part with the state machines.
Jonathan Bromley <jonathan.bromley@MYCOMPANY.com> wrote:

>On Wed, 03 Sep 2008 22:33:14 GMT, nico@puntnl.niks (Nico Coesel)
>wrote:
>
>>In some specific type of PCs (mostly server chassis) the design quits
>>working right away or after a while. Some of the internal
>>statemachines quit even when the option 'Safe implementation' is
>>turned on. In most PCs (say 99.9%) the design works okay though.
>
>Random data point from the mediaeval era: I had almost identical
>symptoms with a video processing card, about 15 years ago, in
>ISA bus. I eventually tracked it down to glitches on the bus
>reset line, so narrow that all the old-style ISA cards missed
>it, but the fast-ish FPGAs on my board would sometimes see it.
>
>I don't suppose it's the same issue for you. But it's weird
>how this kind of thing rears its head from time to time.

Thanks, this sounds interesting.
On 2008-09-03, Nico Coesel <nico@puntnl.niks> wrote:
>
> In some specific type of PCs (mostly server chassis) the design quits
> working right away or after a while.

Many desktop-class systems don't check PCI parity at all. Maybe that's
changing as logic gets cheaper, but I know I implemented a target that
didn't output any parity at all which worked fine on the motherboards I
had at home.

It's possible that your "server" motherboards are checking parity, seeing
an error and doing something that you did not anticipate with your state
machine.

Of course with your PCI-PCIe bridge the problem should be isolated from the
motherboard, but I can't tell if it was involved when you saw the problem.

-- 
Ben Jackson AD7GD
<ben@ben.com>
http://www.ben.com/
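
For reference, the parity a checking host would expect is simple to compute: PCI's PAR signal carries even parity over AD[31:0] and C/BE#[3:0], so the number of ones across those 36 bits plus PAR is even. The Python sketch below only models that calculation for a single address or data phase; the function names are invented here, and it does not model the bus timing (PAR is driven one clock after the phase it covers).

```python
def pci_par(ad: int, cbe_n: int) -> int:
    """PAR bit for one address or data phase.

    Even parity over AD[31:0] and C/BE#[3:0]: PAR is 1 when those
    36 bits contain an odd number of ones, so the total becomes even.
    """
    bits = ((cbe_n & 0xF) << 32) | (ad & 0xFFFFFFFF)
    return bin(bits).count("1") & 1


def parity_ok(ad: int, cbe_n: int, par: int) -> bool:
    """True when the received PAR matches the value computed from AD and C/BE#."""
    return pci_par(ad, cbe_n) == (par & 1)


# One bit set in AD and none in C/BE#: odd count, so PAR must be 1.
print(pci_par(0x00000001, 0x0))
# A target that never drives PAR correctly fails this check about half the time.
print(parity_ok(0x00000001, 0x0, par=0))
```

A target that leaves PAR undriven or wrong, as in the case described above, would go unnoticed on a host that never checks and would raise PERR# on one that does.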
Ben Jackson <ben@ben.com> wrote:

>On 2008-09-03, Nico Coesel <nico@puntnl.niks> wrote:
>>
>> In some specific type of PCs (mostly server chassis) the design quits
>> working right away or after a while.
>
>Many desktop-class systems don't check PCI parity at all. Maybe that's
>changing as logic gets cheaper, but I know I implemented a target that
>didn't output any parity at all which worked fine on the motherboards I
>had at home.
>
>It's possible that your "server" motherboards are checking parity, seeing
>an error and doing something that you did not anticipate with your state
>machine.
>
>Of course with your PCI-PCIe bridge the problem should be isolated from the
>motherboard, but I can't tell if it was involved when you saw the problem.

Well, the PCI core is behaving just fine. It is the 'PCI client' side
which goes wrong in a particular area which has nothing to do with the
PCI core directly. But I guess it doesn't hurt to put a counter on the
PERR signal to make sure. Thanks for the hint.
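
A PERR counter of the kind mentioned here could behave roughly like the model below. This is a behavioural Python sketch, not the actual design: it assumes PERR# is sampled once per PCI clock, that it is active low, and that a saturating counter read back by software is good enough for diagnosis. The class name and the 8-bit width are arbitrary.

```python
class PerrCounter:
    """Behavioural model of a saturating counter on PERR# (active low)."""

    def __init__(self, width: int = 8):
        self.max_count = (1 << width) - 1  # counter stops here, never wraps
        self.count = 0

    def clock(self, perr_n: int) -> None:
        """Call once per PCI clock edge with the sampled PERR# level."""
        if perr_n == 0 and self.count < self.max_count:
            # Count every clock on which PERR# is seen asserted.
            self.count += 1


counter = PerrCounter(width=8)
for sample in [1, 1, 0, 1, 0, 0, 1]:  # PERR# idles high
    counter.clock(sample)
print(counter.count)
```

Counting asserted clocks rather than falling edges means back-to-back parity errors are not merged into one event, and saturating instead of wrapping keeps a long error burst from reading back as zero.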
On 4 Sep., 18:06, n...@puntnl.niks (Nico Coesel) wrote:
> Jochen <JFren...@harmanbecker.com> wrote:
> >We had an issue with a "ground bounce" (Spartan3 design) in the bank
> >containing the PCI-interface, which 'triggered' the asynchronous reset
> >of the PCI-core...
>
> >Do you use async resets ?
>
> Yes. The design uses async resets. But it doesn't seem to affect the
> PCI core. The problem is always in the part with the statemachines.

Well, have a look at
http://forums.xilinx.com/xlnx/blog/article?message.uid=12856
especially the chapter "Unreliable Sporadic Behaviour!"

/Jochen

P.S. Ken really knows what he is talking about!