FPGARelated.com
Forums

fifo or sdram bug?

Started by kaz July 30, 2015
On Thursday, 13 August 2015 21:00:07 UTC+2, kaz wrote:
> The fifo is not home grown. It is altera fifo core. We never discard well
> tried cores for home made work.
>
> DC fifo is built by Altera around dual ram but if (as in my case) the
> clock rates are predictable then one can control wr/rd pointers each in
> their clock domain without having to cross clock domains thus reducing
> risk and resource. That is my point and is well known design
> recommendation.
>
> The fifo in question is just 32 bit wide dc fifo from altera core with
> internal pipe set to 3, rd/wr protected, connected to clk 368.64 at write
> side enabled 2/3 and connected to 245.76 on the read side always enabled.
> Initially the read enable is delayed to wait for few words (even though
> it is protected).
>
> Timing is clean. I imagine the write pointer is working but the read
> pointer is toggling between 0 and 15 with two clock delays leading to
> samples 0,17,2,19 ...etc. Just a guess.
>
> I have put a ram to capture few data from this fifo in the field when
> problem occurs and I am awaiting results.
> ---------------------------------------
> Posted through http://www.FPGARelated.com
I have seen random/strange issues with Altera FIFOs in the past more than once (but still seldom). In those cases I replaced the FIFOs with "home grown" ones and the problems were solved. I never tracked that down in detail (maybe it was related to the clear signal in some way...) and I also cannot remember details (one or two clocks, same or different width in/out). I have used so many FIFOs (mostly successfully) over the years...

I would suggest looking at all inputs and outputs of the FIFO with SignalTap to see if it behaves correctly.

P.S.: Oh, I just read in your initial post that you cannot reproduce it in the lab...

P.P.S.: I googled around to find an old post of mine regarding this. I failed. But I found this:
https://www.altera.com/support/support-resources/knowledge-base/solutions/rd11182011_10.html
and this:
https://www.altera.com/support/support-resources/knowledge-base/solutions/rd02232015_507.html
(Maybe there are more...)

So I would suggest replacing the FIFOs with a home grown approach and seeing if this solves the issues in the field...

Regards,
Thomas
www.entner-electronics.com - Home of EEBlaster
Thanks Thomas. These two links could explain it well. My Quartus is an
older version, but I assume Altera have a bug in reporting their bug as well.
The link says the possibility of failure increases with speed and logic
size. Our speed of 368.64/245.76 MHz is certainly fast and the logic we
have is massive.

Altera says unpredictable behaviour in hardware without telling us what
sort of behaviour. I will certainly go for dual port ram without grey
counter or clock crossing.

Hello,

On Thursday, 13 August 2015 23:40:39 UTC+2, kaz wrote:
> Altera says unpredictable behaviour in hardware without telling as what
> sort of behaviour. I will certainly go for dual port ram without grey
> counter or clock crossing.
I have no experience with Altera, but I remember a "bug" a long time ago with Xilinx RAM, which I heard was in fact related to a weakness in the test routines used to identify production failures. Microsemi also had some RAM related issues about 2 years ago, where I believe the long routing of signals addressing the RAM was not correctly annotated, which might lead STA to report clean timing when the real timing was not clean.

I once had a problem with a design which behaved strangely in rare cases in the field, while the lab showed no problem for a long time, and it took very long to track the error down to a bug in external USB software. I therefore know how nasty it is to prove over and over again that the FPGA internal timing is fine for every clock crossing.

regards
Thomas
On Thursday, August 13, 2015 at 5:40:39 PM UTC-4, kaz wrote:
> Thanks Thomas. These two links could explain it well though my quartus is
> older version but I assume Altera have bug in reporting their bug as well.

What version of Quartus are you using? The one link says it applies only to 11.1 with a fix starting with 11.1 SP1. The other link does not specify a software version number but the report is dated just over one month ago, which suggests that the problem has resurfaced. Your best bet then would be to get on to Altera to see what the latest word is on a fix here.

> The link says the possibility of failure increases with speed and logic
> size.

I see no mention of that.

> I will certainly go for dual port ram without grey counter or clock crossing.

This doesn't make much sense... but anyway, it appears that there is anecdotal evidence of a problem with applying timing constraints that can cause the fifo you are using to fail, so you're likely getting to the root cause.

Kevin Jennings
On Thursday, August 13, 2015 at 5:40:39 PM UTC-4, kaz wrote:
> Thanks Thomas. These two links could explain it well though my quartus is
> older version but I assume Altera have bug in reporting their bug as well.

What version of Quartus are you using? The one report is specific to only 11.1 with a fix in 11.1 SP1. The other report is from a bit over a month ago and doesn't link it to a specific version but obviously the problem seems to have resurfaced.

> The link says the possibility of failure increases with speed and logic
> size.

No it doesn't.

> I will certainly go for dual port ram without grey counter or clock crossing.

This makes no sense... but in any case you now have anecdotal evidence of how incorrect timing constraints can result in a failure of the fifo that you are using, so you are likely close to a solution.

Kevin Jennings
> On Thursday, August 13, 2015 at 5:40:39 PM UTC-4, kaz wrote:
>> Thanks Thomas. These two links could explain it well though my quartus is
>> older version but I assume Altera have bug in reporting their bug as well.
>
> What version of Quartus are you using? The one report is specific to only
> 11.1 with a fix in 11.1 SP1. The other report is from a bit over a month
> ago and doesn't link it to a specific version but obviously the problem
> seems to have resurfaced.
>
>> The link says the possibility of failure increases with speed and logic
>> size.
>
> No it doesn't
Here is the copy/paste from the second link:

"The probability of a hardware failures increases as the logic utilization and DCFIFO clock rates increase."
>> I will certainly go for dual port ram without grey counter or clock
>> crossing.
>
> This makes no sense...
>
> Kevin Jennings
Kevin,

I will be interested to see why it doesn't make sense. Though the wr/rd clocks are asynchronous, the write rate equals the read rate on a regular basis (368.64 MHz enabled 2/3 of the time gives 245.76 MHz), so I don't need to worry about pointer/flag crossing. The fifo is 16 words deep. I wait till it is half full and then start reading non stop, each side running its own address generated in its own clock domain. It should have been dual port ram only in the first place.
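As a rough illustration of the scheme described above (not code from the thread), the following Python sketch models a 16-word RAM with independent write and read counters and tracks how far apart they drift. It assumes an ideal 3:2 clock ratio with no jitter, a write enable that is high for 2 cycles out of every 3, and a start flag that reaches the read side instantly; the function name `simulate` and the `read_phase` parameter are hypothetical. In real hardware the start flag itself still has to cross clock domains, which this model ignores.

```python
DEPTH = 16  # RAM depth in words, as described in the post


def simulate(read_phase, n_write_cycles=6000):
    """Return (min_fill, max_fill) observed once reading has started.

    Time unit: half a write-clock period, so write-clock edges fall every
    2 units (368.64 MHz) and read-clock edges every 3 units (245.76 MHz).
    read_phase is the assumed offset of the first read-clock edge.
    """
    events = []
    for k in range(n_write_cycles):
        if k % 3 != 2:  # write enable high for 2 cycles out of 3
            events.append((2 * k, "w"))
    t_end = 2 * n_write_cycles
    t = read_phase
    while t < t_end:
        events.append((t, "r"))
        t += 3
    events.sort()  # on a tie, "r" sorts before "w"

    wr = rd = 0       # free-running write and read word counts
    reading = False   # set once the RAM is half full, never cleared
    lo, hi = DEPTH, 0
    for _, kind in events:
        if kind == "w":
            wr += 1
            if wr - rd >= DEPTH // 2:
                reading = True  # idealised start flag (assumption)
        elif reading:
            rd += 1
        if reading:
            fill = wr - rd
            lo, hi = min(lo, fill), max(hi, fill)
    return lo, hi


if __name__ == "__main__":
    for phase in (0, 0.5, 1, 2, 2.5):
        print(phase, simulate(phase))
```

Under these assumptions the fill level should stay close to half full for any starting phase, so the two pointers never collide. Any long-term frequency offset between the clocks would break that, which is why the scheme only applies when both clocks derive from the same reference.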
On Friday, August 14, 2015 at 9:17:56 AM UTC-4, kaz wrote:
> > I will be interested to see why it doesn't make sense.
"...without grey counter or clock crossing" didn't make much sense since you still have two clocks and there will still be clock domain crossings. Those crossings will just occur between domains where there is a specified phase relationship which does open up alternative solutions over situations where you don't have such a relationship. It's not worth belaboring here, sounds like you know what you want to implement. Kevin
On 8/14/2015 9:58 AM, KJ wrote:
> On Friday, August 14, 2015 at 9:17:56 AM UTC-4, kaz wrote:
>>
>> I will be interested to see why it doesn't make sense.
>
> "...without grey counter or clock crossing" didn't make much sense since
> you still have two clocks and there will still be clock domain crossings.
> Those crossings will just occur between domains where there is a specified
> phase relationship which does open up alternative solutions over
> situations where you don't have such a relationship. It's not worth
> belaboring here, sounds like you know what you want to implement.
I'm not clear on how a clock domain crossing can have any timing restrictions. Clock domain crossings can be between async clocks which can have any phase relationship on any clock edge, varying with each cycle. So how can you possibly apply timing constraints to that?

--
Rick
On Friday, August 14, 2015 at 1:32:02 PM UTC-4, rickman wrote:
> On 8/14/2015 9:58 AM, KJ wrote:
> > On Friday, August 14, 2015 at 9:17:56 AM UTC-4, kaz wrote:
> >>
> >> I will be interested to see why it doesn't make sense.
> >
> > "...without grey counter or clock crossing" didn't make much sense since
> > you still have two clocks and there will still be clock domain crossings.
> > Those crossings will just occur between domains where there is a specified
> > phase relationship which does open up alternative solutions over
> > situations where you don't have such a relationship. It's not worth
> > belaboring here, sounds like you know what you want to implement.
>
> I'm not clear on how a clock domain crossing can have any timing
> restrictions. Clock domain crossings can be between async clocks which
> can have any phase relationship on any clock edge, varying with each
> cycle. So how can you possibly apply timing constraints to that?
Kaz's earlier posts indicated that the two clocks in question had a fixed phase relationship so he was going to try to take advantage of that in some fashion. So there are still multiple clock domains, therefore there are crossings, but the two clocks are not truly asynchronous (unless Kaz has been holding back on his description of the clocks).

If the clocks truly have a fixed phase relationship, then a failure that would occur only once in a bazillion lifetimes if the clocks had been asynchronous could tend to show up much more frequently, since the failing timing paths could be way more likely to be occurring on almost every clock cycle, not just when the clock phases drifted into just the wrong alignment.

Kevin
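To make the last point concrete, here is a small Python sketch (an illustration, not something posted in the thread). It lists the distinct delays between a write-clock edge and the next read-clock edge. It assumes ideal, jitter-free clocks with write period 2 and read period 3 in arbitrary units, and a hypothetical read phase offset of 1/2; the function name `launch_to_capture_gaps` and the slightly detuned period 3001/1000 used for the "unrelated" case are likewise made up for the example.

```python
from fractions import Fraction
import math


def launch_to_capture_gaps(wr_period, rd_period, rd_phase, n_edges):
    """Set of delays from each write edge to the first later read edge."""
    gaps = set()
    for k in range(n_edges):
        t = k * wr_period
        # index of the first read edge strictly after time t
        j = math.floor((t - rd_phase) / rd_period) + 1
        gaps.add(rd_phase + j * rd_period - t)
    return gaps


# Phase-locked 3:2 clocks: the same few alignments repeat forever.
locked = launch_to_capture_gaps(Fraction(2), Fraction(3), Fraction(1, 2), 3000)

# Slightly detuned read clock (assumed value): the alignment keeps sliding.
free = launch_to_capture_gaps(Fraction(2), Fraction(3001, 1000), Fraction(1, 2), 3000)

if __name__ == "__main__":
    print(len(locked), sorted(locked))
    print(len(free), min(free), max(free))
```

With locked clocks only a handful of alignments ever occur, so if one of them happens to violate setup or hold it is hit every few cycles. With drifting clocks the alignment sweeps through many values and a marginal one is hit only occasionally.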