Intel plans to tackle cosmic ray threat

Started by Symon April 8, 2008
Dear All, Austin in particular,
I saw this and thought of you!
Cheers, Syms.
http://news.bbc.co.uk/1/hi/technology/7335322.stm 


Symon,

Well, Cypress, Xilinx, IBM, and many others have made it no secret that
neutrons at sea level are causing upsets, and we have done something
about it (and presented the papers, and shown our results).

Intel has also been working very quietly on this, with much less press.

I suggest that if you are not thinking about single event effects, you
should be, and demanding your vendor show you the proof of their design
efforts in this regard.

Virtex 5 is (as of today), 144 FIT/Mb for the config bits, 95%
confidence interval from 100 to 200 FIT/Mb.  This is from our 400
devices located on mountain tops in France (31.029 Giga-bit-years of
test time, 35 events).
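
For anyone who wants to sanity-check an interval like that: the sketch below assumes the 95% interval is an exact Poisson interval on the 35 observed events, scaled by the 144 FIT/Mb point estimate. That is an assumption about the method, not something stated above, and the function names are purely illustrative.

```python
import math

def poisson_cdf(k, mu):
    """P(X <= k) for X ~ Poisson(mu), summed term by term."""
    term = math.exp(-mu)
    total = term
    for i in range(1, k + 1):
        term *= mu / i
        total += term
    return total

def bisect_decreasing(f, lo, hi, iters=200):
    """Find the root of a decreasing function f on [lo, hi] by bisection."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def poisson_interval(n, conf=0.95):
    """Exact two-sided interval for a Poisson mean, given n observed events."""
    alpha = 1 - conf
    if n == 0:
        lower = 0.0
    else:
        # Lower bound: P(X <= n-1 | mu) = 1 - alpha/2
        lower = bisect_decreasing(
            lambda mu: poisson_cdf(n - 1, mu) - (1 - alpha / 2), 0.0, float(n))
    # Upper bound: P(X <= n | mu) = alpha/2
    upper = bisect_decreasing(
        lambda mu: poisson_cdf(n, mu) - alpha / 2, float(n), 4.0 * n + 20.0)
    return lower, upper

EVENTS = 35          # from the post
POINT_FIT = 144.0    # FIT/Mb, from the post

low_events, high_events = poisson_interval(EVENTS)
low_fit = POINT_FIT * low_events / EVENTS
high_fit = POINT_FIT * high_events / EVENTS
print(f"95% interval: {low_fit:.0f} to {high_fit:.0f} FIT/Mb")
```

Under that assumption the scaled bounds land close to the quoted 100 to 200 FIT/Mb, so the width of the interval is what you would expect from counting statistics on 35 events.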

Compare this to a 65nm ASSP or ASIC, which is at least 1000 FIT/Mb or
1000 FIT/million gates(!).  Do nothing, and it gets worse.  Do
something, and it gets back to where it should be.  These numbers from
the SELSE II conference a few years back:  the industry numbers are
really a lot worse, but no one will admit it.
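
To put FIT/Mb figures into everyday terms, here is a rough conversion sketch. It relies only on the definition of FIT (failures per 10^9 device-hours). The 30 Mb memory size is a hypothetical number chosen for illustration, not a figure for any particular part, and the helper names are made up.

```python
HOURS_PER_YEAR = 8760
FIT_HOURS = 1e9  # one FIT is one failure per 10^9 device-hours

def device_fit(fit_per_mb, size_mb):
    """Total upset rate of a device, in FIT, for a memory of size_mb megabits."""
    return fit_per_mb * size_mb

def mean_years_between_upsets(fit):
    """Mean time between upsets, in years, for a given FIT rate."""
    return FIT_HOURS / fit / HOURS_PER_YEAR

SIZE_MB = 30  # hypothetical memory size, for illustration only

for label, rate in (("144 FIT/Mb", 144), ("1000 FIT/Mb", 1000)):
    years = mean_years_between_upsets(device_fit(rate, SIZE_MB))
    print(f"{label}: about {years:.1f} years between upsets per device")
```

With those assumed 30 Mb, the two rates work out to roughly 26 years versus under 4 years between upsets per device. Multiply by the number of units in the field and the difference becomes visible quickly.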

There is a reason why Xilinx FPGA devices are finding their way into
many high availability and high reliability applications: we are the
only choice -- there is no competition whatsoever.

Austin
"austin" <austin@xilinx.com> wrote in message 
news:ftg25m$p2m2@cnn.xsj.xilinx.com...
> Intel has also been working very quietly on this, with much less press.

Hi Austin,

I wondered what were your thoughts on their patent where "The cosmic ray detector [built into the device] is therefore designed to spot when rays have caused interference and then tell the chip to repeat the command." ? I guess in an FPGA it could trigger a readback to ensure the device was still correctly configured and/or issue a user logic reset.

Cheers, Syms.
Symon,

Well, that employee should be fired:  that is the stupidest thing I have
ever read.

It isn't even science -- detecting neutrons! Pure BS!  A neutron is an
uncharged particle, that goes through 10 meters of concrete before it
gets stopped.  Detecting one is just......stupid.....idiotic.....

(breathe in, breathe out.)

Their PR folks are probably going nuts on this one!

Was that an April 1 dateline?

Anyway, Intel is pretty savvy, and they are not standing still.  If you
use their parts, you need to request their Soft Error Effects roadshow.

It is only given under NDA, so although I know it exists, and I suspect
I know what is in it, I have never seen it.

I have seen IBM's "show" and they certainly have their act together.  As
do we.  IBM's "show" is under NDA, however, so I can't say anything
about its contents.

Our roadshow is available by request from your local friendly FAE, and
no NDA is required (why would we hide that we are the best?).

Remember:  per the JEDEC JESD89A standard, there are three ways to
characterize soft error effects.  Be sure to ask which ones were used,
and their degree of confidence.

If they won't share this with you (under NDA), then they are hiding
something, something very very bad.

Austin
Symon,

First of all, there is no such thing as a single particle detector.

Secondly, detecting the current spike (from a strike) requires a very
complex circuit, itself subject to spikes (I know, we designed them for
the USAF...)

Thirdly, Intel has done far more than this, and deserves better PR.

Perhaps they should fire the PR firm?

Austin

And,

Yes, in S3A, S3AN, S3D, V4, V5 we are able to either reconfigure on
detection of an upset, notify the user (and they decide what to do), or
in V4 and V5, correct the flipped bit without having to reconfigure (or
even go to the config flash/prom).

Basically, in our road show, it is detailed how the user needs to decide
what to do, and at what levels, in order to meet their availability and
reliability numbers.

Mitigation is part hardware, part system architecture, and part
software.  Depending on what you are doing, and how long you can
tolerate being "off-line" there are different solutions.

They are:
-just reconfigure, start fresh
-just fix the bit flip, continue on (as a flip does nothing 90% of the
time, and seldom causes anything to 'crash')
-fix the bit flip and reset or go back to a check point/known states
-use dual redundancy, and check for agreement (if a fault is not
tolerated, as in banking or accounting); repeat if there is no agreement
-use full triple modular redundancy (when it must be correct, and 100%
available), also scrub to fix bits that may flip so flips are not
allowed to accumulate

All methods are used by our customers, and they all work.  We have
reference designs and support for these models.  And they can be tested
by reconfiguring to flip bits while operating. One heck of a lot cheaper
than using a proton beam, or neutron beam .... and more complete (we
have folks who flip each bit, one by one, and prove their system meets
its requirements).
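
As a toy illustration of the triple modular redundancy, scrubbing, and flip-every-bit testing ideas above, here is a minimal software sketch. It models three redundant copies of a word as plain integers. It is not how the device hardware or any reference design works, and all names are hypothetical.

```python
def majority(a, b, c):
    """Bitwise 2-of-3 majority vote across three redundant copies."""
    return (a & b) | (a & c) | (b & c)

def scrub(copies):
    """Rewrite all three copies with the voted value so flips cannot accumulate."""
    voted = majority(*copies)
    return [voted, voted, voted]

def inject_single_flips(golden, width):
    """Flip each bit of each copy, one at a time; count flips the voter misses."""
    failures = 0
    for copy in range(3):
        for bit in range(width):
            copies = [golden, golden, golden]
            copies[copy] ^= 1 << bit      # simulate one upset
            if majority(*copies) != golden:
                failures += 1
    return failures

GOLDEN = 0xA5A5  # arbitrary 16-bit test pattern
print("single-flip failures:", inject_single_flips(GOLDEN, 16))

# Two flips on the same bit in two different copies defeat the voter,
# which is why scrubbing between upsets matters.
double_hit = majority(GOLDEN ^ 1, GOLDEN ^ 1, GOLDEN)
print("double flip masked:", double_hit == GOLDEN)
```

The exhaustive loop is the software analogue of flipping each bit one by one: every single upset is masked by the voter, while an accumulated second upset on the same bit is not.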

Austin
"austin" <austin@xilinx.com> wrote in message 
news:ftg4j2$pop1@cnn.xsj.xilinx.com...
> Symon,
>
> Well, that employee should be fired: that is the stupidest thing I have
> ever read.
>
> It isn't even science -- detecting neutrons! Pure BS! A neutron is an
> uncharged particle, that goes through 10 meters of concrete before it
> gets stopped. Detecting one is just......stupid.....idiotic.....

Austin,

Are you talking about the link I posted? I didn't see any reference to neutrons, am I missing something? Also, if what you say is true, that neutrons whizz through 10 meters of concrete, aren't you gonna be incredibly unlucky to get a direct neutron hit on a 45nm transistor? (BTW, a cursory web search would suggest some kind of boron based detector, which kinda makes sense as boron is used to absorb thermal neutrons in nuclear reactors. http://en.wikipedia.org/wiki/Neutron_detection)

My rudimentary knowledge of cosmic rays is that they are not neutrons but mainly protons (and a few alpha and beta particles). I would expect them to be more detectable. Whatever, I'm confused now...

Cheers, Syms.
At sea level,

93% of particles from the cosmic ray shower are neutrons, and 7% are
protons (see JEDEC JESD89A).

There are 12.9 per square cm, every hour, passing through everything
(for New York City, up to 25X more on mountain tops, 300X at 40K feet,
less at the equator, 10X at the poles...).
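
A quick back-of-envelope sketch of what that flux means for one chip. It uses the 12.9 per square cm per hour figure and the 25X and 300X multipliers quoted above. The 1 square cm die area is a hypothetical round number, and the names are illustrative only.

```python
NYC_FLUX_PER_CM2_HOUR = 12.9   # from the post (sea level, New York City)

# Multipliers quoted in the post; "up to" 25X on mountain tops.
LOCATION_FACTOR = {
    "New York City": 1.0,
    "mountain top (up to)": 25.0,
    "40,000 feet": 300.0,
}

def particles_per_day(area_cm2, factor=1.0):
    """Particles crossing a die of area_cm2 in 24 hours at a given flux factor."""
    return NYC_FLUX_PER_CM2_HOUR * factor * area_cm2 * 24

DIE_AREA_CM2 = 1.0  # hypothetical die area, for illustration only

for place, factor in LOCATION_FACTOR.items():
    count = particles_per_day(DIE_AREA_CM2, factor)
    print(f"{place}: about {count:.0f} particles per day")
```

Nearly all of those particles pass through without any effect; the upset rate depends on how often one of them interacts in just the wrong place, which is what the FIT numbers measure.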

There are also electrons, muons, pions, and a host of more exotic stuff,
but those either don't matter (do not affect anything), or they are
absorbed quickly, or decay (even a lone neutron decays in 11 minutes!).

So, like I said, that is the dumbest PR I have read.  It gets the first
prize for ignorance about soft error effects.

Some Real Science:

http://www.xilinx.com/support/documentation/white_papers/wp286.pdf

Austin
"austin" <austin@xilinx.com> wrote in message 
news:ftg9a5$p2n1@cnn.xsj.xilinx.com...
> At sea level,
>
> 93% of particles from the cosmic ray shower are neutrons, and 7% are
> protons (see JEDEC JESD89A).
>
> There are 12.9 per square cm, every hour, passing through everything
> (for New York City, up to 25X more on mountain tops, 300X at 40K feet,
> less at the equator, 10X at the poles...).
>
> There are also electrons, muons, pions, and a host of more exotic stuff,
> but those either don't matter (do not affect anything), or they are
> absorbed quickly, or decay (even a lone neutron decays in 11 minutes!).

Aha, thanks! Now I think I get most of it.

It would seem that the cosmic rays, which are charged particles, hurtle into the earth from all directions. They are made of protons mainly, with some alpha and beta particles. The earth's magnetic field means that there are more at the poles than at the equator. The cosmic rays are charged and so interact with the atmosphere a lot, and so very few reach the earth's surface. However, these energetic collisions in the atmosphere produce showers of neutrons. These uncharged particles don't interact with the atmosphere nearly as much as the cosmic rays, so can reach the surface more easily.

Ok, here's another question. As the uncharged neutrons don't interact with much, indeed you say they can go through 10 metres of concrete, I can't see why the highly interactive remaining protons aren't the real danger, even though they only comprise 7% of the total, not the 93% neutrons? Maybe none of the original protons reach the surface, but the 7% protons are produced by secondary neutron collisions?

Sorry to bombard you with questions!

Regards, Syms.
austin wrote:
> So, like I said, that is the dumbest PR I have read. It gets the first
> prize for ignorance about soft error effects.

Expecting quality in a PR document seems to be the triumph of hope over experience? These things start in the depths of a company, we assume largely accurate. Then that company's media liaison/managers work on it. Then the PR firm 'works' on it, and finally the publishing media's editors have a go. Like Chinese whispers, any semblance to the original is pure coincidence! ;)

-jg