TriX wrote:
“Supercomputers” are often used for modeling, and the models can run for hours or days. To avoid losing an entire run (which can be VERY expensive on a big machine) in the event of a crash, main memory (which can be a petabyte or more) is periodically dumped to disk in a “checkpoint” operation, and the machine can do nothing else until that process completes. Some years ago, scientists at Lawrence Livermore National Labs were getting data from long-running models with many checkpoints that was simply wrong. When the checkpoints were reviewed, it was finally determined that the data on disk was correct but was being corrupted by the occasional flipped bit in the disk read cache, caused by the cosmic-ray particles that are constantly bombarding us. Over a zillion iterations, that was enough to skew the data. The answer was to design storage with parity checking on both reads and writes. Occasional flipped bits happen all the time in semiconductor memory, but in more “normal” usage they are rarely if ever noticed unless there are lots of operations. That’s why ECC DRAM, which costs extra, is often used in heavily loaded servers but rarely seen in client machines.
Parity bits predated supercomputers. They were used on mag tapes as early as 1951 because flaking of the tapes' oxide layer was very common.
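Not from any of those machines, but as a rough illustration of what a per-word parity bit buys you, here is a minimal C sketch (names and values are my own): a parity bit is computed when the word is written, and re-checked when it is read back, so any single flipped bit is detected.

```c
#include <stdint.h>
#include <stdio.h>

/* Parity of a 64-bit word: 1 if the number of set bits is odd.
 * Storing this alongside the word makes the combined bit count even. */
static uint8_t parity64(uint64_t w)
{
    w ^= w >> 32;
    w ^= w >> 16;
    w ^= w >> 8;
    w ^= w >> 4;
    w ^= w >> 2;
    w ^= w >> 1;
    return (uint8_t)(w & 1u);
}

int main(void)
{
    uint64_t word = 0xDEADBEEFCAFEF00DULL;
    uint8_t stored_parity = parity64(word);   /* computed on write */

    word ^= 1ULL << 17;                       /* simulate a cosmic-ray bit flip */

    if (parity64(word) != stored_parity)      /* re-checked on read */
        puts("parity mismatch: single-bit error detected");
    return 0;
}
```

Plain parity only detects an odd number of flipped bits; it cannot correct them, which is why ECC schemes that can also repair single-bit errors cost extra hardware.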
The first Cray computers were all about speed and did not have parity checking. Seymour Cray said that the people who used the machines would be able to see if the results were incorrect, rerun the jobs that had errors, and still get their answers faster than they would if error correction were in use. The users' reaction was to run jobs three times and accept answers that occurred more than once.
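In effect the users were doing two-out-of-three voting by hand. A hypothetical sketch of that workaround in C, where run_job() stands in for the real computation:

```c
#include <stdio.h>

/* Stand-in for an expensive, possibly error-prone computation. */
static double run_job(int attempt)
{
    (void)attempt;
    return 42.0;   /* placeholder result */
}

int main(void)
{
    /* Run the same job three times and accept any answer that
     * appears at least twice; otherwise, rerun the whole thing. */
    double a = run_job(1), b = run_job(2), c = run_job(3);

    if (a == b || a == c)
        printf("accepted: %g\n", a);
    else if (b == c)
        printf("accepted: %g\n", b);
    else
        printf("no two runs agreed; rerun the job\n");
    return 0;
}
```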