
Archive for Thursday, August 26, 2010

KDHE restoring data after major computer crash

A computer crash that put thousands of state records out of reach for almost a month is close to being resolved.

August 26, 2010, 9:39 a.m. Updated August 26, 2010, 12:16 p.m.


The repair of a computer meltdown at a large state agency is nearing completion, but Kansans trying to access vital records, such as birth certificates, still face long waits, officials said Thursday.

Kansas Department of Health and Environment Secretary Roderick Bremby (left) and Brian Reagan, chief of marketing for Xiotech, spoke Thursday at a news conference about KDHE's computer system failure. Bremby said the computer services were nearly restored and Reagan said Xiotech, the vendor of the system that failed, would be footing the bill for the recovery.


“By and large, we are pretty much back to operational,” said Roderick Bremby, secretary of the Kansas Department of Health and Environment.

But Bremby said the Office of Vital Statistics’ system probably won’t be fully functional until Monday.

He said KDHE was reassigning 50 employees to help clear a backlog of 8,000 requests, mostly for birth certificates. That essentially doubles the work force in that area.

He said Kansans trying to access a birth certificate may face a four-week wait.

The department has had to retrieve records from its storage facilities in salt mines in central Kansas to handle requests.

The KDHE computer system started failing three weeks ago. Soon after, officials said, they were facing a “never event,” with 150 servers and 24.7 terabytes of data affected. There are 20 million records in the Office of Vital Statistics system alone.

So far, the problem has been diagnosed as a disk drive malfunction that cascaded throughout the system and a failure of what is called the storage area network, or SAN.

Xiotech, the Minnesota vendor of the storage area network, replaced the SAN on Aug. 13.

Xiotech announced it will pay for all costs of the system recovery, upgrades and overtime wages for KDHE employees who have been working on the problem. At a news conference, Bremby said the costs would be more than $700,000.

Brian Reagan, chief marketing officer for Xiotech, who attended the news conference, said what happened with the KDHE system had never happened before with a storage area network installed by Xiotech.

“We found a faulty disk drive within our system,” Reagan said. He said that systems in place to check the disk drive malfunction didn’t work.

“These do happen in technology, regrettably,” he said.

Xiotech, which has 2,000 customers worldwide, will conduct a “failure analysis” within the next week, Reagan said. Xiotech has contracts with other Kansas state governmental agencies.

Bremby said the system failure couldn’t have happened at a worse time because many people are seeking birth records with the start of school. The Kansas Department of Education has advised schools to be patient if parents are unable to provide birth certificates.

Bremby, who said the incident should prompt people to keep copies of vital records on hand, apologized to Kansans.

“This has been a long, hard road for us. We are truly sorry,” he said. “We don’t anticipate this ever happening again.”

Comments

Michael Stanclift 3 years, 7 months ago

Being in IT, I find Xiotech's explanation of what happened a bit suspect. I deal with drive failures in systems and in SANs on a regular basis. They happen. But I've NEVER seen one drive failure cause an entire array to go down and crash so hard.

Guess I can just cross Xiotech off my list of future SAN vendors.


gutenberg 3 years, 7 months ago

Roderick Bremby, secretary of the Kansas Department of Health and Environment, “This has been a long, hard road for us. We are truly sorry,” he said. “We don’t anticipate this ever happening again.”

..."We don't anticipate this ever happening again."???

It appears the computer failure wasn't anticipated in the first place.

Please remember children, there are only 2 kinds of computer users: Those who have lost data and those who are going to.

Your data is not secure from loss unless it is replicated at least twice more, with copies located in separate geographic locations.

Also, I recommend testing your data backup(s) on a regular basis by performing a complete restore and verification of data integrity and version confirmation.

I WOULD anticipate a computer system failure, especially with a newly installed, untested system.

Test and replace the batteries on the uninterruptible power supplies, too.

-Johannes Gensfleisch zur Laden zum Gutenberg, texting since 1439


jrlii 3 years, 7 months ago

The KDHE systems group has been working very, very hard to recover systems and data. It's as close to a heroic effort as I've ever seen in the Information Technology business in the 35 years I've been involved.


LJ Whirled 3 years, 7 months ago

Bremby = incompetent Sebelius lackey (time to resign)

