Saving our digital heritage

It is commonly agreed that the destruction of the ancient Library of Alexandria in Egypt was one of the most devastating losses of knowledge in all of civilization. Today, however, the digital information that drives our world and powers our economy is in many ways more susceptible to loss than the papyrus and parchment at Alexandria.

An estimated 44 percent of Web sites that existed in 1998 vanished without a trace within just one year. The average life span of a Web site is only 44 to 75 days. The gadgets that inform our lives – cellphones, computers, iPods, DVDs, memory cards – are filled with digital content. Yet the lifetime of these media is discouragingly short. Changing file and hardware formats, or computer viruses and hard-drive crashes, can render years of creativity inaccessible.

By contrast, the Library of Congress has in its care millions of printed works, some on stone or animal skin that have survived for centuries. The challenges underlying digital preservation led Congress in 2000 to appropriate $100 million for the Library of Congress to lead the National Digital Information Infrastructure and Preservation Program (NDIIPP), a growing partnership of 67 organizations charged with preserving and making accessible “born digital” information for current and future generations.

Some of the crucial programs funded by NDIIPP include the archiving of important Web sites such as those covering federal elections and Hurricane Katrina; public health, geospatial and map data; public television and foreign news broadcasts; and other vital born-digital content.

Unfortunately, the program is threatened. In February, Congress passed and the president signed legislation rescinding $47 million of the program’s approved funding. This jeopardizes an additional $37 million in matching, non-federal funds that partners would contribute as in-kind donations.

Projects that were to be funded include the preservation of important government records at the state level, such as legislative data and court records.

Another new project at risk, “Preserving Creative America,” is an initiative with commercial producers of creative content, such as digital film, music, photography, other forms of pictorial art and even video games.

Responsible preservation of our most valued digital data requires answers to key questions: Which data should we keep, and how should we keep it? How can we ensure that we can access it in five years, 100 years or 1,000 years? And who will pay for it?

The importance of developing sensible plans to preserve our digital heritage should not be underestimated. We can’t save it all, nor do we want to. It’s also critical that we agree on how to save this data. In the next 100 years, we will go through dozens of generations of computers and storage media, and our digital data will need to be transferred from one generation to the next by someone we trust to do it.

NDIIPP provides a good start, and Congress has an opportunity to restore $21.5 million requested by the Library of Congress to continue the program and sustain the partnerships needed to fulfill the critical task of preserving our nation’s important born-digital information.

It would be a national and a global shame if our most valuable born-digital knowledge, like the ancient holdings at Alexandria, were lost forever.