Project aims to preserve Internet history

Library of Congress begins technological task

Here’s the flip side of the digital age’s magic act: It’s also making information disappear.

“The digital history of this nation is imperiled by the very technology that is used to create it,” said Librarian of Congress James Billington.

So Friday the Library of Congress announced the next step in the effort to preserve that history: congressional approval of its plan for the National Digital Information Infrastructure and Preservation Program. The $100 million initiative was launched by Congress in 2000 to do for digital media what the world’s largest library already does for printed matter.

“This plan is the beginning of the creation of a national network to preserve the digital memory of our country,” said Laura Campbell, associate librarian for strategic initiatives.

The Library of Congress faces a daunting task. The popular Internet search engine Google claims access to about 3 billion Web pages, a massive collection of information in a constant state of change.

In fact, according to data from the Library of Congress, the average Web page has a lifespan of just a couple of months. Of all the Web content made in 1998, nearly half had disappeared by 1999.

“Much of what has been created is no longer accessible,” Billington said. “And much of what disappears is important, one-of-a-kind material that can never be recovered, but will be desperately looked for.”

The task of preserving the Internet and other digital information has already been undertaken by some private companies and nonprofit organizations, such as the Internet Archive, which has been saving old Web pages since 1996 and has preserved Web sites from the 2000 presidential campaign. Working with the Library of Congress, the archive also saved online information related to the Sept. 11, 2001, terrorist attacks.

And the library itself, beginning with the 107th Congress, started saving congressional Web pages. It already runs an online collection of 8 million U.S. historical artifacts such as presidential diaries and baseball cards.

On top of the $5 million the library received for planning the initiative in 2000, the plan approved Friday releases $20 million of funding to develop a system for evaluating and storing digital information.

Congress also will match up to $75 million of private fund-raising done by the library for the program. The matching funds were part of the original $100 million appropriation.

Now, with congressional approval of the plan, the library continues work with partners to develop the infrastructure needed to carry out data preservation, to build a network to manage the data, and to assemble people and set guidelines to help choose which data to save.