Library of Congress and National Science Foundation Announce Research Awards of $3 Million To Advance Digital Preservation

San Diego Supercomputer Center Involved in Two Key Projects

Published 05/10/2005

The Library of Congress National Digital Information Infrastructure and Preservation Program (NDIIPP) and the National Science Foundation today awarded ten university teams a total of $3 million to undertake pioneering research to support the long-term management of digital information. These awards are the outcome of a partnership between the two agencies to develop the first digital-preservation research grants program.

Digital materials with research or cultural value are at great risk of loss due to many factors, including dynamic change, insecure storage, and file format obsolescence. Millions of digital objects, such as Web sites documenting the early days of the Internet, have already been lost. With more and more information generated only in digital form, major improvements in digital preservation are crucial if the nation is to continue to rapidly accumulate and generate knowledge.

The Library is implementing a national digital preservation strategy, which involves building a collaborative network of partners to collect at-risk content and to develop new preservation tools. Research supported under these awards will help produce the technological breakthroughs needed to keep very large bodies of digital content securely preserved and accessible over many years.

The projects awarded today will explore challenging topics, such as preserving rich oceanographic data from hundreds of deep-sea submersible missions; automating methods to describe digital objects and place them in secure archival storage; testing how to preserve digital video when it is first created; and preserving complex three-dimensional digital content. All the projects are expected to produce study results in one year.

"This is a critical piece of our national strategy," said Associate Librarian for Strategic Initiatives Laura E. Campbell, who is leading NDIIPP for the Library of Congress. "These research awards will boost the collection and preservation work already under way in our partnership network and will generate practical outcomes that many others around the world can put to effective use."

Today's awards are the result of a National Science Foundation Program Solicitation, called "Digital Archiving and Long-Term Preservation" (DIGARCH), which had an application submission deadline of Sept. 14, 2004. The solicitation can be viewed online. All applications were subjected to a National Science Foundation peer-review process.

Following are the winning lead institutions, their partner institutions and the subject area of each project:

Institutions: Scripps Institution of Oceanography; Woods Hole Oceanographic Institution; and the San Diego Supercomputer Center.
Title: Multi-Institution Testbed for Scalable Digital Archiving.
Summary: These institutions will develop a multi-terabyte digital repository to preserve data from more than 1,600 oceanographic research projects. The collaborating institutions will test processes for automatic archival ingest (acquisition), metadata extraction, validation and access control, and will also explore methods for management of rights-protected data.

Institution: University of Maryland.
Title: Robust Technologies for Automated Ingestion and Long-Term Preservation of Digital Information.
Summary: This project will explore automated ingest and verification for distributed digital collections. It will also develop and test a preservation architecture that can "evolve gracefully" as technology changes and that is interoperable with different computer platforms.

Institution: Drexel University.
Title: Digital Engineering Archives.
Summary: This project will work with decades of three-dimensional computer-aided design (CAD) engineering design and production data that currently have very limited preservation options. Researchers will use international standards to convert complex design data into more readily preservable content and will use the results to educate the engineering community about three-dimensional data preservation options.

Institutions: San Diego Supercomputer Center; University of California, San Diego Libraries; and UCSD-TV.
Title: Digital Preservation Lifecycle Management: Building a Demonstration Prototype for the Preservation of Large Scale Multimedia Collections.
Summary: The project will demonstrate a preservation life cycle management process for video content. Researchers will develop and document a practical preservation process for mixed collections of both legacy and "born digital" video material.

Institution: University of Arizona.
Title: Investigating Data Provenance in the Context of New Product Design and Development.
Summary: This undertaking will investigate ways to automate metadata capture through an innovative partnership with Raytheon, a commercial defense and aerospace systems supplier. Methods to develop "self aware/self describing" production and design digital data will be explored.

Institution: University of Michigan.
Title: Incentives for Data Producers to Create Archive-Ready Data Sets.
Summary: The project will examine incentives for data producers to deposit "archive-ready" data sets. Focus will be on collaboration between producers and archives, including identification of a process for archives to adjust their deposit requirements to better suit producer needs.

Institution: Old Dominion University.
Title: Shared Infrastructure Preservation Models.
Summary: This project will evaluate existing shared Internet infrastructure elements (such as Simple Mail Transfer Protocol or SMTP) to determine if they are suitable for digital preservation purposes. Researchers will explore options to reduce digital preservation costs through use of cheap and widely deployed protocols.

Institution: University of Tennessee at Knoxville.
Title: Planning a Globally Accessible Archive of MODIS Data.
Summary: The intent of this project is to bring together leaders of the Moderate Resolution Imaging Spectroradiometer (MODIS) archive community with computer science researchers to discuss new distributed approaches to managing MODIS satellite data, which currently has a volume of about two petabytes.

Institution: University of North Carolina at Chapel Hill.
Title: Preserving Video Objects and Context: A Demonstration Project.
Summary: Development of rich descriptive terms and a process for applying them to digital objects is the focus of this study. Attention will also be given to demonstrating a cost-benefit methodology.

Institution: Johns Hopkins University.
Title: Securely Managing the Lifetime of Versions in Digital Archives.
Summary: This project will study technologies for secure deletion of information to protect personal privacy and provide a mechanism to ensure that no unwanted data is retained along with preserved data.

BACKGROUND
In December 2000 Congress authorized the Library of Congress to develop and execute a congressionally approved plan for a National Digital Information Infrastructure and Preservation Program. A $99.8 million congressional appropriation was made to establish the program. According to the conference report (H. Rept. 106-1033), "The overall plan should set forth a strategy for the Library of Congress, in collaboration with other federal and nonfederal entities, to identify a national network of libraries and other organizations with responsibilities for collecting digital materials that will provide access to and maintain those materials. … In addition to developing this strategy, the plan shall set forth, in concert with the Copyright Office, the policies, protocols and strategies for the long-term preservation of such materials, including the technological infrastructure required at the Library of Congress."

The legislation mandates that the Library work with federal entities such as the Secretary of Commerce, the director of the White House Office of Science and Technology Policy, the National Archives and Records Administration, the National Library of Medicine, the National Agricultural Library, the National Institute of Standards and Technology and "other federal, research and private libraries and institutions with expertise in telecommunications technology and electronic commerce policy." The goal is to build a network of committed partners working through a preservation architecture with defined roles and responsibilities.

The Library of Congress digital strategy is formulated in concert with a study, commissioned by the Librarian of Congress and undertaken by the National Research Council Computer Science and Telecommunications Board. "LC 21: A Digital Strategy for the Library of Congress" was issued July 26, 2000, and made several recommendations, including that the Library, working with other institutions, take the lead in the preservation and archiving of digital materials.

The complete text of "Preserving Our Digital Heritage: Plan for the National Digital Information Infrastructure and Preservation Program" is available at www.digitalpreservation.gov. This report includes an explanation of how the plan was developed, who the Library worked with to develop the plan and the key components of the digital preservation infrastructure. Congress approved the plan in December 2002.

The Library of Congress is the largest library in the world. Through its National Digital Library (NDL) Program, it is also one of the leading providers of noncommercial intellectual content on the Internet (www.loc.gov). The NDL Program's flagship American Memory project, in collaboration with other institutions nationwide, makes freely available more than 10 million American historical items.