Open Access Publications from the University of California


The California Digital Library supports the assembly and creative use of the world's scholarship and knowledge for the University of California libraries and the communities they serve.

In addition, the CDL provides tools that support the construction of online information services for research, teaching, and learning, including services that enable the UC libraries to effectively share their materials and provide greater access to digital content, including

California Digital Library

There are 71 publications in this collection, published between 2006 and 2023.
iPRES 2009: the Sixth International Conference on Preservation of Digital Objects (31)

Integrating Metadata Standards to Support Long-Term Preservation of Digital Assets: Developing Best Practices for Expressing Preservation Metadata in a Container Format

This paper explores the purpose and development of best practice guidelines for the use of preservation metadata as detailed in the PREMIS Data Dictionary for Preservation Metadata within documents conforming to the Metadata Encoding and Transmission Standard (METS). METS is an XML schema that provides a container format integrating various forms of metadata with digital objects or links to digital objects. Because of the flexibility of METS to serve many different functions within digital systems and to support many different metadata structures, integration guidelines will facilitate common practices among institutions. There is constant tension between tighter control over the METS package to support object exchange versus each implementation's unique preservation metadata requirements given the different contexts and implementation models among PREMIS implementers. The PREMIS in METS Guidelines serve primarily as a standard for submission and dissemination information packages. This paper details the issues encountered in using the standards together, and how the METS document changes as events pertaining to the lifecycle of digital assets are recorded for future preservation purposes. The guidelines have enabled the implementation of an exchange format and creation/validation tools based on the PREMIS in METS guidelines.

  • 1 supplemental PDF

Chronopolis: Preserving our Digital Heritage

The Chronopolis Digital Preservation Initiative, one of the Library of Congress' latest efforts to collect and preserve at-risk digital information, has completed its first year of service as a multi-member partnership to meet the archival needs of a wide range of cultural and social domains. In this paper we will explore the major themes within Chronopolis.

  • 1 supplemental PDF

Significant Properties, Authenticity, Provenance, Representation Information and OAIS Information

The term "Significant Properties" has been given a variety of definitions and used in various ways over the past several years. The relationship between Significant Properties and the OAIS term Representation Information has been a puzzle. This paper proposes a definition of Significant Properties which provides a way to clarify this relationship and indicates how the concept can be used in a coherent way. We believe that this approach is consistent with the actual use of the concept and does not invalidate the previous pieces of work but rather provides a clear and consistent view of the concept. It also links together Authenticity and Provenance which are also key concepts in digital preservation.

  • 1 supplemental PDF
28 more works
CDL Staff Publications (31)

Securing the Future of Federal Research: Mirroring as a Vital Scholarly Resource

The recent transition of US presidential administrations has raised awareness and concern regarding the continuity of access to federal research data.  These data are part of the vital public record of federally-funded research, and their continued availability is critically important to scientific integrity and advancement, governmental accountability, and informed public policy.  The portal was created in 2009 as a central repository of government research data, and currently hosts over 135,000 datasets.  This information is, according to the 2013 federal open data policy, “a valuable national resource and a strategic asset to the Federal Government, its partners, and the public.”  As such, it is imperative that these data are subject to effective long-term stewardship.  Best practice within the preservation community calls for redundancy, at both a technical and organizational level, as a primary strategy for higher preservation assurance.  Consequently, California Digital Library (CDL) and Code for Science & Society (CSS) collaborated with the development team on, a full dynamic mirror of holds descriptive metadata and links to the dataset copies of record on federal agency websites, as well as alternative links to local datamirror-managed replicas (41 TB), and soon, to other known copies that may emerge through the efforts of the national data rescue movement, in which CDL and CSS are active participants.  While instigated by recent political events, the stewardship provided by is merely an expression of prudent research data management that is clearly called for to ensure permanent access to the nation’s rich digital patrimony.

  • 1 supplemental file

Cobweb: Collaborative Collection Development for Web Archives

A presentation to staff of the California Digital Library, providing an update on Cobweb development progress as of January 2018.

Advancing Scholarship through Digital Critical Editions: Mark Twain Project Online

Digital critical editions hold the promise of supporting new scholarly research activities not previously possible or practical with print critical editions. This promise resides in the specific ability to integrate corpora, their associated editorial material and other related content into system architectures and data structures that exploit the strengths of the digital publishing environment. The challenge is not simply to create an online copy of the print publication but to provide the kind of resource that both eases and extends the research activities of scholars. Authoritative collections published online in this manner, and with the same rigor brought to the print publishing process, offer scholars: the ability to discover more elusive, granular pieces of information with greater facility; tighter, more obvious and more accessible connections between authoritative versions of texts, editorial matter and primary source material; and continually corrected and expanded "editions," no longer dependent upon the print lifecycle. This paper will explore these benefits and others as they are instantiated in the recently released Mark Twain Project Online (MTPO), created and published as a joint project of the Mark Twain Papers & Project at The Bancroft Library of UC Berkeley (the Papers), the University of California Press (UC Press), and the California Digital Library of the University of California (CDL). This current release of MTPO comprises more than twenty-three hundred letters written between 1853 and 1880; over twenty-eight thousand records of other letters with text not held by the Papers; and nearly one hundred facsimiles. It also makes available the many decades of archival research on the part of the editors at the Papers.
Of particular focus in this discussion will be several key features of the system which, despite the many challenges they presented in development, were felt to be essential pieces of a digital publication that could support scholarship in new and significant ways. Those features include facets, which create intellectual structure and support serendipity; advanced search, which provides a means for researchers to apply their own analytical frameworks; citation support functionality, which serves to secure and record the outcomes of research exploration; and complex displays of individual letters, which allow detailed inspection by collocating the pieces of the authoritative object. These features together maintain the integrity and stability of the collection, while concurrently allowing for fluidity in the continued expansion of the material. In this way, MTPO hopes to succeed as a digital critical edition that will support and extend the research activities of scholars.

28 more works
CDL Staff Presentations (1)

Integrating Multiple Platforms with OpenID Connect, A Shared Authentication Service

A conference presentation on OpenID Connect (OIDC) and similar single sign-on technologies: how they work, why you would want to use them, which projects currently use them, and how to integrate them into your own work. Presented at Open Repositories 2022, in Denver, CO, on June 8, 2022. Supplemental material includes a recording of the presentation, made after the conference, as well as a PowerPoint version of the slides.

  • 1 supplemental video
  • 1 supplemental ZIP
CDL and Partner Organizations - Project Publications (8)

Increasing discovery of archives: A project to provide better pathways to archival records in cultural heritage collections

This report summarizes input from participants in the "Increasing Discovery of Archives" workshop hosted by Shift Collective on December 14 and 15, 2021. It is intended to help inform the research and development phase of the National Finding Aid Network (NAFAN) project led by the California Digital Library (CDL).

Toward a National Archival Finding Aid Network - From Planning Initiative to Project and Program: An Action Plan

This action plan is a key deliverable of "Toward a National Finding Aid Network," a one-year planning initiative supported by the U.S. Institute of Museum and Library Services under the provisions of the Library Services and Technology Act (LSTA), administered in California by the State Librarian. The plan was prepared by a Task Force comprising representatives from the Core Partner group of aggregators, who contributed time between July and September 2019 to formulate and develop these recommendations. At the heart of the action plan are recommendations and guiding principles for the next steps in implementing a national-level finding aid network. The Task Force recommends a phased, incremental approach that moves this effort from a research and demonstration project to a program; is informed by a research agenda; and (from the beginning) includes work to establish business and governance models that fit the infrastructure and service model.

Summary of Research: Findings from the Building a National Archival Finding Aid Network Project

This report contextualizes and synthesizes the findings from all of OCLC’s research activities on the Building a National Finding Aid Network (NAFAN) grant, with a focus on how the findings relate to future phases of work on the NAFAN project. The findings indicate that there is significant value to be drawn from a national aggregation of archival description. They also identify challenges that must be overcome to build the community of participation that a national finding aid aggregation will require to be sustainable.

From 2020 to 2023, OCLC conducted research as a partner on Building a National Finding Aid Network (NAFAN), an IMLS-supported research and demonstration project to build the foundation for a national archival finding aid network to address the inconsistency and inequity of the current archival discovery landscape (LG-246349-OLS-20). The project was led by California Digital Library (CDL), with partners at OCLC, the University of Virginia Library, Shift Collective, and Chain Bridge Group.

5 more works