eScholarship
Open Access Publications from the University of California

Metadata Realities for Cyberinfrastructure: Data Authors as Metadata Creators

Abstract

As digital data creation technologies become more prevalent, data and metadata management are necessary to make data available, usable, sharable, and storable. Researchers in many scientific settings, however, have little experience or expertise in data and metadata management. In this dissertation, I explore the everyday data and metadata management practices of researchers through a multi-sited ethnographic study of metadata creation by researchers in the Center for Embedded Networked Sensing (CENS). In studying metadata practices, I focused on the ways that researchers document, describe, annotate, organize, and manage their data, both for their own use and the use of researchers outside of their project. This study illustrates how researchers within CENS rarely create documentation that is not directly tied to their own use of their data, and correspondingly, they rarely share data with users from outside of their immediate projects. From these observations, I develop a metadata typology with six components: metadata for data identity, data characteristics, data quality, data collection equipment, data collection methods, and data analysis methods. I use a framework of accountability to discuss the ways that metadata practices fit within social research settings. Metadata are situated in regimes of mutual accountability in which researchers learn what is important to document, what counts as sufficient documentation, and how documentation practices are to be accounted for in social research settings. Researchers work within social ontologies in which “metadata-for-data sharing” have very low visibility. As a consequence, when asked to create metadata descriptions of the data for a shared CENS metadata registry, researchers lack specific data users, and thus describe their data for members of their most likely “imagined public”: other researchers with shared research interests and methods.
I argue that the cyberinfrastructure vision of widespread data sharing is fundamentally misaligned with the realities of the day-to-day metadata practices of researchers in small-scale field sciences.
