Open Access Publications from the University of California

School of Information

UC Berkeley


The School of Information creates knowledge and advances practice wherever people interact with information and technology.

Our research explores the implications for individuals and society as information and digital technologies are increasingly embedded in all aspects of human experience. Our professional master’s degrees prepare students to design and build the systems that will shape the way humans live and interact in the future.

Our research and teaching are interconnected; both are urgent, because our understanding of the consequences for individuals and society of their interactions with information and machines remains critical, contentious, and inadequate.

There are 182 publications in this collection, published between 1972 and 2022.
Recent Work (52)

Location Management for Mobile Devices

Location-awareness, in the form of location information about clients and location-based services provided by servers, is becoming increasingly important for networked communications in general, and for wireless and mobile devices in particular. The current fragmented landscape of location concepts and location-awareness, however, is not suitable for handling location information on a Web scale. Providing users with mechanisms that let them control how they expose their location information, and thus how they share it with other users and with services, is a crucial step toward better location management for mobile devices. This paper presents a concept for representing location vocabularies, for matching and mapping them, and for using them to support better privacy for users of location-based services and better location sharing between users and services. The concept is based on a language for describing place name vocabularies, which we call "Place Markup Language (PlaceML)", and on various ways in which these vocabularies can be used in a location-aware infrastructure of networked devices.

Destination Services: Tourist media and networked places

Tourism exists in the interplay between places and stories. In making sense of travel, we are also making sense of ourselves and the world around us. Indeed, the global tourist industry produces places as “destinations” through stories and souvenirs. The audience for tourism stories has changed greatly with changes in technologies of communication and representation, one of the most radical being the introduction of networked media. With the rise of web-based services, tourist experiences have acquired a digital penumbra of content available in ever more formats and locations. This paper examines these technological changes and their potential consequences for digital storytelling, travel, and the production of destinations.

Practical Obscurity in the Digital Age: Public Records in the Private Sector

In this paper, I outline the legislative framework governing information privacy practices in the public and private sectors in the United States and, more narrowly, the state of California, with particular attention paid to criminal justice system information. I explore the relationship between the courts, which maintain public criminal records, and Corporate Data Brokers (CDBs), which aggregate and sell information from court records, as well as the accuracy and privacy of their systems. While legislation guiding the government's handling of information may need to be extended to the private sector, state governments have a role to play in improving their technology infrastructure to ensure that accurate, timely information is available in the public records. This is particularly important for the criminal justice system, the source of the data that brokers collect. In making this argument, I look at one state, Colorado, that did a great deal early on to improve its criminal records technology infrastructure.

Open Access Policy Deposits (134)

Decibel: The Relational Dataset Branching System.

As scientific endeavors and data analysis become increasingly collaborative, there is a need for data management systems that natively support the versioning or branching of datasets to enable concurrent analysis, cleaning, integration, manipulation, or curation of data across teams of individuals. Common practice for sharing and collaborating on datasets involves creating or storing multiple copies of the dataset, one for each stage of analysis, with no provenance information tracking the relationships between these datasets. This not only wastes storage but also makes it challenging to track and integrate modifications made by different users to the same dataset. In this paper, we introduce the Relational Dataset Branching System, Decibel, a new relational storage system with built-in version control designed to address these shortcomings. We present our initial design for Decibel and provide a thorough evaluation of three versioned storage engine designs that focus on efficient query processing with minimal storage overhead. We also develop an exhaustive benchmark to enable the rigorous testing of these and future versioned storage engine designs.

Web of microbes (WoM): a curated microbial exometabolomics database for linking chemistry and microbes.


As microbiome research becomes increasingly prevalent in the fields of human health, agriculture and biotechnology, there exists a need for a resource to better link organisms and environmental chemistries. Exometabolomics experiments now provide assertions of the metabolites present within specific environments and how the production and depletion of metabolites is linked to specific microbes. This information could be broadly useful, from comparing metabolites across environments, to predicting competition and exchange of metabolites between microbes, and to designing stable microbial consortia. Here, we introduce Web of Microbes (WoM; freely available at: ), the first exometabolomics data repository and visualization tool.


WoM provides manually curated, direct biochemical observations on the changes to metabolites in an environment after exposure to microorganisms. The web interface displays a number of key features: (1) the metabolites present in a control environment prior to inoculation or microbial activation, (2) heatmap-like displays showing metabolite increases or decreases resulting from microbial activities, (3) a metabolic web displaying the actions of multiple organisms on a specified metabolite pool, (4) metabolite interaction scores indicating an organism's interaction level with its environment, potential for metabolite exchange with other organisms and potential for competition with other organisms, and (5) downloadable datasets for integration with other types of -omics datasets.


We anticipate that Web of Microbes will be a useful tool for the greater research community by making available manually curated exometabolomics results that can be used to improve genome annotations and aid in the interpretation and construction of microbial communities.

Futzing and Moseying: Interviews with Professional Data Analysts on Exploration Practices.

We report the results of interviewing thirty professional data analysts working in a range of industrial, academic, and regulatory environments. This study focuses on participants' descriptions of exploratory activities and tool usage in these activities. Highlights of the findings include: distinctions between exploration as a precursor to more directed analysis versus truly open-ended exploration; confirmation that some analysts see "finding something interesting" as a valid goal of data exploration while others explicitly disavow this goal; conflicting views about the role of intelligent tools in data exploration; and pervasive use of visualization for exploration, but with only a subset using direct manipulation interfaces. These findings provide guidelines for future tool development, as well as a better understanding of the meaning of the term "data exploration" based on the words of practitioners "in the wild."
