Open Access Publications from the University of California

School of Information

Recent Work (UC Berkeley)

The UC Berkeley School of Information prepares leaders and pioneers solutions to the challenges of transforming information (ubiquitous, abundant, and evolving) into knowledge.

Through our Master's program, focused in five areas of concentration, we train students for careers as information professionals and entrepreneurs. Through our Ph.D. program and faculty research, we explore and develop solutions and shape policies that influence how people seek, use, and share information to create knowledge. Our work takes us wherever information touches lives, often bringing us into partnership with diverse disciplines, from law, sociology, and business to publishing, linguistics, and computer science.

Information-intensive innovation: the changing role of the private firm in the research ecosystem through the study of biosensed data


In a world instrumented with smart sensors and digital platforms, some of our most intimate and information-rich data are being collected and curated by private companies. The opportunities and risks derived from the potential knowledge carried within these data streams are undeniable, and the clustering of data within the private sector is challenging traditional data infrastructures and sites of research. The role of private industry in research and development (R&D) has traditionally been limited—especially for earlier-stage research—given the high risk, long time horizons, and uncertain returns on investment. However, the information economy has changed the way Silicon Valley and other technology firms operate their business models, with vast implications for how they innovate. Information drives competitive advantage, building on the emergence of technical infrastructure for collecting, storing, and analyzing data at scale.

Basic research and fundamental inquiry are becoming important innovation priorities for private firms as they tailor algorithms and customize services, and these changes have vast implications for individual privacy and research ethics. This information-intensive innovation introduces not simply a new source of inquiry but a shift in the possibilities and boundaries that enable market edge.

This shift challenges prior models of innovation and prompts a reconsideration of the role of the private firm within the research ecosystem—specifically with regard to Vannevar Bush’s Linear Model of Innovation and Donald Stokes’ Quadrant Model of Scientific Research. This change builds upon prior Silicon Valley innovation models outlined by AnnaLee Saxenian and Henry Chesbrough, but features additional key changes within industry R&D that are fundamentally reshaping the role of the firm within the broader ecosystem. No longer can industry be cast as a place equipped only to grapple with narrowly applied or developmental research, fully separated from or agnostic toward users, customers, and citizens. In this moment of information and data abundance, the research and innovation ecosystem is at an inflection point that could alter decades of embedded beliefs and assumptions about who should conduct research and ask fundamental questions, not to mention who should govern and grant access to research data.

This dissertation studies how the rise of data science infrastructure is changing the role of the private firm in the R&D ecosystem. The research works to understand how and under what conditions private sector firms are synthesizing user data (e.g., data picked up by sensors) internally and/or sharing it externally for research purposes. The dissertation looks specifically at applications of biosensed data for social, behavioral, health, or public health research. Qualitative and mixed methods are used to research, document, and examine practices through the lens of existing theoretical models of research and innovation. Historical frameworks are used to ground contemporary practices within a broader context.

This research presents three illustrative cases of firms that exemplify different strategies for adapting to the competitive pressures of information-intensive innovation: Lioness (a smart vibrator), Kinsa (a smart thermometer), and Basis (a smart watch). The findings describe how firms are working within the data and R&D landscape, and how new pressures are shaping emerging practices and strategies. They outline the changing definitional boundaries of research within the private firm, and evolving practices relating to knowledge sharing and research activities within the firms. The analysis also points to two key emerging challenges firms are coping with: how to grapple with research ethics, and the rise of secrecy practices that may impede the collaboration and research strategies implicit in information-intensive innovation.

Research is occurring at many levels within firms, breaking free of any traditional laboratory structure. Collaborations and data sharing with academics for mutually beneficial research partnerships are taking new, largely unstructured forms to meet rising demand and interest. There is fresh demand for new kinds of collaboration models derived from data sharing needs, and exploration into ways of leveraging research practices and incorporating academic research curiosity across firms.

This dissertation concludes by underscoring the importance of reconsidering the role of the firm within the broader R&D ecosystem, along with the attendant policy considerations. Programs to help structure and incentivize private/academic research collaborations should be considered, and private firms should evaluate their internal protocols and strategies in light of this changing landscape.

Context, Causality, and Information Flow: Implications for Privacy Engineering, Security, and Data Economics


The creators of technical infrastructure are under social and legal pressure to comply with expectations that can be difficult to translate into computational and business logics. This dissertation bridges this gap through three projects that focus on privacy engineering, information security, and data economics, respectively. These projects culminate in a new formal method for evaluating the strategic and tactical value of data: data games. This method relies on a core theoretical contribution building on the work of Shannon, Dretske, Pearl, Koller, and Nissenbaum: a definition of situated information flow as causal flow in the context of other causal relations and strategic choices.

The first project studies privacy engineering's use of Contextual Integrity theory (CI), which defines privacy as appropriate information flow according to norms specific to social contexts or spheres. Computer scientists using CI have innovated as they have implemented the theory and blended it with other traditions, such as context-aware computing. This survey examines the computer science literature using Contextual Integrity and discovers, among other results, that technical and social platforms spanning multiple social contexts challenge CI's current commitment to normative social spheres. Sociotechnical situations can and do defy social expectations with cross-context clashes, and privacy engineering needs its normative theories to acknowledge and address this fact.

This concern inspires the second project, which addresses the problem of building computational systems that comply with data flow and security restrictions such as those required by law. Many privacy and data protection policies stipulate restrictions on the flow of information based on that information's original source. We formalize this concept of privacy as Origin Privacy. This formalization shows how information flow security can be represented using causal modeling. Causal modeling of information security leads to general theorems about the limits of privacy by design, as well as a shared language for representing specific privacy concepts such as noninterference, differential privacy, and authorized disclosure.
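The causal view of information flow can be illustrated with a minimal sketch (our own illustration, not the dissertation's formalism): if a system is modeled as a directed causal graph, a noninterference-style origin-privacy check reduces to asking whether a public sink is causally downstream of a restricted origin. All node and function names here are hypothetical.

```python
# Illustrative sketch, assuming a system modeled as a directed causal graph.
from collections import deque

def reachable(edges, source):
    """Return all nodes causally downstream of `source` (including itself)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
    seen, queue = {source}, deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adj.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def violates_origin_policy(edges, restricted_origin, public_sink):
    """Noninterference-style check: the policy is violated if the public
    sink is causally downstream of the restricted origin."""
    return public_sink in reachable(edges, restricted_origin)

# A toy system: sensor data feeds an internal model whose output
# reaches an ad server, creating a flow from origin to sink.
system = [
    ("health_sensor", "internal_model"),
    ("internal_model", "ad_server"),  # this edge completes the flow
    ("clickstream", "ad_server"),
]
print(violates_origin_policy(system, "health_sensor", "ad_server"))  # True
```

Removing the `internal_model -> ad_server` edge severs the causal path, and the same check then reports no violation, which is the sense in which flow restrictions become graph properties.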

The third project uses the causal modeling of information flow to address gaps in the current theory of data economics. Like CI, privacy economics has focused on individual economic contexts and so has been unable to comprehend an information economy that relies on the flow of information across contexts. Data games, an adaptation of Multi-Agent Influence Diagrams for mechanism design, are used to model the well-known economic contexts of principal-agent contracts and price differentiation, as well as new contexts such as personalized expert services and data reuse. This work reveals that information flows are not goods but rather strategic resources, and that trade in information therefore involves market externalities.

Tensions of Data-Driven Reflection: A Case Study of Real-Time Emotional Biosensing


Biosensing displays, increasingly enrolled in emotional reflection, promise authoritative insight by presenting users’ emotions as discrete categories. Rather than having machines interpret emotions, we sought to explore an alternative in which users of emotional biosensing displays formed their own interpretations and felt comfortable critiquing the display. So we designed, implemented, and deployed, as a technology probe, an emotional biosensing display: Ripple, a shirt whose pattern changes color in response to the wearer’s skin conductance, which is associated with excitement. Seventeen participants wore Ripple over two days of daily life. While some participants appreciated the ‘physical connection’ Ripple provided between body and emotion, for others Ripple fostered insecurities about ‘how much’ feeling they had. Despite our design intentions, we found participants rarely questioned the display’s relation to their feelings. Using biopolitics to speculate on Ripple’s surprising authority, we highlight the ethical stakes of biosensory representations for sense of self and ways of feeling.

Emotional Biosensing: Exploring Critical Alternatives


Emotional biosensing is rising in daily life: Data and categories claim to know how people feel and suggest what they should do about it, while CSCW explores new biosensing possibilities. Prevalent approaches to emotional biosensing are too limited, focusing on the individual, optimization, and normative categorization. Conceptual shifts can help explore alternatives: toward materiality, from representation toward performativity, inter-action to intra-action, shifting biopolitics, and shifting affect/desire. We contribute (1) synthesizing wide-ranging conceptual lenses, providing analysis connecting them to emotional biosensing design, (2) analyzing selected design exemplars to apply these lenses to design research, and (3) offering our own recommendations for designers and design researchers. In particular, we suggest humility in knowledge claims with emotional biosensing, prioritizing care and affirmation over self-improvement, and exploring alternative desires. We call for critically questioning and generatively re-imagining the role of data in configuring sensing, feeling, ‘the good life,’ and everyday experience.

Strange and Unstable Fabrication


In the 1950s, a group of artists led by experimental composer John Cage actively engaged chance as a means to limit their control over the artworks they produced. These artists described a world filled with active and lively forces, from the sounds of rain to blemishes in paper, that could be harnessed in creative production to give rise to new aesthetics and cultivate new sensitivities to the everyday. This approach to making was not simply an act of creative expression but an active attempt at creative expansion—a way of submitting to a world of creative forces beyond the self for the sake of seeing, hearing, or feeling things anew. I use these practices as a lens to reflect on the way human-computer interaction (HCI) researchers think about and design for making, specifically as it relates to the present-day “maker movement.” I focus on how the design of digital fabrication systems, like 3D printers, could make room for creative forces beyond the maker, and why such modes of making are worth considering in HCI research. Since digital fabrication technologies have catalyzed the maker movement and are often described as key instruments for “democratizing” manufacturing, this project joins broader efforts to reflect on values in maker technology as a means of expanding the design space of digital fabrication in ways that could potentially increase the diversity of participants associated with the movement.

By weaving through post-anthropocentric theories of the new materialisms, design practice, art history, and HCI, I contribute a theory of making that accounts for the creative capacity of nonhumans, as well as design tactics for making room for nonhuman forces in the design of digital fabrication systems. I argue that nonhumans exert material-semiotic forces upon makers that shape their perspectives on stuff and culture in tandem. I then suggest that tools that are both strange and unstable create a space for makers to perceive and work with these forces in ways that honor the unique life and agency of nonhuman matter. As a whole, this work adds dimensionality to HCI’s existing focus on making as a process of self-expression by suggesting new design territories in fabrication design at the crossings between critical reflection and creative production. I close this work by speculating on how tools that trade control, mastery, and predictability for chance, compromise, labor, and risk could become valuable within a broader landscape of making.

Feed Subscription Management


An increasing number of data sources and services are made available on the Web, and in many cases these information sources are, or easily could be, made available as feeds. However, the more data sources and services are exposed through feed-based services, the more necessary it becomes to manage and share those services, so that users and uses of those services can build on the foundation of an open and decentralized architecture. In this paper we present the Feed Subscription Management (FSM) architecture, a model for managing feed subscriptions that supports structured feed subscriptions. FSM makes it easy to build services that manage feed-based services: clients can create, change, and delete feed subscriptions, and subscriptions can be shared across users and/or devices. Our main reason for focusing on feeds is that we see feeds as a good foundation for an ecosystem of RESTful services, and thus our architectural approach revolves around the idea of modeling services as interactions with feeds.
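The core idea, that subscriptions are themselves managed as entries in a feed, can be sketched minimally. This is our own illustration, not the paper's API: the class, method names, and URLs below are hypothetical stand-ins for the create/change/delete/share interactions FSM describes.

```python
# Illustrative sketch, assuming subscriptions are entries in a feed of
# their own, so managing and sharing them are ordinary feed interactions.
import uuid

class SubscriptionFeed:
    """A feed whose entries are subscriptions to other feeds."""

    def __init__(self):
        self.entries = {}  # entry id -> subscription metadata

    def create(self, feed_url, filters=None):
        """Add a (possibly structured/filtered) subscription; return its id."""
        entry_id = str(uuid.uuid4())
        self.entries[entry_id] = {"feed": feed_url, "filters": filters or {}}
        return entry_id

    def change(self, entry_id, **updates):
        """Modify an existing subscription entry in place."""
        self.entries[entry_id].update(updates)

    def delete(self, entry_id):
        """Remove a subscription entry."""
        del self.entries[entry_id]

    def share(self):
        """Export the subscription set for import by another user or device."""
        return list(self.entries.values())

mgr = SubscriptionFeed()
sid = mgr.create("http://example.org/sensors.atom", filters={"category": "weather"})
mgr.change(sid, filters={"category": "traffic"})
print(len(mgr.share()))  # 1
```

In the architecture itself these operations would be HTTP interactions with a feed endpoint rather than in-memory method calls; the sketch only shows how subscription management reduces to entry creation, modification, deletion, and export.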

From RESTful Services to RDF: Connecting the Web and the Semantic Web


RESTful services on the Web expose information through retrievable resource representations that are self-describing, and through the way these resources are interlinked via the hyperlinks found in those representations. This basic design of RESTful services means that extracting the most useful information from a service requires understanding the service's representations: both their semantics in describing a resource, and their semantics in describing that resource's linkage with other resources. Based on the Resource Linking Language (ReLL), this paper describes a framework for how RESTful services can be described, and how these descriptions can then be used to harvest information from those services. Building on this framework, a layered model of RESTful service semantics allows a service's information to be represented in RDF/OWL. Because REST is based on the linkage between resources, the same model can be used to aggregate and interlink multiple services, extracting RDF data from sets of RESTful services.

Improving Federal Spending Transparency: Lessons Drawn from


Information about federal spending can affect national priorities and government processes, having impacts on society that few other data sources can rival. However, building effective open government and transparency mechanisms poses a host of technical, conceptual, and organizational challenges. To help guide the development and deployment of future federal spending transparency systems, this paper explores the effectiveness of accountability measures deployed for the American Recovery and Reinvestment Act of 2009 ("Recovery Act" or "ARRA"). The Recovery Act provides an excellent case study for better understanding the general requirements for designing and deploying "Open Government" systems. In this document, we show specific examples of how problems in data quality, service design, and systems architecture limit the effectiveness of ARRA's promised transparency. We also highlight organizational and incentive issues that impede transparency, and point to design processes as well as general architectural principles needed to better realize the goals advanced by open government advocates.

Privacy Issues of the W3C Geolocation API


The W3C's Geolocation API may rapidly standardize the transmission of location information on the Web, but, in dealing with such sensitive information, it also raises serious privacy concerns. We analyze the manner and extent to which the current W3C Geolocation API provides mechanisms to support privacy. We propose a privacy framework for the consideration of location information and use it to evaluate the W3C Geolocation API, both the specification and its use in the wild, and recommend some modifications to the API as a result of our analysis.