eScholarship
Open Access Publications from the University of California

School of Information

Open Access Policy Deposits

This series is automatically populated with publications deposited by UC Berkeley School of Information researchers in accordance with the University of California’s open access policies. For more information see Open Access Policy Deposits and the UC Publication Management System.

Geographic microtargeting of social assistance with high-resolution poverty maps.

(2022)

Hundreds of millions of poor families receive some form of targeted social assistance. Many of these antipoverty programs involve some degree of geographic targeting, where aid is prioritized to the poorest regions of the country. However, policy makers in many low-resource settings lack the disaggregated poverty data required to make effective geographic targeting decisions. Using several independent datasets from Nigeria, this paper shows that high-resolution poverty maps, constructed by applying machine learning algorithms to satellite imagery and other nontraditional geospatial data, can improve the targeting of government cash transfers to poor families. Specifically, we find that geographic targeting relying on machine learning-based poverty maps can reduce errors of exclusion and inclusion relative to geographic targeting based on recent nationally representative survey data. This result holds for antipoverty programs that target both the poor and the extreme poor and for initiatives of varying sizes. We also find no evidence that machine learning-based maps increase targeting disparities by demographic groups, such as gender or religion. Based in part on these findings, the Government of Nigeria used this approach to geographically target emergency cash transfers in response to the COVID-19 pandemic.

Mobile phone data reveal the effects of violence on internal displacement in Afghanistan.

(2022)

Nearly 50 million people globally have been internally displaced due to conflict, persecution and human rights violations. However, the study of internally displaced persons, and the design of policies to assist them, is complicated by the fact that these people are often underrepresented in surveys and official statistics. We develop an approach to measure the impact of violence on internal displacement using anonymized high-frequency mobile phone data. We use this approach to quantify the short- and long-term impacts of violence on internal displacement in Afghanistan, a country that has experienced decades of conflict. Our results highlight how displacement depends on the nature of violence. High-casualty events, and violence involving the Islamic State, cause the most displacement. Provincial capitals act as magnets for people fleeing violence in outlying areas. Our work illustrates the potential for non-traditional data sources to facilitate research and policymaking in conflict settings.

Machine learning and phone data can improve targeting of humanitarian aid.

(2022)

The COVID-19 pandemic has devastated many low- and middle-income countries, causing widespread food insecurity and a sharp decline in living standards. In response to this crisis, governments and humanitarian organizations worldwide have distributed social assistance to more than 1.5 billion people. Targeting is a central challenge in administering these programmes: it remains a difficult task to rapidly identify those with the greatest need given available data. Here we show that data from mobile phone networks can improve the targeting of humanitarian assistance. Our approach uses traditional survey data to train machine-learning algorithms to recognize patterns of poverty in mobile phone data; the trained algorithms can then prioritize aid to the poorest mobile subscribers. We evaluate this approach by studying a flagship emergency cash transfer program in Togo, which used these algorithms to disburse millions of US dollars' worth of COVID-19 relief aid. Our analysis compares outcomes (including exclusion errors, total social welfare and measures of fairness) under different targeting regimes. Relative to the geographic targeting options considered by the Government of Togo, the machine-learning approach reduces errors of exclusion by 4-21%. Relative to methods requiring a comprehensive social registry (a hypothetical exercise; no such registry exists in Togo), the machine-learning approach increases exclusion errors by 9-35%. These results highlight the potential for new data sources to complement traditional methods for targeting humanitarian assistance, particularly in crisis settings in which traditional data are missing or out of date.
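
As a rough illustration of the exclusion-error metric mentioned in this abstract, the sketch below assumes one common definition: with a fixed quota of beneficiaries, the exclusion error is the share of the truly poorest people who are not among those ranked poorest by a predicted score. The function name, the quota-based setup, and the toy numbers are all hypothetical and are not taken from the paper, whose exact definition may differ.

```python
def exclusion_error(true_consumption, predicted_score, quota):
    """Share of the truly poorest `quota` people missed by the targeting rule.

    Assumed definition (not taken from the paper): the programme can serve
    exactly `quota` people, the "truly poor" are the `quota` people with the
    lowest true consumption, and the targeting rule selects the `quota`
    people with the lowest predicted score.
    """
    n = len(true_consumption)
    if len(predicted_score) != n:
        raise ValueError("inputs must have the same length")
    if not 0 < quota <= n:
        raise ValueError("quota must be between 1 and the population size")

    # Indices of the people who should receive aid (lowest true consumption).
    truly_poor = set(sorted(range(n), key=lambda i: true_consumption[i])[:quota])
    # Indices of the people the rule would select (lowest predicted score).
    targeted = set(sorted(range(n), key=lambda i: predicted_score[i])[:quota])

    # Truly poor people who were left out of the programme.
    missed = truly_poor - targeted
    return len(missed) / quota


# Toy data, invented purely for illustration.
true_values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
predictions = [1.5, 5.5, 2.5, 3.5, 4.5, 6.5]
error = exclusion_error(true_values, predictions, quota=3)
```

Under this assumed setup, two targeting rules can be compared by computing the metric for each rule's predicted scores against the same survey-measured consumption and the same quota.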

Copy theory

(2022)

In information science, writing, printing, telecommunication, and digital computing have been central concerns because of their ability to distribute information. Overlooked is the obvious fact that these technologies fashion copies, and the theorizing of copies has been neglected. We may think a copy is the same as what it copies, but no two objects can really be the same. “The same” means similar enough as an acceptable substitute for some purpose. The differences between usefully similar things are also often important, in forensic analysis, for example, or inferential processes. Status as a copy is only one form of relationship between objects, but copies are so integral to information science that they demand a theory. Indeed, theorizing copies provides a basis for a more complete and unified view of information science.

Microestimates of wealth for all low- and middle-income countries.

(2022)

Many critical policy decisions, from strategic investments to the allocation of humanitarian aid, rely on data about the geographic distribution of wealth and poverty. Yet many poverty maps are out of date or exist only at very coarse levels of granularity. Here we develop microestimates of the relative wealth and poverty of the populated surface of all 135 low- and middle-income countries (LMICs) at 2.4 km resolution. The estimates are built by applying machine-learning algorithms to vast and heterogeneous data from satellites, mobile phone networks, and topographic maps, as well as aggregated and deidentified connectivity data from Facebook. We train and calibrate the estimates using nationally representative household survey data from 56 LMICs and then validate their accuracy using four independent sources of household survey data from 18 countries. We also provide confidence intervals for each microestimate to facilitate responsible downstream use. These estimates are provided free for public use in the hope that they enable targeted policy response to the COVID-19 pandemic, provide the foundation for insights into the causes and consequences of economic development and growth, and promote responsible policymaking in support of sustainable development.

Public mobility data enables COVID-19 forecasting and management at local and global scales.

(2021)

Policymakers everywhere are working to determine the set of restrictions that will effectively contain the spread of COVID-19 without excessively stifling economic activity. We show that publicly available data on human mobility (collected by Google, Facebook, and other providers) can be used to evaluate the effectiveness of non-pharmaceutical interventions (NPIs) and forecast the spread of COVID-19. This approach uses simple and transparent statistical models to estimate the effect of NPIs on mobility, and basic machine learning methods to generate 10-day forecasts of COVID-19 cases. An advantage of the approach is that it involves minimal assumptions about disease dynamics and requires only publicly available data. We evaluate this approach using local and regional data from China, France, Italy, South Korea, and the United States, as well as national data from 80 countries around the world. We find that NPIs are associated with significant reductions in human mobility, and that changes in mobility can be used to forecast COVID-19 infections.

Reconfiguring Diversity and Inclusion for AI Ethics

(2021)

Activists, journalists, and scholars have long raised critical questions about the relationship between diversity, representation, and structural exclusions in data-intensive tools and services. We build on work mapping the emergent landscape of corporate AI ethics to center one outcome of these conversations: the incorporation of diversity and inclusion in corporate AI ethics activities. Using interpretive document analysis and analytic tools from the values in design field, we examine how diversity and inclusion work is articulated in public-facing AI ethics documentation produced by three companies that create application and services layer AI infrastructure: Google, Microsoft, and Salesforce. We find that as these documents make diversity and inclusion more tractable to engineers and technical clients, they reveal a drift away from civil rights justifications that resonates with the managerialization of diversity by corporations in the mid-1980s. The focus on technical artifacts, such as diverse and inclusive datasets, and the replacement of equity with fairness make ethical work more actionable for everyday practitioners. Yet, they appear divorced from broader DEI initiatives and other subject matter experts that could provide needed context to nuanced decisions around how to operationalize these values. Finally, diversity and inclusion, as configured by engineering logic, positions firms not as ethics owners but as ethics allocators; while these companies claim expertise on AI ethics, the responsibility of defining who diversity and inclusion are meant to protect and where it is relevant is pushed downstream to their customers.

Micro-Estimates of Wealth for all Low- and Middle-Income Countries

(2021)

Many critical policy decisions, from strategic investments to the allocation of humanitarian aid, rely on data about the geographic distribution of wealth and poverty. Yet many poverty maps are out of date or exist only at very coarse levels of granularity. Here we develop the first micro-estimates of wealth and poverty that cover the populated surface of all 135 low- and middle-income countries (LMICs) at 2.4 km resolution. The estimates are built by applying machine learning algorithms to vast and heterogeneous data from satellites, mobile phone networks, and topographic maps, as well as aggregated and de-identified connectivity data from Facebook. We train and calibrate the estimates using nationally representative household survey data from 56 LMICs, then validate their accuracy using four independent sources of household survey data from 18 countries. We also provide confidence intervals for each micro-estimate to facilitate responsible downstream use. These estimates are provided free for public use in the hope that they enable targeted policy response to the COVID-19 pandemic, provide the foundation for new insights into the causes and consequences of economic development and growth, and promote responsible policymaking in support of the Sustainable Development Goals.