eScholarship
Open Access Publications from the University of California

Department of Informatics

UC Irvine

Open Access Policy Deposits

This series is automatically populated with publications deposited by UC Irvine Donald Bren School of Information and Computer Sciences Department of Informatics researchers in accordance with the University of California’s open access policies. For more information see Open Access Policy Deposits and the UC Publication Management System.

Introduction

(2017)

This issue of the California Journal of Politics and Policy is produced in collaboration with the Kem C. Gardner Policy Institute at the David Eccles School of Business at the University of Utah. Drawing on the expertise of political scientists, economists, and practitioners from 13 western states, the reports summarize each state’s budget for the 2017‒2018 fiscal year. These reports delve into how the states’ financial well-being affected legislation and, just as importantly, how legislation affected the states’ financial well-being. While most states seem to be financially sound, if not thriving, each report highlights possible threats in the coming years, whether they be political, economic, or natural concerns. One theme across this year’s budget reports is how the 2016 election of President Donald Trump has affected legislation and the fiscal health of the states. A second theme in the budget papers is the need to plan for the next recession.

Reconciling the contrasting narratives on the environmental impact of large language models.

(2024)

The recent proliferation of large language models (LLMs) has led to divergent narratives about their environmental impacts. Some studies highlight the substantial carbon footprint of training and using LLMs, while others argue that LLMs can lead to more sustainable alternatives to current practices. We reconcile these narratives by presenting a comparative assessment of the environmental impact of LLMs vs. human labor, examining their relative efficiency across energy consumption, carbon emissions, water usage, and cost. Our findings reveal that, while LLMs have substantial environmental impacts, their relative impacts can be dramatically lower than human labor in the U.S. for the same output, with human-to-LLM ratios ranging from 40 to 150 for a typical LLM (Llama-3-70B) and from 1200 to 4400 for a lightweight LLM (Gemma-2B-it). While the human-to-LLM ratios are smaller with regard to human labor in India, these ratios are still between 3.4 and 16 for a typical LLM and between 130 and 1100 for a lightweight LLM. Despite the potential benefit of switching from humans to LLMs, economic factors may cause widespread adoption to lead to a new combination of human and LLM-driven work, rather than a simple substitution. Moreover, the growing size of LLMs may substantially increase their energy consumption and lower the human-to-LLM ratios, highlighting the need for further research to ensure the sustainability and efficiency of LLMs.

Xylem: An Energy-efficient, Globally Redistributive, Financial Infrastructure Using Proof-by-Location

(2024)

The Proof-of-Work algorithm that underlies Bitcoin, Ethereum w, and many other cryptocurrencies is well known for its energy-intensive requirements. The Proof-of-Stake algorithm that underlies Ethereum and various other cryptocurrencies is less impactful environmentally, but it has a second, looming issue: the problem of wealth inequality. We have developed an alternative to Proof-of-Work and Proof-of-Stake, called Proof-by-Location, that has the potential to address both of these issues. This article describes Proof-by-Location and a financial platform called Xylem that is based on it. This platform seeks to distribute transaction fees to billions of cryptocurrency “Notaries” around the world (essentially, anyone with a smartphone), who work together to establish a distributed consensus about financial transactions. In this article, we demonstrate that this platform can scale to more than 3.9 trillion transactions per year (more than triple the number of digital payments per year currently occurring). We show a reduction of electricity usage per transaction of 99.9999914% compared to Bitcoin, 99.999905% compared to Ethereum w, 99.83% compared to Ethereum, and 95.9% compared to the Visa financial services company. We demonstrate that this platform would have a redistributive rather than consolidatory effect on wealth compared to any of these platforms, leading to a source of income for more than 1 billion people around the world, including more than 110 million in the bottom 10th to 20th percentile by income, with income for that group equivalent to 8.8 million full-time jobs. Finally, this currency provides a positive, non-compulsory mechanism for shaping human habitation patterns in ways that can slow global biodiversity loss and enable ecological restoration. Using Xylem as a global financial infrastructure could lead to significantly better social and environmental outcomes than existing financial platforms.

Intergenerational effects of a casino-funded family transfer program on educational outcomes in an American Indian community.

(2024)

Cash transfer policies have been widely discussed as mechanisms to curb intergenerational transmission of socioeconomic disadvantage. In this paper, we take advantage of a large casino-funded family transfer program introduced in a Southeastern American Indian Tribe to generate difference-in-difference estimates of the link between children's cash transfer exposure and third-grade math and reading test scores of their offspring. Here we show greater math (0.25 standard deviation [SD], p = .0148, 95% Confidence Interval [CI]: 0.05, 0.45) and reading (0.28 SD, p = .0066, 95% CI: 0.08, 0.49) scores among American Indian students whose mother was exposed ten years longer than other American Indian students to the cash transfer during her childhood (or relative to the non-American Indian student referent group). Exploratory analyses find that a mother's decision to pursue higher education and delay fertility appears to explain some, but not all, of the relation between cash transfers and children's test scores. In this rural population, large cash transfers have the potential to reduce intergenerational cycles of poverty-related educational outcomes.

Closing the gap between open source and commercial large language models for medical evidence summarization.

(2024)

Large language models (LLMs) hold great promise in summarizing medical evidence. Most recent studies focus on the application of proprietary LLMs. Using proprietary LLMs introduces multiple risk factors, including a lack of transparency and vendor dependency. While open-source LLMs allow better transparency and customization, their performance falls short of the proprietary ones. In this study, we investigated to what extent fine-tuning open-source LLMs can further improve their performance. Utilizing a benchmark dataset, MedReview, consisting of 8161 pairs of systematic reviews and summaries, we fine-tuned three widely used open-source LLMs, namely PRIMERA, LongT5, and Llama-2. Overall, the performance of all open-source models improved after fine-tuning. The performance of fine-tuned LongT5 is close to that of GPT-3.5 with zero-shot settings. Furthermore, smaller fine-tuned models sometimes even demonstrated superior performance compared to larger zero-shot models. These trends of improvement appeared in both a human evaluation and a larger-scale GPT-4-simulated evaluation.

Hidden flaws behind expert-level accuracy of multimodal GPT-4 vision in medicine.

(2024)

Recent studies indicate that Generative Pre-trained Transformer 4 with Vision (GPT-4V) outperforms human physicians in medical challenge tasks. However, these evaluations primarily focused on the accuracy of multi-choice questions alone. Our study extends the current scope by conducting a comprehensive analysis of GPT-4V's rationales of image comprehension, recall of medical knowledge, and step-by-step multimodal reasoning when solving New England Journal of Medicine (NEJM) Image Challenges, an imaging quiz designed to test the knowledge and diagnostic capabilities of medical professionals. Evaluation results confirmed that GPT-4V performs comparably to human physicians regarding multi-choice accuracy (81.6% vs. 77.8%). GPT-4V also performs well in cases where physicians answer incorrectly, with over 78% accuracy. However, we discovered that GPT-4V frequently presents flawed rationales in cases where it makes the correct final choices (35.5%), most prominently in image comprehension (27.2%). Regardless of GPT-4V's high accuracy in multi-choice questions, our findings emphasize the necessity for further in-depth evaluations of its rationales before integrating such multimodal AI models into clinical workflows.

Linguistic Features of Secondary School Writing: Can Natural Language Processing Shine a Light on Differences by Sex, English Language Status, or Higher Scoring Essays?

(2024)

This article provides three major contributions to the literature: we provide granular information on the development of student argumentative writing across secondary school; we replicate the MacArthur et al. model of Natural Language Processing (NLP) writing features that predict quality with a younger group of students; and we examine the differences for students across language status. In our study, we sought to find the average levels of text length, cohesion, connectives, syntactic complexity, and word-level complexity in this sample across Grades 7-12 by sex, by English learner status, and for essays scoring above and below the median holistic score. Mean levels of variables by grade suggest a developmental progression with respect to text length, with the text length increasing with grade level, but the other variables in the model were fairly stable. Sex did not seem to affect the model in meaningful ways beyond the increased fluency of women writers. We saw text length and word-level differences between initially designated and redesignated bilingual students compared to their English-only peers. Finally, we saw that the model works better with our higher scoring essays and is less effective at explaining the lower scoring essays.

DermaVision

(2024)

Approximately 10 million people in the United States suffer from domestic violence annually, with 4 out of 10 cases affecting people of color. Traditional coloration guides remain the primary forensic strategy to evaluate bruise injuries, but they are highly subjective and inaccurate for monitoring bruises. Additionally, this approach fails to consider bruise pigmentation in darker skin tones, and the results of this qualitative method vary by the medical professional conducting the inspection. There is a need for reliable, quantitative bruise information across all skin tones that can be utilized in both medicine and justice. DermaVision aims to address this need by designing a portable multi-spectral camera to quantitatively analyze bruises in diverse skin tones. By correlating the reflective spectra of a bruise with its age and healing progression, our camera will provide an accurate timeline for when bruises occur irrespective of patients’ skin color. This technology will assist forensics and medical professionals in improving their analyses and treatments and can provide valuable, admissible evidence in courts. For validation purposes, a prototype imaging device has been developed to gather preliminary clinical data in collaboration with the University of California, Irvine Trauma Center. The team remains motivated to build a secure and reliable imaging tool that can serve the diverse population of domestic violence survivors. Faculty advisor: Professor Elliot Botvinick.

Variability in the Integration of Peers in a Multi-site Digital Mental Health Innovation Project.

(2024)

Peer support specialists (peers) who have the lived experience of, and are in recovery from, mental health challenges are increasingly being integrated into mental health care as a reimbursable service across the US. This study describes the ways peers were integrated into Help@Hand, a multi-site innovation project that engaged peers throughout efforts to develop and offer digital mental health interventions across counties/cities (sites) in California. Using a mixed methods design, we collected quantitative data via quarterly online surveys, and qualitative data via semi-annual semi-structured phone interviews with key informants from Help@Hand sites. Quantitative data were summarized as descriptive findings and qualitative data from interviews were analyzed using rapid qualitative analysis methods. In the final analytic phase, interview quotes were used to illustrate the complex realities underlying quantitative responses. A total of 117 quarterly surveys and 46 semi-annual interviews were completed by key informants from 14 sites between September 2020 and January 2023. Peers were integrated across diverse activities for support and implementation of digital mental health interventions, including development of training and educational materials (78.6% of sites), community outreach (64.3%), technology testing (85.7%), technology piloting (90.9%), digital literacy training (71.4%), device distribution (63.6%), technical assistance (72.7%), and cross-site collaboration (66.7%). Peer-engaged activities shifted over time, reflecting project phases. Peer-provided digital literacy training and technology-related support were key ingredients for project implementations. This study indicates the wide range of ways peers can be integrated into digital mental health intervention implementations. Considering contextual readiness for peer integration may enhance their engagement in programmatic activities.