eScholarship
Open Access Publications from the University of California

UC Santa Barbara

UC Santa Barbara Electronic Theses and Dissertations

Sparse and Low-rank Matrix Decomposition – Application in Finance

(2024)

The field of machine learning is witnessing a rapid expansion in the literature that explores techniques and applications of sparse and low-rank matrix decompositions. Typically formulated as an optimization problem involving nuclear norm minimization, this paradigm offers computational efficiency and robust statistical recovery guarantees, contrasting with the NP-hard nature of rank-based objectives. This thesis dedicates attention to the development of new methodology (Chapter 2) and also its application to finance (Chapter 3), as described below. Chapter 1 furnishes the necessary background and conducts a comprehensive survey of the related literature.

Chapter 2 concerns dimensionality reduction methods such as principal component analysis (PCA) and factor analysis, which are central to many problems in data science. There are, however, serious and well-understood challenges to finding robust low-dimensional approximations for data with significant heteroskedastic noise. This chapter introduces a relaxed version of Minimum Trace Factor Analysis (MTFA), a convex optimization method with roots dating back to the work of Ledermann in 1940. This relaxation is particularly effective at not overfitting to heteroskedastic perturbations and addresses the commonly cited Heywood cases in factor analysis and the recently identified "curse of ill-conditioning" for existing spectral methods. We provide theoretical guarantees on the accuracy of the resulting low-rank subspace and on the convergence rate of the proposed algorithm for computing it. We develop a number of interesting connections to existing methods, including Hetero PCA, Lasso, and Soft-Impute, to fill an important gap in the already large literature on low-rank matrix estimation. Numerical experiments benchmark our results against several recent proposals for dealing with heteroskedastic noise.

In Chapter 3, we shift focus to factor analysis applied to security returns. Traditionally, commercially successful factor analysis relies on fundamental models, despite a rich academic literature exploring statistical models. Traditional statistical approaches like PCA and maximum likelihood exhibit success but suffer from drawbacks, such as a lack of robustness and insensitivity to narrow factors. To address these limitations, we propose convex optimization methods inspired by the techniques from Chapter 2. These methods aim to decompose a security return covariance matrix into its low-rank and sparse components. The low-rank component captures broad factors affecting most securities, while the sparse component accounts for narrow factors and security-specific effects. We illustrate the efficacy of this approach by measuring the variance forecasting accuracy of a low-rank plus sparse covariance matrix estimator through simulations and an empirical analysis of global equity data, showcasing improvements over PCA-based methods.
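
As a rough illustration of the low-rank plus sparse idea, the sketch below splits a covariance matrix into a positive semidefinite low-rank part and a sparse part by block coordinate descent on a trace-penalized, l1-penalized least-squares objective. This is a generic heuristic chosen for brevity, not the estimator developed in the thesis; the function names, the penalties `lam` and `mu`, and the simulated two-factor data are all assumptions made for the example.

```python
import numpy as np

def soft_threshold(x, t):
    """Entrywise soft-thresholding, the proximal map of t * |x|."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def objective(sigma, L, S, lam, mu):
    """0.5*||sigma - L - S||_F^2 + lam*trace(L) + mu*||offdiag(S)||_1."""
    off = S - np.diag(np.diag(S))
    fit = 0.5 * np.linalg.norm(sigma - L - S, "fro") ** 2
    return fit + lam * np.trace(L) + mu * np.abs(off).sum()

def low_rank_plus_sparse(sigma, lam, mu, n_iter=100):
    """Alternate exact minimization over L (PSD) and S (sparse off-diagonal)."""
    p = sigma.shape[0]
    L = np.zeros((p, p))
    S = np.zeros((p, p))
    history = []
    for _ in range(n_iter):
        # L-step: shift eigenvalues of (sigma - S) down by lam, clip at zero.
        w, V = np.linalg.eigh(sigma - S)
        L = (V * np.maximum(w - lam, 0.0)) @ V.T
        # S-step: soft-threshold off-diagonal residuals; the diagonal
        # (security-specific variance) is left unpenalized.
        R = sigma - L
        S = soft_threshold(R, mu)
        np.fill_diagonal(S, np.diag(R))
        history.append(objective(sigma, L, S, lam, mu))
    return L, S, history

# Simulated covariance: two broad factors, specific variances, one narrow link.
rng = np.random.default_rng(0)
B = rng.normal(size=(20, 2))
sigma = B @ B.T + np.diag(rng.uniform(0.5, 1.5, size=20))
sigma[0, 1] += 0.3
sigma[1, 0] += 0.3

L, S, history = low_rank_plus_sparse(sigma, lam=0.5, mu=0.1)
print("final objective:", history[-1])
```

Because each step minimizes the objective exactly over one block, the recorded objective values should not increase from one iteration to the next. How `lam` and `mu` are chosen matters a great deal in practice and is not addressed here.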

Unveiling Covert Threats: Towards Physically Safe and Transparent AI Systems

(2023)

This thesis examines the ethical implications of artificial intelligence through the lens of physical safety. It scrutinizes the various ways AI systems can instigate unsafe behavior in users, emphasizing the underexplored domain of covertly unsafe language. To improve the safety-related reasoning ability of large language models, we propose FARM to systematically generate rationales attributed to credible sources for physical safety scenarios. Lastly, we close with a broader discussion on AI transparency, delineating its differing research threads and associated considerations such as safety, and call for a human-centered approach to evaluating future research, centering on the foundational debate of whether humans should trust intelligent systems.

Effort Towards the Total Synthesis of Portimine A

(2023)

Natural products are naturally occurring compounds that possess a wide range of useful biological activities. Their intricate structures and activities are often the inspiration behind the design of medicinally significant molecules and new methodologies for performing chemical transformations. Through total synthesis of natural products, organic chemists discover how complex molecules behave in flasks and develop creative ways to build them. This dissertation describes findings and challenges from the effort toward the total synthesis of portimine A, a complex marine natural product that is a highly selective inducer of apoptosis in several cancer cells.

The first part of this dissertation details a new method for creating contiguous quaternary and tertiary carbon stereocenters through remote stereocontrol in the Ireland-Claisen rearrangement. This new method enables stereocontrol in the rearrangement through a distant acetonide as an internal stereocontrol unit instead of transferring chirality from a chiral secondary alcohol, providing rapid access to an extensively functionalized framework.

The second part of this dissertation describes the efficacy of enyne metathesis in constructing the cyclohexene diene motif commonly present in members of the cyclic imine family. This strategy is elegantly staged by the preceding remote Ireland-Claisen rearrangement and is a convenient way to gain access to conjugated cyclic dienes.

The third part of this dissertation demonstrates the unprecedented use of N-heterocyclic carbene catalysis in complex macrocyclization. This strategy, which has considerable potential for further optimization, is a useful method for large-membered macrocyclization bearing various functionalities. The most recent discoveries on the total synthesis of portimine A following the successful macrocyclization are also discussed, and the remainder of the synthesis is in progress.

Adaptive Sequential Decision Making: Bandit Optimization and Active Learning

(2023)

Deep neural networks usually have many hyperparameters that need to be tuned. Modern material design problems usually require material scientists to sequentially select processing parameters and conduct experiments to observe material performance. To save privacy cost, a learning system needs to carefully choose which queries to answer under the differential privacy framework. To train a robot under video guidance, engineers need to carefully choose video samples for training. However, in all of these cases, people cannot observe the performance of unselected actions, and the experimental cost can be huge. These two challenges hinder efficient neural network training, new material design, privacy protection, and robot training, and they call for action. In this thesis, I present my research on optimization, bandits, and active learning under the adaptive sequential decision making framework. My algorithms are able to solve black box function optimization without the curse of dimensionality, achieve no regret under function class misspecification, reduce privacy cost under the differential privacy framework, and significantly reduce video sample complexity for robot training. All of them come with theoretical or empirical analysis.

Towards Bridging the Divide: Enhancing Understanding of Digital Inequity

(2023)

The Internet has become crucial for communication, education, commerce, and civic engagement, but not everyone has equal opportunities to benefit from it, leading to digital inequity. This inequity stems from various aspects of Internet access, such as availability, quality, and affordability. Policymakers and stakeholders must understand the presence and extent of digital inequity to develop strategies that can bridge the gaps and ensure equal Internet access for all.

Acquiring relevant data that sheds light on all aspects of digital inequity is imperative for building a complete understanding of the issue. Unfortunately, such data is currently either non-existent or too noisy to be of any use. Policymakers in the US have long relied on imprecise data obtained either from the Federal Communications Commission or through crowdsourced network measurements to estimate the availability and quality of Internet services in different regions, and to allocate funding accordingly to improve Internet access. However, due to the limitations of these datasets, funding initiatives that rely on them may not achieve their intended objectives. Additionally, there are no publicly available sources of data that can provide accurate information on the cost of Internet access across the nation. As a result, it is extremely challenging to understand Internet affordability and how it contributes to digital inequity.

This dissertation aims to address these challenges in several ways. Firstly, we characterize existing Internet access datasets to gain insights into current digital inequity trends. Additionally, we develop methodology and tools that can provide comprehensive data on various dimensions of digital inequity. Leveraging our solutions, we enhance the usability of crowdsourced network measurements to better understand Internet quality. Moreover, we curate multiple novel datasets that provide insights into Internet availability and affordability nationwide. This work is crucial in helping policymakers and organizations make informed decisions to address digital inequity and create a more equitable digital society.

La Mafia Global: Global Capitalism and the Struggle against Hyper-Incarceration

(2023)

This dissertation focuses on the links between global capitalism, the hyper-incarceration of poor and racialized working-class communities, and surplus humanity. It explores the social control mechanisms used against poor communities in Southern California. In an effort to draw out the links between the micro-, meso-, and macro-levels of analysis, I undertake a macro-analysis of the crisis of global capitalism by examining existing data and then turn to 37 interviews with self-identified activists, immigrants, homeless individuals, formerly incarcerated and system-impacted people, and street vendors, all as part of a three-year ethnographic approach. I show how the above participants are part of a social control mechanism of surveillance, policing, and criminalization – systems that funnel people into the prison system and that form part of what Robinson calls the global police state. Specifically, I look at Robinson’s (2020) militarized accumulation and accumulation by repression in an effort to show how transnational capital is more and more dependent on hyper-incarceration as a means of capital accumulation worldwide. The dissertation calls for a systemic upheaval and a revolution that rallies for the abolition of the prison–industrial complex and the criminal injustice system. In addition, the final chapter provides a strong critique of identitarian paradigms and argues that these paradigms lack a critique of and struggle against global capitalism.

Molecular Beam Epitaxy of β-Ga2O3: Growth and Doping

(2023)

Given growing worldwide energy consumption and global warming, efficient power electronic devices are essential for minimizing losses during power conversion. Ultrawide bandgap semiconductors offer the high breakdown voltages needed for efficient power conversion. In particular, β-Ga2O3 is considered a promising ultrawide bandgap semiconductor material for next-generation power electronics due to its large bandgap (4.8 eV) and breakdown field (8 MV/cm). The availability of melt-based growth methods enables manufacturing of extremely high-quality single-crystal bulk substrates.

Optimizing the growth orientation is critical for achieving high-quality β-Ga2O3 epitaxial films. In this study, epitaxial growth of β-Ga2O3 films on (110) substrates was performed via plasma-assisted molecular beam epitaxy (PAMBE). Atomic force microscopy (AFM) scans show a very low RMS roughness of 0.08 nm for the surface of the as-received (110) substrates. High-resolution X-ray diffraction measurements reveal a 2.5 nm/min growth rate of β-Ga2O3 films on (110) substrates under conventional PAMBE growth conditions (~700 ℃), which is comparable to that on (010) substrates. The surface morphology of the β-Ga2O3 epitaxial films is smooth and has a similar dependence on Ga flux as (010) growth. However, the (110) plane does not tend to show a well-defined step-terrace structure, in spite of the appearance of (110) facets in growth of (010) β-Ga2O3. Indium-catalyzed growth was also shown to raise the growth rate of (110) β-Ga2O3 to as much as 4.5 nm/min and the maximum growth temperature to as much as 900 ℃.

Continuous Si doping of β-Ga2O3 epitaxial films grown by PAMBE was achieved by using a valved effusion cell for the elemental Si source. Secondary ion mass spectroscopy (SIMS) results show that the Si doping profiles in β-Ga2O3 are flat and have sharp turn-on and turn-off depth profiles. The Si doping concentration could be controlled by either varying the cell temperature or changing the valve aperture of the Si effusion cell. Additionally, high crystal quality and smooth surface morphologies were confirmed for Si-doped β-Ga2O3 epitaxial films grown on (010) and (001) substrates. A Si-doped (001) β-Ga2O3 epitaxial film showed an electron mobility of 67 cm²/Vs at a Hall concentration of 3.0×10¹⁸ cm⁻³.

β-Ga2O3 epitaxial films grown by PAMBE show outstanding crystal quality. However, residual nitrogen in the oxygen gas source results in nitrogen incorporation into the films. Since nitrogen is a deep acceptor in the β-Ga2O3 materials system, its incorporation affects the transport properties of β-Ga2O3 films. To quantify the incorporation level, nitrogen in β-Ga2O3 films was measured by SIMS with a low nitrogen detection limit. The PAMBE-grown β-Ga2O3 epitaxial films showed a nitrogen concentration of 1.0×10¹⁷ cm⁻³ under both conventional MBE growth and MOCATAXY growth. To prevent nitrogen incorporation, a pure ozone source was used as the oxygen source for the growth of β-Ga2O3. The ozone concentration was raised to as much as 80% by adding a recirculating line between the MBE and the pure ozone generator.

Koopman Representations in Control

(2023)

The Koopman operator describes the time-evolution of scalar-valued functions under the action of a dynamical system. These functions are called observables, and their evolution is always linear, even if the underlying dynamical system is nonlinear. The linearity of the Koopman operator framework is attractive to both dynamical systems theorists who study the spectral properties of these operators as well as to control theorists who leverage linearity to simplify control design. Recently, the theory of Koopman representations has emerged, with researchers gradually exploring the benefits of alternate, potentially nonlinear ways of representing these systems. In this thesis, we explore three distinct ways of representing the Koopman operator and explore their application to control design.

In the first part of this dissertation, we develop the mathematical underpinnings of Koopman representation theory. The evolution from the state-space representation of a dynamical system to the Koopman operator is described, and its spectral content is explored. Next, nonlinear representations of the Koopman operator and its extension to systems with input are described. Finally, we introduce some of the numerical approximation schemes for the representations that are used in this thesis. This chapter is meant to give the reader the mathematical background necessary to appreciate the results presented in the remaining chapters.

In the second part of this dissertation, we demonstrate a linear representation of the Koopman operator which fully leverages spectral objects such as eigenfunctions and eigenvalues. The eigenfunctions are special observables which evolve under the action of the Koopman operator via multiplication by a complex scalar, the eigenvalue. This is analogous to the eigenvectors and eigenvalues of a linear transformation. A collection of eigenfunctions forms a finite-dimensional, linear representation of a dynamical system, and their evolution spans a Koopman-invariant subspace. Finding this finite-dimensional representation allows well-developed linear systems methodologies, such as spectral analysis and linear optimal control, to be applied to nonlinear systems. In this work, we introduce a deep learning architecture that learns the Koopman eigenfunctions of a dynamical system from data and constructs the resulting finite-dimensional, linear representation of the Koopman operator. In numerical examples, the eigenfunctions learned using this framework exhibit a predictive performance superior to popular fixed-basis methods such as Extended Dynamic Mode Decomposition (EDMD). Finally, we extend the architecture to controlled dynamical systems by simultaneously learning the eigenfunctions of the natural dynamics with special system-decoupling observables on the inputs. Numerical examples show that the linear predictors obtained in this way can be readily used to design controllers that act directly on the Koopman modes of the system.
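
To make the idea of a finite-dimensional linear representation concrete, here is a minimal sketch of the fixed-basis EDMD baseline mentioned above, not the deep learning architecture of the thesis. It assumes a toy discrete-time system, chosen for the example, whose observables (x1, x2, x1^2) happen to span a Koopman-invariant subspace, so a least-squares fit recovers the Koopman matrix and its eigenvalues essentially exactly. The system, the parameters `a`, `b`, `c`, and the dictionary `lift` are all illustrative assumptions.

```python
import numpy as np

a, b, c = 0.9, 0.5, 0.3  # assumed parameters of the toy system

def step(x):
    """One step of the nonlinear map: x1 -> a*x1, x2 -> b*x2 + c*x1^2."""
    x1, x2 = x
    return np.array([a * x1, b * x2 + c * x1 ** 2])

def lift(x):
    """Dictionary of observables; it is closed under `step` for this system."""
    x1, x2 = x
    return np.array([x1, x2, x1 ** 2])

# Snapshot pairs (x, step(x)) sampled at random initial conditions.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 2))
Y = np.array([step(x) for x in X])

PX = np.array([lift(x) for x in X])  # lifted states, shape (200, 3)
PY = np.array([lift(y) for y in Y])  # lifted successors, shape (200, 3)

# EDMD: least-squares K such that lift(step(x)) is approximately K @ lift(x).
K = np.linalg.lstsq(PX, PY, rcond=None)[0].T
eigvals = np.sort(np.linalg.eigvals(K).real)

# In lifted coordinates the nonlinear map becomes a matrix multiplication.
x0 = np.array([0.7, -0.2])
pred = K @ lift(x0)
print("K =\n", np.round(K, 6))
print("sorted eigenvalues:", np.round(eigvals, 6))
print("one-step error:", np.abs(pred - lift(step(x0))).max())
```

For a general nonlinear system a hand-picked dictionary is not invariant, and the fitted matrix is only an approximation; that limitation is the motivation for learning the observables from data.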

In the third part of this dissertation, we introduce our first use of the static Koopman operator in control. In our application, this is a linear operator which maps the space of functions of static poses of a soft robotic arm to the space of functions of the pressures in the arm's actuating muscles. We use the static Koopman operator as a pregain term in our optimal control implementation alongside a traditional dynamic Koopman operator. Using both Koopman representations, we advance the modeling and control of soft robots into the inertial, nonlinear regime. We control motions of a soft, continuum arm with velocities 10x larger and accelerations 40x larger than those of previous work, and do so for high-deflection shapes with over 110 degrees of curvature. This work advances rapid modeling and control for soft robots from the realm of quasi-static to inertial, laying the groundwork for the next generation of compliant and highly dynamic robots.

Lastly, in the fourth chapter, we introduce a nonlinear Koopman representation which leverages so-called input-parameterized Koopman eigenfunctions. In the control of systems with multiple fixed points, it is typical to use piecewise control methods and local Koopman models. In contrast, our input-parameterized eigenfunction representation is accurate globally and enables a finite dimensional model which can handle these control problems without ad-hoc piecewise methods. We illustrate this on the control between the basins of attraction of the Duffing oscillator with dissipation.

  • 1 supplemental ZIP

Essays in the Economics of Wildfire

(2023)

This dissertation explores the economic consequences of wildfire and smoke in the United States. The third chapter, Wildfire smoke in the United States, is joint work with Matthew Wibbenmeyer, and examines regional and temporal trends in wildfire smoke impacts. It synthesizes research on health, economic, and behavioral impacts, proposing modifications to federal air quality regulations to address wildfire smoke. The second chapter, Wildfire, smoke, and outdoor recreation in the western United States, is a collaboration with Margaret Walls and Matthew Wibbenmeyer. It focuses on the effects of wildfire and smoke on outdoor recreation. The paper combines millions of administrative campground reservation records with daily satellite data on wildfire, smoke, and air pollution, finding that more than ten percent of available recreation days are affected by severe smoke in some regions. The first chapter, Non-market damages of wildfire smoke: evidence from administrative recreation data, is also a collaboration with Margaret Walls and Matthew Wibbenmeyer. This chapter exploits the dataset of the second chapter to provide some of the first revealed preference estimates of smoke damages. A structural model of sequential recreation decisions finds that smoke reduces welfare by $107 per person per trip. Annually, more than 21.5 million outdoor visits in the western United States are affected by wildfire smoke, with welfare losses of $2.3 billion. These findings contribute to a growing body of evidence on the costs of wildfire smoke.
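
The two headline figures are consistent with each other under a simple back-of-the-envelope reading. The check below assumes the annual loss is just the per-trip loss multiplied by the number of affected visits; the abstract does not state that this is how the total was derived, so treat it as a plausibility check only.

```python
# Figures quoted in the abstract; the multiplication itself is an assumption.
loss_per_trip = 107          # dollars per person per trip
visits_per_year = 21.5e6     # smoke-affected outdoor visits per year

annual_loss = loss_per_trip * visits_per_year
print(f"${annual_loss / 1e9:.1f} billion")
```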

Collective Reputations and Business Sustainability

(2023)

Environmental and social issues involving firms often arise due to difficulties in observing their environmental and social performance. Firms’ greenhouse gas emissions, toxic releases, accounting scandals, insider trading, and child labor abuse are all such cases where stakeholders face challenges in detecting and addressing them beforehand. Therefore, it is often helpful for stakeholders of firms (such as customers, government, investors, etc.) to have reliable and observable information available in normal times to accurately assess firms’ unobservable qualities.

When reliable and direct information about a firm’s unobservable qualities is unavailable, stakeholders might turn to a collective reputation of firms to evaluate difficult-to-observe qualities. A collective reputation refers to stakeholders’ beliefs about which firms belong to a specific group and the stereotype of the qualities and characteristics common to that group. By utilizing a collective reputation, stakeholders associate a firm with a broader group through a common observable trait, and then use their stereotype about that group to infer the firm's other, more difficult-to-observe qualities.

In my dissertation, I propose that a collective reputation is a combination of three attributes: group membership, group stereotype, and salience. Each of these attributes is the subject of my three dissertation studies, where I investigate how changes in each of them affect stakeholders’ evaluation of firms. The first study examines how changes in a firm’s group membership affect its financial performance, using the case of South Korea’s business group firms, known as Chaebol. The second study investigates how changes in stakeholders’ stereotypes about the group influence their investment in a new venture start-up within the context of entrepreneurship. The third study examines how changes in the salience of a collective reputation due to information disclosure affect the financial performance of firms, utilizing the case of the US EPA’s TRI (Toxic Release Inventory) program.

Changes in a collective reputation can significantly impact firms’ financial performance, and thus it is crucial for business practitioners to clearly understand the mechanism of how a collective reputation works and how stakeholders utilize it to evaluate firms’ unobservable qualities. Throughout my dissertation and each chapter, I provide business and policy implications on how practitioners can better use a collective reputation to enhance business sustainability and financial performance.