Open Access Publications from the University of California

UC Santa Cruz

UC Santa Cruz Electronic Theses and Dissertations

Detecting and Predicting Hot Moments of Methane Emissions from Coastal Wetlands


Coastal wetlands are highly productive ecosystems and can store large amounts of carbon (C). However, decomposition processes in coastal wetlands also produce and emit greenhouse gases (GHG), such as methane (CH4), a potent greenhouse gas that could offset C storage in the wetland soil. Often a patchwork of vegetation and open water, coastal wetlands exhibit strong biogeochemical heterogeneity, resulting in elevated CH4 flux (FCH4) at certain times and locations. These points of elevated FCH4, termed “hot spots and hot moments” (HSHM), experience biogeochemical rates so high that they can contribute disproportionately to annual flux rates. Despite the broad use of the term HSHM, there is no standardized, statistically rigorous method for identifying HSHM and quantifying their impact on ecosystem processes. Furthermore, the conditions that trigger HSHM of FCH4 are poorly understood, and hot moments (HMs) are often excluded from wetland FCH4 upscaling and predictive modeling. This study presents a comparative analysis of standard HM identification techniques to find the best HM detection method for coastal wetlands and formalize HM identification best practices. We found that using a rolling Z-score threshold to identify hot moments from eddy covariance (EC) flux data was most suitable for coastal wetlands. Using this approach, we flagged hot moments at nine wetlands in the San Francisco Bay-San Joaquin River Delta (Bay-Delta). We then used the identified HMs to train several data-driven Random Forest (RF) models that leverage EC data to predict the occurrence of HMs. The best-performing RF captured HM presence/absence in the Bay-Delta region with 79% accuracy, and the relative importance of the environmental predictors in the model shed light on the best predictors of HMs.
The method comparison in this study provides a best practices workflow for researchers when defining HSHM, and the RF HM model provides an upscaling methodology that could be used to predict the occurrence of HM FCH4 at sites without EC towers. Thus, the HM identification methodology and the predictive model present a valuable tool for wetland managers and restoration planners who can use the information to prioritize time and resources for mitigating and preventing these rare but high-impact emission events.
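
To make the rolling Z-score idea concrete, here is a minimal sketch of how a flux value might be flagged as a hot moment when it sits far above the mean of a trailing window. The function name `flag_hot_moments`, the window length, and the threshold of 3.0 are illustrative assumptions; the abstract does not state the window or threshold the study actually used.

```python
from statistics import mean, stdev


def flag_hot_moments(flux, window=5, z_threshold=3.0):
    """Flag values whose Z-score against the trailing window exceeds a threshold.

    `window` and `z_threshold` are hypothetical defaults, not values from the study.
    """
    flags = []
    for i, value in enumerate(flux):
        # Use only the preceding observations so the spike does not inflate its own baseline.
        history = flux[max(0, i - window):i]
        if len(history) < 2:
            flags.append(False)  # not enough data to estimate a standard deviation
            continue
        mu = mean(history)
        sigma = stdev(history)
        flags.append(sigma > 0 and (value - mu) / sigma > z_threshold)
    return flags


# Hypothetical half-hourly flux series with one obvious spike at index 5.
series = [1.0, 1.1, 0.9, 1.0, 1.2, 8.0, 1.0, 1.1]
print(flag_hot_moments(series))
```

Computing the baseline from the trailing window only is one possible design choice; a centered window or a robust statistic such as the median absolute deviation would be equally reasonable variants.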

Bridging Imaging and Training: Open Source Device for Biological Imaging and an Interactive Virtual Lab Training


This dissertation explores the utilization of internet-connected devices to address the accessibility challenges associated with microscopy in the field of biology. Traditional microscopy devices are often costly, limiting their widespread adoption in laboratories. To overcome this barrier, we developed a cost-effective multi-well imaging device integrated with an open-source pump and stimulation system. This device facilitates longitudinal cell studies and reduces the time required for capturing cell images, thereby enhancing efficiency for biologists. Moreover, our internet-connected imaging system serves as an innovative educational tool for aspiring scientists. Through remote project-based learning activities and engaging serious games, complex biological concepts become more accessible, fostering enthusiasm and understanding among high school and undergraduate students. The dissertation discusses the application of this system in an educational setting and the implementation of an algorithm for determining organoid size through computer vision. Additionally, it introduces an interactive virtual lab, designed to provide a low-stakes environment for students to learn scientific and laboratory safety protocols. Overall, this work highlights the potential of internet-connected devices in revolutionizing both biological research and education. By making microscopy more accessible and engaging, it empowers scientists and students alike to explore the intricacies of biology with enthusiasm and curiosity.

Visceral Landscapes: Recuperating Migrant Narratives in Contemporary Photography of the U.S.-Mexico Border


From its conception as a nation, the U.S. pushed from east to west and north to south expanding and establishing new borders, a process reinforced by nineteenth and twentieth century U.S. landscape photography. By contrast, twentieth and twenty-first century Chicanx and Latinx artists addressed the border as a site of political turmoil, violence, and resistance as well as a metaphorical and transportable space of identity formation. My dissertation draws upon these two fields, landscape studies and Chicanx/Latinx visual culture, to address contemporary landscape photography at the U.S.-Mexico border. In doing so, I establish a comparative analysis of nineteenth and twentieth century landscapes against the decolonizing potential of the contemporary works that expose, subvert, and expand traditional landscape vernaculars. My dissertation closely examines David Taylor and Marcos Ramirez ERRE’s DeLIMITations (2014), the Border Film Project (2005-2007), Richard Misrach and Guillermo Galindo’s Border Cantos (2016), and Delilah Montoya’s Sed: Trail of Thirst (2004-2008). Each of these projects delves into a critical set of questions around what is happening at the border and why, but also how the border crisis impacts migrants and migrant communities. To address these questions, I approach the photographs in search of narratives that are often omitted from history, art history, and contemporary discourse on the U.S.-Mexico border. I then use visual clues in the images to generate new narratives about who has existed in the border landscape, who is currently attempting to cross, and what these experiences might be like. I argue that their work transgresses old forms and pushes against the limiting nature of the border itself.
In doing so they bring attention to the double erasure of Indigenous and migrant memory, history, and bodies from the landscape; expose the lasting colonial legacy of earlier landscape practices that contributed to the making of the border; and explore the possibility for innovative visions that deviate from those of territorial mastery, and instead attend to migrant subjectivities.

Smartphone-Based Pedestrian Tracking System for Visually Impaired People


Current smartphone-based localization systems, primarily designed for sighted individuals, offer wayfinding services by tracking a user's path. However, this design overlooks the unique navigation needs of blind individuals who use long canes or guide dogs and have distinct movement patterns. To bridge this gap, this thesis introduces novel localization techniques tailored for blind pedestrians in both indoor and outdoor settings. These techniques avoid the need for BLE beacons and Wi-Fi, as well as camera-based systems, all of which are impractical for blind users. Instead, the proposed methods use IMU sensors, allowing users to conveniently place their phones in their pockets without requiring any external infrastructure.

Indoor localization in the absence of maps is addressed in this thesis through a unique combination of a Mixture Kalman filter and a GRU-based straight walking detector. Together, these form a two-stage turn detector that operates under the assumption that corridor intersections occur at 45° or 90° angles. In situations where maps are accessible, the research incorporates two Pedestrian Dead Reckoning (PDR) methods with the map data via a particle filter. In outdoor settings, this thesis expands the use of IMU sensor data by integrating it with GPS signals through a particle filter. This method creates a flexible model effective in both open areas and in environments with wall constraints, as specified by maps. Comprehensive testing of these systems involved trials with the WeAllWalk dataset, containing data from visually impaired walkers, and user studies conducted using two separate iPhone applications for indoor and outdoor localization. Results from these tests clearly demonstrate the effectiveness of the proposed localization solutions.
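
As a rough illustration of the geometric assumption mentioned above (corridors meeting at 45° or 90° angles), the sketch below performs plain pedestrian dead reckoning while snapping each estimated heading to the nearest multiple of 45°. It is not the thesis's Mixture Kalman filter, GRU detector, or particle filter; the function names and the (step length, heading) input format are assumptions made for illustration only.

```python
import math


def snap_heading(heading_deg, step_deg=45.0):
    """Snap a heading in degrees to the nearest multiple of `step_deg`, wrapped to [0, 360)."""
    return (round(heading_deg / step_deg) * step_deg) % 360.0


def dead_reckon(start, steps):
    """Integrate (step_length, heading_deg) pairs into a 2D path.

    Each heading is snapped before the position update, mimicking the
    assumption that the walker follows corridors on a 45-degree grid.
    """
    x, y = start
    path = [(x, y)]
    for length, heading_deg in steps:
        theta = math.radians(snap_heading(heading_deg))
        x += length * math.cos(theta)
        y += length * math.sin(theta)
        path.append((x, y))
    return path


# Two hypothetical steps with noisy headings near 0 and 90 degrees.
for point in dead_reckon((0.0, 0.0), [(1.0, 2.0), (1.0, 88.0)]):
    print(point)
```

Snapping discards small heading drift at the cost of being wrong whenever the walker genuinely moves off-grid, which is one reason a probabilistic filter is a more robust choice in practice.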

Aspects of Emersonianism in American Fiction & At Interminable Oceans (A Novel)


This creative-critical project is a trying out of Emersonian transcendental reconceptions of aesthetic experience. I have written a “draught of a draught” of a novel with a sequence of events and a main character, Darshan Kehama, based loosely on the struggle Ralph Waldo Emerson describes in his great essay, “Experience,” to “realize” the “self” by recovering indigenous feelings of serenity native to the self as Emerson comes to terms with the loss of his son. I take the title of my Emersonian novel, At Interminable Oceans, from a crucial Emersonian passage in the essay to refer to homes found within and on the shores of seas and oceans, the metaphorical and literal setting for the Kehama family. The panoramic view of the Arabian Sea and Pacific Ocean and their horizons from Indian and American homes on the Bombay and Carmel, California coasts is a reminder of the inexhaustible oceanic powers the Kehamas draw from, as they find their “true romance” in Emersonian moments of calm in the face of incalculable grief. This irenic mood, enacted in a serene pitch of third person narration and description crucial to conjure up the “tone” of family life in this lost Indian American world, is one way to envision that Emersonian slogan of transforming loss into “practical power.” Wai Chee Dimock’s reading of a crucial passage on ownership in Emerson’s moving essay, “Experience,” inspires my modified notion of “aesthetic ownership” as I attempt to dramatize this struggle of transformation in this “draught of a draught” of a novel. I could only express what I found in Ralph Waldo Emerson’s account of having, getting, and keeping irenic feelings, his essential story of self-renewal offered in that indispensable essay, in novelistic form. I draw inspiration for my Emersonian novel from my first attempt to read aspects of Emersonianism I find in Emerson’s “Experience” in the great American novels by Herman Melville, Henry James, and Frank Norris.

In “Aesthetic ownership as Self-Renewal,” I read Emerson’s “Experience,” taking into account one recent reappraisal of experience that gives a prominent place to Emersonian concepts of “ownership” and “property.” Since Wai Chee Dimock’s idea of ownership grounds an interpretation that goes against contemporary materialistic definitions or restatements of Emersonianism it is worth foregrounding a model of self-renewal consistent with an Emersonian idealism and dualism that is more in sync with the transcendentalism of 1842-44. Dimock, in an unforgettable and overlooked reading of this essay, posits a self, “sovereign within itself,” as a consequence of “division” between an “inner locale” of experience with its own measures of “scarcity, sufficiency, superfluity” “separate and apart” from “objective reality” with its economic measures of what is or is not sufficient. The inner aesthetic economy Dimock describes rests on a specific claim about ownership that equates property with poverty. The self that is a consequence of this surprising “commoditization” of poverty or scarcity is in a mode of “aesthetic ownership,” a cooler register of subjective experience shorn of intensities. The ghost-like, soporific, illusory glow, the stunned depleted quality of experience Emerson complains about in the first half of the essay, I argue, also brings its shine and is claimed as a virtue in the second part of “Experience.” This wisdom found in the closing of Emerson’s essay is brought about by a shift from observation conducive to scientific scrutiny of paltry empiricism to Emersonian moments of seeing crucial to renewing awareness of the self’s constitutional indigence or emptiness. The emptiness or poverty claimed in this inner aesthetic mode of ownership is also a first step in reclaiming the recuperative powers of surprise/wonder in the Emersonian self.

In Chapter 1, “Transcendental Resistance: Emersonian Irenic Thinking and the Passage to India in Herman Melville’s Moby-Dick,” I argue that members of the crew of the Pequod, as they struggle to move away from the rising forces of capitalism sweeping the eastern shores of America, also offer transcendental resistance to the possessive/pillaging drive of the whaling business they have unwittingly committed themselves to. The resistance of the Pequod crew is a reminder of how indigenous thinking [the nature of it] of William Apess, a nineteenth century religious figure from the Pequot tribe, resists the paradoxes/hypocrisies in the political/religious arena in colonial America. In Apess’s words: “the pious fathers wrestled hard and long with their God, in prayer, that he would prosper their arms and deliver their enemies into their hands.” This hypocrisy, according to Apess, the “foundation” of “all the slavery and degradation” in the American colonies, was also the basis of the laws for Indian Removal. Even when openly legal crimes were being passed off as part of God's great design, Apess suggests cleaving to what is nobler, and finding ways to resist, not react, to the lower moral orbit of hypocrisy. The parallel between the great Pequot’s example of indigenous thinking, an early precursor to Emersonian irenic thinking, and the hypocrisies/paradoxes of imperialism/racism it uncovers in colonial America and what Melville’s voyage discovers in the role played by Queequeg, Pip, Fedallah, Tashtego, suggests that an Emersonian aesthetic ownership is being offered in this novel to allow the duplicitous nature of this full scale slaughter on the high seas to become clearly visible. When the Pequod’s eastward movement across the oceans orients the ship towards America’s western shores this whale hunt transforms to a quest romance, an obvious reversal of the Columbian voyage and its aspirations.
Melville finds a passage to India not to claim the material wealth of the Indies but to reclaim the worth of dispossession, an austere mode of aesthetic ownership, modeled on ancient worship in India’s Elephanta Caves, where Melville’s narrator claims the oldest known portrait of a whale can be found on one of its cave walls. This invocation of a more serene relationship to the non-human world makes starker the tragic fall/lapse in perception that underlies the brutal slaughter/business of whales. Blood flows freely in this oceanic hunt but the brief passage to India offers another renunciatory space, after Apess’s example, to help transcend the grasping tendencies of this business on the oceans.

In Chapter 2, Austere Aestheticism and Emersonian Renunciation in Henry James’s The Portrait of a Lady, I begin with a quick overview, following Jonathan Freedman, of the motivations of the British aestheticism movement. Freedman discusses British aestheticism in the late nineteenth century as the historical context for introducing an aspect of Emersonianism I call austere aestheticism, which brings back Stuart Sherman’s reading of Henry James’s aesthetic idealism and his overlooked essay on Emerson as another way to look at the late nineteenth and early twentieth century revival of aestheticism, a carry over of the American version of the aesthetic project to mark out a special sphere, “a locus of value and a guarantee of authority” to “Art.” I argue that this motivation to allot art its autonomy is derived from an aspect of Emersonianism where aesthetic ownership’s idealistic registers counterbalance the materialist registers of Paterian aesthetic experience. Aesthetic ownership, first shown/described in the “transparent eyeball” passage and its afterlife in moments of seeing in the Emersonian literary tradition, as THE master metaphor for the impersonal/irenic aesthetic dimension of ownership, contrasts with the charged intensity of aesthetic experience defined by Walter Pater in his conclusion to The Renaissance: Studies in Art and Poetry. The calmer, more detached, mode of aesthetic ownership in Henry James’s austere aestheticism achieves the ambition of “Art” (the source of the autonomy and authority that marks the special sphere of “Art” - uppercase “A” as Richard Poirier and F.O. Matthiessen explain this difference) by introducing “ascesis” as the basis of the American writer’s implicit critique and fulfillment of what the British aesthetes attempted to but could not quite achieve in art. I argue that this austere aspect of aestheticism is born out of an overlap between Pater’s ascesis and Emerson’s renunciatory ideals. 
This conception of life/experience in that defining Emerson essay, “Experience,” is brought about by a beautiful balance in aesthetic ownership and aesthetic experience, the vantage of austere aestheticism that makes renewal possible.

In Chapter 3, “Emersonianism West: Aesthetic Experience and Aesthetic Ownership,” in Frank Norris’s novel, McTeague: A Story of San Francisco, I begin by noting how the use and exchange value of things remain the most obvious measures of worth in the novel. But these values ascribed to possessing things only superficially account for McTeague’s obsessive attachment to what he owns and refuses to part with. I argue that the aspect of aesthetic experience most called for in the novel goes beyond what Gavin Jones has called “the embarrassment of naturalism,” or the feeling of being “stuck” that has come to dominate McTeague’s lived experience in Polk Street. Emersonian compensatory consolations of aesthetic ownership in reimagined property relations take the place of lost material conditions of ownership when McTeague is forced to dispossess his most valued things. This displacement radically transforms McTeague’s longing and the sense of the novel’s unusual ending. A hidden, more dominant, measure of worth reveals itself in these renunciatory moments giving McTeague’s possessive drives an expansive scale. I present a Norris we have not yet fully appreciated, a naturalist working within the tradition of Emersonian aestheticism that was first conceived in Richard Poirier’s A World Elsewhere: The Place of Style in American Literature and that has recently found new articulation in Wai Chee Dimock’s Emerson.

Metaphysical Biases in the Discourse of Artificial Intelligence


This text examines implicit metaphysical assumptions that influence the discourse of artificial intelligence by shaping key underlying concepts such as intelligence, agency, rationality, thought, mechanism, process and number. I consider the impact of these assumptions on the frameworks and methods of the field, as well as their relevance to debates concerning the ethics of research in AI. In this context, I evaluate the consequences of Jacques Derrida’s critique of metaphysics, examining its implications for conceptual models of artificial intelligence that continue to depend on the foundational ideas his critique puts into question. The first part, “Mind Against Mechanism,” deals generally with the idea of intelligence and the figure of the thinking machine. I argue that Beneficial AI, a prominent research program within machine ethics, relies on a misguided theory of intelligence that combines misconceptions about evolutionary biology with inappropriate analogies to nuclear energy, and mobilizes longstanding anxieties about the viability of traditional beliefs regarding individual autonomy and agency. My analysis corroborates Timnit Gebru’s call for a holistic approach to Ethical AI that incorporates a more diverse group of scientists into the design process. I add to this that scientists are not enough: serious engagement is needed with a wider range of scholarly work beyond the sciences. The second part of this manuscript, “Number Beyond the Algorithm,” considers the provenance of premises that inform the way we conceive of the limits of computation. I show that the adoption of the set as the founding element of the real number system follows from a concept of number that elevates presence over process (i.e., cardinality over sequence). This preference for presence aligns with the culturally specific beliefs of a small group of European mathematicians, particularly with regard to the nature of number and infinity.
Yet, the resulting definitions continue to inform contemporary characterizations of the relationship between mathematics and computation by contributing to the conviction that the objects of mathematics (and the world they correspond to) vastly exceed what machines can compute.

High-order Kernel-based Finite Volume Methods for Systems of Hyperbolic Conservation Laws


Systems of hyperbolic conservation laws (HCLs) commonly arise as mathematical descriptors of the natural world, and are particularly ubiquitous in fluid dynamics. These laws appear as complicated and highly nonlinear partial differential equations describing the evolution of fundamental conserved quantities such as mass, momentum, and energy. Solving these equations analytically is intractable for all but the simplest cases, and investigating problems with real-world importance falls to numerical approaches more and more frequently. Most HCLs exhibit rich dynamics with complicated smooth flows and discontinuities coexisting, often with shocks arising from initially smooth data. Designing numerical schemes that can efficiently and accurately represent smooth phenomena, while also remaining robust and reliable in the vicinity of shocks, is very challenging. Finite volume methods are one particularly useful approach to designing such schemes, as conservation is enforced discretely and discontinuities can be represented quite naturally. An unfortunate drawback of these methods is that achieving high-order accuracy in multiple space dimensions is difficult. This dissertation overcomes these challenges by developing a kernel-based non-polynomial reconstruction scheme that is manifestly multidimensional. This scheme is first posed as a linear recovery problem in a reproducing kernel Hilbert space. This linear reconstruction method is then cast into a weighted essentially non-oscillatory (WENO) framework so that it may represent both smooth and discontinuous data. This scheme is then incorporated into solvers for the compressible Euler equations, compressible Navier-Stokes equations, and ideal magnetohydrodynamics (MHD) equations. In doing so, a novel set of variables that is more suited to multidimensional reconstruction, dubbed the linearized primitive variables, is introduced.
Troubled cell indicators are developed that allow for a more accurate and efficient treatment of smooth solutions in an entirely automatic fashion. Positivity preserving limiters are also incorporated, and allow for the evolution of flows with extremely strong shocks. A highly parallel multi-GPU implementation is provided, and the proposed method is tested against a variety of stringent benchmark problems.
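
For readers unfamiliar with the WENO idea mentioned above, the sketch below shows the classical one-dimensional, polynomial-based fifth-order WENO reconstruction (in the style of Jiang and Shu). It is a stand-in for illustration only: the dissertation's scheme is kernel-based, non-polynomial, and multidimensional, and none of the code below is taken from it. The function name and the `eps` value are assumptions.

```python
def weno5_interface_value(v, eps=1e-6):
    """Reconstruct the value at the right interface of the middle cell.

    `v` holds five consecutive cell averages (v[i-2], ..., v[i+2]).
    Classical polynomial WENO5, shown only as an analogue of the
    kernel-based scheme described in the abstract.
    """
    vm2, vm1, v0, vp1, vp2 = v

    # Third-order candidate reconstructions on the three sub-stencils.
    p0 = (2.0 * vm2 - 7.0 * vm1 + 11.0 * v0) / 6.0
    p1 = (-vm1 + 5.0 * v0 + 2.0 * vp1) / 6.0
    p2 = (2.0 * v0 + 5.0 * vp1 - vp2) / 6.0

    # Smoothness indicators: large where a sub-stencil crosses a discontinuity.
    b0 = 13.0 / 12.0 * (vm2 - 2.0 * vm1 + v0) ** 2 + 0.25 * (vm2 - 4.0 * vm1 + 3.0 * v0) ** 2
    b1 = 13.0 / 12.0 * (vm1 - 2.0 * v0 + vp1) ** 2 + 0.25 * (vm1 - vp1) ** 2
    b2 = 13.0 / 12.0 * (v0 - 2.0 * vp1 + vp2) ** 2 + 0.25 * (3.0 * v0 - 4.0 * vp1 + vp2) ** 2

    # Nonlinear weights built from the ideal linear weights (0.1, 0.6, 0.3).
    a0 = 0.1 / (eps + b0) ** 2
    a1 = 0.6 / (eps + b1) ** 2
    a2 = 0.3 / (eps + b2) ** 2
    total = a0 + a1 + a2

    return (a0 * p0 + a1 * p1 + a2 * p2) / total


print(weno5_interface_value([0.0, 1.0, 2.0, 3.0, 4.0]))  # smooth (linear) data
print(weno5_interface_value([0.0, 0.0, 0.0, 1.0, 1.0]))  # data with a jump
```

On smooth data the three candidates agree and the weights revert to the high-order linear combination; near a jump the weights shift almost entirely to the sub-stencil that does not cross it, which is what suppresses spurious oscillations.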

Essays on Influence of Information and Innovation in Digital Markets


This dissertation presents three experimental studies with an emphasis on the influence of information and innovation on digital markets, including financial exchanges and online marketplaces. The first chapter focuses on the experimental evaluation of a new financial market design. The second and third chapters focus on how access to information affects people's behavior in online marketplaces.

The first chapter provides a laboratory study of a newly proposed Flow Market format as a response to the design weaknesses of the continuous double auction (CDA) used in most financial markets worldwide. We designed and deployed a laboratory experiment that compares the Flow Market and the CDA using several fundamental metrics. We find evidence that the Flow Market changes traders’ behavior relative to the CDA, allowing them to shred orders more effectively: compared to the CDA, the Flow Market exhibits fewer and larger orders. We also find that both formats perform similarly in terms of price and allocative efficiency. However, the Flow Market leads to lower price volatility. Interestingly, the total traded volume is lower under the Flow Market than under the CDA. Still, this difference decreases with traders’ experience, i.e., as they learn the mechanics of the Flow format. Our findings provide initial insights regarding the feasibility of the Flow trade format and its potential to promote financial market stability and fairness.

The second chapter modifies traditional sequential search models to account for ex post uncertainty, that is, uncertainty that cannot be fully eliminated by search. We derive players' optimal search strategy given their risk attitudes. We also test our theory in a laboratory experiment, using a search game to track subjects' behavior and a multiple price list and a bomb risk elicitation task to elicit subjects' risk preferences. We find that, in this scenario, risk-averse players tend to increase their reservation value and extend their search duration.

The last chapter investigates the informational barrier problem in online marketplaces. We design a sequential game to study how an informational barrier forms under a reputation system, and we propose a fractional searching mechanism to enable entry by reliable firms. The model predicts the optimal behaviors of the firms and shows that fractional searching enables cost-effective firms with no reputation to enter the market more easily. Additionally, we test our theory in a laboratory experiment using an interactive market game to track subjects' behaviors with a simplified reputation system. The experimental results indicate that fractional searching effectively alleviates the entry problem of new firms with superior quality.

Temporal Regulation of Nematode Development from a Biochemical, Circadian Perspective


Timing mechanisms are utilized by organisms in a variety of biological functions. From circadian rhythms to nematode development, the genetic networks that underlie the keeping of time are complex and thoroughly regulated. Circadian rhythms allow organisms to anticipate daily environmental changes and thus confer an adaptive advantage, and the genetic network that governs them is well established. While much is still to be gleaned about the molecular basis of circadian timekeeping, a model of a rewired developmental timer based on conserved circadian clock orthologs in C. elegans is beginning to emerge. In this dissertation, I discuss the conservation of specific factors and provide new insights that highlight the biochemical mechanisms that regulate C. elegans development. C. elegans is a widely studied model organism, yet little is known about the molecular basis for its temporal control of development. Two intricately linked timers, the molting cycle and heterochronic pathway, coordinate cuticle regeneration and growth with stage-specific cellular events. Several circadian orthologs have established roles in regulating these processes. Chapter 2 describes the homology between nuclear hormone receptors (NHRs), retinoic acid-related nuclear receptor (RORα/β/γ) and NHR-23, transcription factors that activate the expression of a core clock component and drive the transcriptional network that governs nematode molting, respectively. We lay the groundwork for a conserved mode of ligand-binding, as well as identify a separate class of small molecules that bind to NHR-23. The interaction between PERIOD (PER) proteins and their cognate kinase, Casein Kinase 1 δ and ε (CK1), is integral to determining the phase and timing of circadian rhythms. PER is a stoichiometrically limiting factor in the repressive complex that provides the inhibition of circadian transcription.
Stable anchoring of CK1 to PER2 mediates phosphorylation of PER that regulates its stability and abundance in the cell. This interaction is also required for CK1-dependent displacement of the core clock transcription factor from DNA. Chapter 3 demonstrates that the C. elegans homologs of PER and CK1, LIN-42 and KIN-20, respectively, interact in a similar mode to regulate C. elegans development. We show that two kinase-binding motifs within the CK1-binding domain (CK1BD; CK1BD-A and CK1BD-B) are conserved enough in LIN-42 to mediate binding to CK1 in vitro. We determine that the expression of LIN-42 and KIN-20 temporally and spatially overlaps and that the CK1BD as well as KIN-20 kinase activity are required for proper molting timing. We further show that phosphorylation of LIN-42 by CK1 leads to kinase inhibition, suggesting a conserved mode of product inhibition whereby phosphoserine(s) anchor into conserved anion binding sites along the kinase active site. In Chapter 4, we discuss our recent work to identify a novel regulator of the C. elegans molt cycle. Through in vivo techniques, we show that KIN-20 and a previously uncharacterized ankyrin repeat domain-containing protein (ANKRD49) are similarly expressed temporally and spatially, and interact. We show that C. elegans ANKRD49 binds to human CK1 with nanomolar affinity in vitro, and that this interaction influences kinase activity on LIN-42. An AlphaFold binding model of the complex predicts that stable binding is mediated through the ANKRD49 structured C-terminus and a flexible CK1 helix near the active site that is important for substrate recognition and processing. This model also predicts that the interaction is enhanced via binding of the ANKRD49 unstructured N-terminus to the CK1 substrate binding cleft. We show that deletion of the ANKRD49 unstructured N-terminus, as well as mutations near and on the flexible CK1 helix that alter circadian period in mammals, reduce the C. elegans ANKRD49/human CK1 affinity >10-fold. Depletion of ANKRD49 in vivo leads to asynchronous and delayed molting similar to kin-20(null) phenotypes. Given the high conservation of CK1 across organisms as well as several proteome-scale studies that also identify a human ANKRD49/CK1 interaction, this work potentially has broader implications for understanding circadian rhythms and temporal regulation in diverse organisms. In summary, throughout this dissertation I have used an interdisciplinary approach, combining biochemistry and in vivo C. elegans genetics, to describe the molecular basis for circadian homolog function in C. elegans development. In addition to the findings discussed herein, this work provides a framework for elucidating the molecular underpinnings of nematode timing mechanisms as well as additional insight into the evolutionary conservation of timekeeping mechanisms.

Data-Efficient Representation Learning for Gaze Estimation


The human gaze serves as a potential non-verbal cue that enhances human-computer interfaces, enabling users to engage with devices through eye movements. The ability to accurately measure and interpret gaze direction plays a critical role in various domains, including social interactions, assistive technologies, augmented reality, and psychological research to examine cognitive state.

Over the past decade, gaze estimation has emerged as a prominent area of interest within the research community. Conventional gaze estimation methods rely on specialized hardware, including high-resolution cameras, infrared light sources, and image processing units, to detect eye features like the pupil center and iris boundary. While these devices offer greater accuracy and precision, their practical use is limited by factors such as high costs, restricted head movements, and limited range of allowable distances between user and device. As an alternative to dedicated gaze-tracking hardware, several techniques have been developed to infer gaze direction directly from eye images captured by standard cameras on personal devices such as laptops, tablets, and phones.

The recent emergence of deep learning techniques has enhanced learning-based gaze estimation approaches. These appearance-based gaze estimation methods directly map eye images to gaze targets without the need for explicit detection of eye features and, therefore, have a strong capability to work in unconstrained environments. However, the effectiveness of these approaches greatly depends on having access to extensive training datasets that include a variety of eye appearances, gaze directions, head poses, lighting conditions, and other variables. In this thesis, we focus on improving the adaptability and effectiveness of webcam-based gaze estimation techniques through the application of generative modeling and representation learning.

First, we propose an easy approach for calibrating a laptop camera with a commercial gaze tracker, streamlining the process of collecting labeled gaze data to make it readily accessible for all users. This dataset can then be utilized to enhance the accuracy of appearance-based gaze estimation methods for new users and different domains.

Second, we introduce a generative redirection framework designed to manipulate gaze direction and head pose orientation in synthesized images. This framework is used to generate augmented, gaze-labeled datasets, thereby enhancing the performance of gaze estimation methods.

Third, we explore self-supervised contrastive learning to acquire equivariant gaze representations through an unlabeled multiview dataset. These gaze-specific representations are utilized for few-shot gaze estimation, enhancing the efficacy of user-specific models.

Finally, we present a spatiotemporal model for video-based gaze estimation, incorporating attention modules to enhance understanding of both local spatial and global temporal dynamics. Furthermore, we improve the performance of this model using person-specific few-shot learning through Gaussian processes.