UC Santa Cruz Electronic Theses and Dissertations

Towards socially acceptable algorithmic models: A study with Actionable Recourse

Creative Commons Attribution 4.0 (CC BY 4.0) license
Abstract

The integration of machine learning (ML) models into our daily lives has become ubiquitous, influencing almost every aspect of our interaction with technology. However, as these models become more prevalent, particularly in sensitive areas such as healthcare, banking, and criminal justice, they must undergo rigorous scrutiny. This scrutiny addresses several critical social challenges, including data accessibility and integrity, privacy, safety, algorithmic bias, the explainability of outcomes, and transparency.

To foster trust and transparency in ML models, tools like Actionable Recourse (AR) have been developed. AR empowers negatively impacted users by providing recommendations for cost-efficient changes to their actionable features, thereby helping them achieve favorable outcomes. Traditional approaches to providing recourse focus on optimizing properties such as proximity, sparsity, validity, and distance-based costs. However, our work recognizes the importance of incorporating User Preference into the recourse generation process. By capturing user preferences through soft constraints—such as scoring continuous features, bounding feature values, and ranking categorical features—we propose a gradient-based approach to identify User Preferred Actionable Recourse (UP-AR). Our extensive experiments validate the effectiveness of this approach.
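As a rough illustration of the gradient-based idea, the following sketch searches for a recourse action by minimizing a validity loss plus a preference-weighted cost under soft box constraints. The linear stand-in model, preference weights, and bounds are illustrative assumptions, not the exact UP-AR formulation.

```python
# A minimal sketch of gradient-based recourse with soft user-preference
# constraints. Everything here (model, weights, bounds) is a stand-in.
import torch

torch.manual_seed(0)
d = 4                                   # number of actionable features
w, b = torch.randn(d), torch.tensor(-2.0)

def model(x):                           # stand-in classifier: P(favorable outcome)
    return torch.sigmoid(x @ w + b)

x0 = torch.zeros(d)                     # user currently denied by the model
pref = torch.tensor([0.1, 1.0, 1.0, 5.0])   # user scores: higher = costlier to change
lo, hi = torch.full((d,), -2.0), torch.full((d,), 2.0)  # user-imposed bounds

delta = torch.zeros(d, requires_grad=True)
opt = torch.optim.Adam([delta], lr=0.05)

for _ in range(500):
    opt.zero_grad()
    x = x0 + delta
    validity = -torch.log(model(x) + 1e-8)       # push toward the favorable class
    cost = (pref * delta.abs()).sum()            # preference-weighted effort
    box = torch.relu(lo - x).sum() + torch.relu(x - hi).sum()  # soft bound penalty
    (validity + 0.1 * cost + 10.0 * box).backward()
    opt.step()

print("recourse:", (x0 + delta).detach(), "new score:", model(x0 + delta).item())
```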

Moreover, as ML models automate decisions in various applications, it is crucial to provide recourse that accounts for latent characteristics not captured in the model, such as age, sex, and marital status. We explore how the cost and feasibility of recourse vary across latent groups. We introduce a notion of group-level plausibility and develop a clustering procedure to identify groups with shared latent characteristics. By employing a constrained optimization approach, which we call the Fair Feasible Training (FFT) procedure, we aim to equalize the cost of recourse across these groups. Our empirical study on simulated and real-world datasets demonstrates that our approach produces models with improved cost and feasibility of recourse at the group level.
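A minimal sketch of this constrained-optimization idea appears below: latent groups come from k-means clustering, per-point recourse cost is approximated by the margin a negatively scored point must cross, and training penalizes the cost gap between groups. The synthetic data, cost proxy, and penalty weight are assumptions for illustration, not the thesis's FFT procedure itself.

```python
# A minimal sketch of FFT-style training: logistic regression with a penalty
# that equalizes an approximate recourse cost across clustered latent groups.
import torch
from sklearn.cluster import KMeans

torch.manual_seed(0)
n, d, k = 600, 5, 3
X = torch.randn(n, d)
y = (X[:, 0] + 0.5 * X[:, 1] > 0).float()          # synthetic labels
groups = torch.tensor(KMeans(n_clusters=k, n_init=10,
                             random_state=0).fit_predict(X.numpy()))

w = torch.zeros(d, requires_grad=True)
b = torch.zeros(1, requires_grad=True)
opt = torch.optim.Adam([w, b], lr=0.05)
bce = torch.nn.BCEWithLogitsLoss()

for _ in range(300):
    opt.zero_grad()
    logits = X @ w + b
    # Proxy recourse cost: distance to the decision boundary for points
    # the model currently scores negatively (zero for favorable points).
    cost = torch.relu(-logits) / (w.norm() + 1e-8)
    group_costs = torch.stack([cost[groups == g].mean() for g in range(k)])
    disparity = group_costs.max() - group_costs.min()  # gap to equalize
    (bce(logits, y) + 1.0 * disparity).backward()
    opt.step()

print("per-group recourse cost:", group_costs.detach())
```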

In addition to addressing group-level disparities, our study proposes the Conformal Recourse AcTions Framework (CRAFT), which suggests a model-agnostic set of actions from a presupposed catalog while guaranteeing, with high probability, that the desired action is included. The framework applies in a black-box model setup and generalizes across different models. It is intuitive, requiring only a set of calibration data points, and its effectiveness is corroborated by extensive experiments on real-world datasets.
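The calibration step can be sketched with standard split conformal prediction: compute nonconformity scores of the truly desired actions on a calibration set, take a finite-sample-corrected quantile, and return every catalog action falling under that threshold. The scorer and data below are stand-ins, not the thesis's exact construction.

```python
# A minimal split-conformal sketch in the spirit of CRAFT: the returned
# action set contains the desired action with probability >= 1 - alpha.
import numpy as np

rng = np.random.default_rng(0)
n_cal, n_actions, alpha = 200, 10, 0.1

def score(x, a):                 # black-box plausibility of action a for user x
    return 1.0 / (1.0 + np.exp(-x[a]))

X_cal = rng.normal(size=(n_cal, n_actions))
desired = rng.integers(0, n_actions, size=n_cal)     # observed desired actions

# Nonconformity: one minus the score of the truly desired action.
nonconf = np.array([1.0 - score(X_cal[i], desired[i]) for i in range(n_cal)])
level = np.ceil((n_cal + 1) * (1 - alpha)) / n_cal   # finite-sample correction
q = np.quantile(nonconf, level, method="higher")

def recourse_set(x):
    """Every catalog action whose nonconformity falls under the threshold."""
    return [a for a in range(n_actions) if 1.0 - score(x, a) <= q]

x_new = rng.normal(size=n_actions)
print("suggested action set:", recourse_set(x_new))
```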

The challenge of integrating ML models into practical applications extends to search engines, which play a pivotal role in retrieving relevant items based on user-specified queries. A significant difficulty arises when there is a mismatch between the buyer's and seller's vocabularies, leading to insufficient recall or unsatisfactory results. This issue is exemplified by "Null and Low" (N&L) queries, which can significantly degrade the user experience. Our analysis of user search behavior data from a major e-commerce company revealed that approximately 29% of search queries have multiple category interpretations, a phenomenon we term "multi-faceted query interpretations." Drawing a conceptual parallel between N&L query reformulation and the counterfactual explanation literature, we propose a novel method that uses a neural translation model to produce multiple, diverse reformulations, thereby enhancing the user experience for N&L queries.
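As a rough sketch of the generation step, diverse beam search over a sequence-to-sequence model yields several distinct reformulations of a failing query. The off-the-shelf t5-small checkpoint and paraphrase prompt are stand-ins; the thesis trains its own translation model on e-commerce search logs.

```python
# A minimal sketch of multiple, diverse N&L query reformulations via
# diverse beam search. The checkpoint and prompt are illustrative stand-ins.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

name = "t5-small"   # stand-in; not the model trained in the thesis
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForSeq2SeqLM.from_pretrained(name)

query = "paraphrase: waterproof hiking shoos for winter"  # misspelled N&L query
inputs = tok(query, return_tensors="pt")

# Diverse beam search: beams are split into groups that are penalized for
# repeating each other, yielding multiple distinct reformulations.
outputs = model.generate(**inputs, num_beams=6, num_beam_groups=3,
                         diversity_penalty=1.0, num_return_sequences=3,
                         max_new_tokens=16)
for o in outputs:
    print(tok.decode(o, skip_special_tokens=True))
```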

In conclusion, the advancement of machine learning models necessitates a multifaceted approach to ensure their ethical and equitable application. This thesis represents a necessary step in the pursuit of Trustworthy ML. By addressing the challenges of transparency, user preference, latent group disparities, and practical search engine limitations, we can move towards more responsible and user-centric machine learning systems. The thesis also identifies avenues for future study, underscoring the importance of continued innovation in this vital field.