Visualization, Prediction, and Causal Inference: Applications in Healthcare
The recent wave of data collection in healthcare has opened up an ocean of possibilities to learn and develop new exploratory, diagnostic, and prognostic methods. This thesis explores how three fields of statistics, (1) data visualization, (2) prediction, and (3) causal inference, can help us leverage this data to answer a wide range of questions in healthcare.
Part I of this thesis presents a software package called superheat that researchers can use to visualize complex datasets and multi-faceted modeling results. The primary users of this software so far have been those in the medical research industry. In this thesis, we apply superheat in three case studies: (1) summarizing global organ donation trends using a publicly available organ donation database curated by the World Health Organization, (2) visualizing groups of topics that appear in text data scraped from Google News, and (3) examining the performance of a model designed to predict the brain's response to images using fMRI data. The theme of Part I of this thesis is visualization in healthcare.
Part II of this thesis introduces a method for predicting a patient's risk of developing a Surgical Site Infection (SSI) following surgery. An SSI is an infection that occurs at the site of a surgery within 30 days of the procedure; SSIs are responsible for up to 30% of hospital-acquired infections. This method was developed in collaboration with healthcare professionals, including infectious disease experts and surgeons at UC Davis. The theme of Part II of this thesis is prediction in healthcare.
Part III of this thesis presents a novel application of instrumental variables in causal inference, asking about the possible effectiveness of a "survival-benefit"-based liver transplant allocation scheme. The conclusion is that while rethinking how organs are allocated could yield substantial benefit, implementing such a scheme is extremely difficult because it relies on drawing causal inferences from complex observational data. The theme of Part III of this thesis is causal inference in healthcare.