Machine Learning for Information Extraction from Pathology Reports and Adaptive Offline Value Estimation in Reinforcement Learning
- Park, Briton
- Advisor(s): Yu, Bin
Abstract
The thesis is divided into two parts. The first part focuses on a healthcare-related application of machine learning, and the second part focuses on offline evaluation of reinforcement learning agents, which is critical for estimating the value of policies in high-risk, high-cost applications of reinforcement learning, such as patient care.
The first part comprises the pathology report parsing work on data from UCSF. Personalized medicine has the potential to revolutionize healthcare by enabling practitioners to tailor treatment and assessment to each individual patient. However, personalized care depends on the ability to leverage patient data intelligently. One major source of clinical data is pathology reports, which are currently stored electronically at clinical institutions. Much of this data cannot be easily leveraged, however, because it exists as unstructured and semi-structured text; it must be parsed into a structured form before being used in downstream clinical applications. Furthermore, manual extraction of the data is a time-consuming and expensive process for a human annotator. Thus, researchers have studied ways to parse reports algorithmically via machine learning. Despite advancements in machine learning, particularly deep learning, building accurate parsers remains challenging due to the amount of training data required. In our work, we focus on the sample efficiency of machine learning parsers trained with limited annotations. In the first part of the thesis, we develop machine learning-based data extraction methods for pathology reports at UCSF, built on natural language processing with limited training data. For each data field, such as the location of the tumor or the stage of the cancer, we train a model to automatically extract the information from each report using a limited set of human annotations as the training targets. We also analyze the sample efficiency of state-of-the-art methods compared to our approaches and study practical considerations in the deployment of such data extraction systems. We find that our proposed algorithms achieve accuracies comparable to the state of the art using fewer annotated data points.
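As an illustration of the per-field extraction setup described above, the following is a minimal sketch of training a classifier for a single hypothetical field (tumor laterality) from a handful of annotated reports. The example reports, labels, and model choice are assumptions made for illustration, not the thesis's actual pipeline.

```python
# A minimal sketch of training a per-field extractor from a small annotated set.
# The reports, labels, and target field (tumor laterality) are hypothetical
# placeholders; the thesis's actual models and feature choices may differ.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# A handful of annotated reports for one field (hypothetical examples).
reports = [
    "Invasive ductal carcinoma identified in the left breast, upper outer quadrant.",
    "Right breast core biopsy: benign fibroadenoma, no malignancy identified.",
    "Left mastectomy specimen with residual ductal carcinoma in situ.",
    "Right lumpectomy: invasive lobular carcinoma, margins negative.",
]
labels = ["left", "right", "left", "right"]

# Bag-of-words features with a linear classifier: a common, sample-efficient
# baseline when only a limited number of human annotations is available.
extractor = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
extractor.fit(reports, labels)

# Apply the trained extractor to an unseen report.
print(extractor.predict(["Left breast excision shows invasive carcinoma."]))
```

In practice, one such model is trained per data field, so the same small-sample pipeline can be reused across fields with different annotation sets.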
In the second part, we focus on offline reinforcement learning, a data-driven approach to reinforcement learning. Offline reinforcement learning relies on learning from static datasets, unlike the less restrictive online setting, in which learning is done via a feedback loop between the agent and the environment. The offline setting is crucial for applications in healthcare and robotics, where deploying untrained or partially trained agents can be costly or dangerous. Leveraging historical data without further data collection poses a unique challenge, because models must be validated on data deriving from a different model or set of models. In Chapter 5, we focus on adaptively weighting predictions based on model stability with the goal of evaluating reinforcement learning policies, a problem known as offline evaluation. We experiment with two state-of-the-art evaluation methods: fitted Q-evaluation and model-based evaluation. We propose a new estimator that weights each model according to conditional stability estimates obtained via ensembling, inspired by the weighting of predictors in online prediction \cite{CesaBianchi2005OnlinePA, Altieri2021Curating} and by the pessimistic reinforcement learning literature \cite{kidambi2020morel, kumar_2019}. We benchmark the offline evaluation methods on simulated environments detailed in Chapter 5. We find that stability derived from ensembling is a promising avenue for adaptively weighting model estimates in settings where model selection and validation are difficult.
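To make the weighting idea concrete, the sketch below combines value estimates from two hypothetical ensembles, one for fitted Q-evaluation and one for model-based evaluation, weighting each method inversely to the spread of its ensemble as a stability proxy. This is an illustrative reconstruction of the idea described above under assumed names and numbers, not the exact estimator developed in Chapter 5.

```python
# Minimal sketch of a stability-weighted combination of offline value estimates.
# Each base estimator (e.g., fitted Q-evaluation, model-based evaluation) is run
# as an ensemble; its spread serves as a stability proxy, and more stable
# (lower-variance) estimators receive higher weight. All values are hypothetical.
import numpy as np

def stability_weighted_estimate(ensembles):
    """Combine per-method ensembles of value estimates.

    ensembles: dict mapping method name -> array of value estimates produced by
    an ensemble (e.g., bootstrap or random-seed replicates) of that method.
    Returns the combined estimate and the per-method weights.
    """
    means = {m: np.mean(v) for m, v in ensembles.items()}
    variances = {m: np.var(v) + 1e-8 for m, v in ensembles.items()}
    inv_var = {m: 1.0 / v for m, v in variances.items()}
    total = sum(inv_var.values())
    weights = {m: w / total for m, w in inv_var.items()}
    combined = sum(weights[m] * means[m] for m in ensembles)
    return combined, weights

# Hypothetical ensembles of value estimates for a single target policy.
fqe_values = np.array([10.2, 10.5, 9.9, 10.3, 10.1])         # fitted Q-evaluation replicates
model_based_values = np.array([8.0, 12.5, 9.7, 11.9, 10.8])  # model-based replicates

value, weights = stability_weighted_estimate(
    {"fqe": fqe_values, "model_based": model_based_values}
)
print("weights:", weights)
print("combined value estimate:", value)
```

Here the tighter fitted Q-evaluation ensemble dominates the combination, illustrating how instability in one estimator can be down-weighted without discarding it entirely.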