Improving Sequential Decision Making in Human-In-The-Loop Systems
Autonomous systems inevitably interact with humans. These interactions can be as simple as a person pushing a button to trigger a specific function, or as complicated as an autonomous vehicle negotiating traffic with human drivers. Safe and efficient interaction is therefore crucial for advancing autonomous systems, especially those that interact with humans persistently.
One common type of such system is the human-assistance system, such as warning systems in aircraft and automatic braking systems in automobiles. Traditionally, these systems monitor only the state of the machine to prevent human errors and enhance safety; their decision-making processes do not take into account the state of the human, arguably the greatest source of variability affecting safety. In light of this drawback, we believe that more desirable autonomous systems should take the human state into account in their decision-making processes. In other words, in addition to task completion, the exploration, estimation, or even control of the human state should be part of the decision-making loop in such human-in-the-loop systems. Moreover, to estimate the state of the human, most autonomous systems only passively gather information from their sensors. They ignore the fact that the actions of the autonomous system can themselves help it understand and estimate the human state, and that a better estimate of the human state in turn helps the system achieve its goal.
In this thesis, we develop frameworks and computational tools that enable human-in-the-loop systems to interact safely and efficiently. Beginning with a general form of the interactive model, a partially observable discrete-time stochastic hybrid system, we describe how its discrete form, the partially observable Markov decision process, can be used to integrate the human model, the machine dynamical model, and their interaction in a probabilistic framework. We then extend the discrete version to hidden mode stochastic hybrid systems, which handle continuous states together with discrete hidden modes that model hidden human intents. We tackle the computational challenge of the optimal control problem in hidden mode stochastic hybrid systems and show a significant improvement in computation time. A driver-assistance application shows the efficacy of our proposed method. Finally, we propose to incorporate safety constraints through a novel framework based on model predictive control, which encourages exploration of the hidden human intent while achieving the system's goal under hard safety constraints. Taken together, these contributions advance the computational framework for next-generation human-in-the-loop systems, which are capable of monitoring both the human and the machine states, actively exploring the human intent, and giving appropriate feedback to both in order to enhance safety and efficiency.
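As a minimal sketch of the idea that the machine's own action can sharpen its estimate of the human state, the following shows one standard Bayes-filter belief update for a discrete partially observable Markov decision process. It is not the model developed in the thesis: the two human states (attentive, distracted), the two actions (stay silent, issue a warning), the two observations, and every probability are hypothetical values chosen only for illustration.

```python
import numpy as np

def belief_update(belief, action, observation, T, O):
    """One Bayes-filter step for a discrete POMDP.

    T[a][s, s2] = P(s2 | s, a)   transition model
    O[a][s2, z] = P(z | s2, a)   observation model
    """
    predicted = belief @ T[action]                     # predict next hidden state
    unnormalized = predicted * O[action][:, observation]  # weight by likelihood
    total = unnormalized.sum()
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return unnormalized / total

# Hypothetical human states: 0 = attentive, 1 = distracted.
# Hypothetical actions: 0 = stay silent, 1 = issue a warning.
T = np.array([
    [[0.90, 0.10], [0.20, 0.80]],   # silent
    [[0.95, 0.05], [0.60, 0.40]],   # warning tends to restore attention
])
# Hypothetical observations: 0 = driver reacts, 1 = no reaction.
O = np.array([
    [[0.5, 0.5], [0.5, 0.5]],       # silent: observation is uninformative
    [[0.9, 0.1], [0.2, 0.8]],       # warning: the reaction reveals the state
])

belief = np.array([0.5, 0.5])
b_silent = belief_update(belief, 0, 1, T, O)  # passive: only the dynamics move the belief
b_warn = belief_update(belief, 1, 1, T, O)    # active: "no reaction" is strong evidence
print(b_silent, b_warn)
```

Under these assumed numbers, staying silent leaves the belief driven by the transition model alone, whereas observing no reaction to a warning shifts the belief clearly toward the distracted state. A planner that values this information gain would be exploring the human state actively rather than passively.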