Understanding Dynamics of Travel Behavior with Inverse Reinforcement Learning and Hidden Markov Model
eScholarship
Open Access Publications from the University of California

UC Berkeley

UC Berkeley Electronic Theses and Dissertations

Abstract

We are in an era of rapid urbanization, technological advances, transportation transformation, and ever-growing data and computational power. In just a few years, we have witnessed how shared transportation (Uber, Lyft, Lime, Bird, etc.) has become part of daily life, how online shopping, including same-day grocery delivery, has changed day-to-day travel trajectories, and how the emerging work-from-home lifestyle may fundamentally change people's location choices. At the same time, large-scale data has become more accessible than ever, as has the computational power needed to process it. It is therefore a good time to revisit existing paradigms of dynamic behavior modeling and to explore new opportunities.

While long-term travel behavior, such as residential and work location choice, has been the focus of a substantial body of prior work, most empirical studies adopt a static approach to behavior modeling. The small body of work that allows for dynamic behavior modeling incorporates only backward-looking behavior, i.e., time dependency; the role of forward-looking behavior, i.e., considering future expectations in sequential decision-making, has long been neglected. There are good reasons for this: estimating a truly dynamic choice model is extraordinarily difficult because of (a) the computational burden of big data and of the large-scale dynamic programming needed to accommodate forward-looking behavior and (b) the scarcity of longitudinal data on long-term travel behaviors, such as residential moving trajectories. Yet long-term travel behavior is inherently dynamic, which has led to concerns that estimates from static models may be biased.

In economics, dynamic discrete choice models (DDCM) have been used to model many aspects of transportation behavior; however, this approach has several limitations, including its assumptions of optimal human behavior, conditional independence, and extreme value distribution, among others. Over the past decade, advances in artificial intelligence, especially in inverse reinforcement learning (IRL), have inspired new approaches to solving complex dynamic behavioral problems. In particular, IRL can circumvent several assumptions common in DDCM while still formulating problems and estimating models in a tractable way. However, the research communities of economics and artificial intelligence rarely reference each other; one objective of this dissertation is to bridge these two disciplines to address the challenging problem of modeling large-scale, long-term, forward-looking travel behavior.

The forward-looking assumption is not needed in all situations. In practice, for short-term and medium-term behavior trajectories, such as mode choice, car usage, and car ownership, the backward-looking assumption can be sufficient. This is because these choices carry much lower financial and psychological costs, so the impact of future expectations can be trivial. There is a rich literature on backward-looking dynamic models, including studies that identify policy and environment triggers that shift travel behavior, studies that investigate the role of key life events in travel behavior change, and studies on lifestyle analysis that treat lifestyle transition as a higher-level orientation of behavioral change. However, most frameworks for backward-looking dynamic modeling concentrate on the analysis of a single-dimensional choice and ignore the interdependence and multi-dimensionality of travel behaviors. Furthermore, few prior studies consolidate all these dynamic components in a single framework to analyze the joint effect of different sources of triggers. Therefore, another objective of this dissertation is to develop a unified modeling framework that accounts for time-varying economic and policy context (external dynamics), life events (internal dynamics), lifestyle, and multi-dimensional interrelated choices.

The first component of this dissertation formulates a mathematical framework for representing long-term travel behaviors as sequential actions under the Inverse Reinforcement Learning (IRL) framework, which aims to address the forward-looking limitation. In the proposed framework, the individual observes the environment and takes an action (i.e., moves to a new location or not) by evaluating the action-dependent future rewards received from the environment. The reward can be a function of built-in environment attributes, a concept similar to the utility function in discrete choice models. Highlights of the first component of this work are presented below.
- In the classic IRL setting within the domain of artificial intelligence, the agents (usually robots) are often assumed to behave homogeneously and to have no internal dynamics of their own. Our work extends the IRL framework to accommodate heterogeneous household behavioral dynamics, and derives the corresponding learning algorithm to estimate the parameters associated with the attributes.
- We provide an in-depth theoretical comparison between the Dynamic Discrete Choice Model (DDCM) in economics and IRL in artificial intelligence from different aspects, including terminologies, assumptions, and model structures.
- To validate the existence of forward-looking behavior and the methodological feasibility of the proposed framework, we use a large-scale infused data set of household relocation trajectories in Texas over a 7-year period (2005-2011).
- The empirical results are three-fold. First, all households prefer to locate in areas with a higher degree of land-use mix, higher accessibility to jobs, and lower employment density. Second, low-income households focus more on current needs and are less forward-looking than households with higher income levels, and they show less willingness to pay for neighborhood amenities such as land-use mix and accessibility to jobs. Third, in terms of goodness of fit, our proposed model outperforms the DDCM (for high-income and low-income households), the backward-looking model, and the static model.
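
The contrast between forward-looking and myopic relocation choices described above can be illustrated with a minimal sketch. It is not the dissertation's estimation algorithm: it only solves a toy, two-location problem by soft value iteration, the recursion commonly used in maximum-entropy IRL and in logit-based dynamic discrete choice models. The function name `soft_value_iteration`, the amenity values, the moving cost, and the discount factor `beta` are all hypothetical and chosen purely for illustration.

```python
import numpy as np

def logsumexp_rows(q):
    """Numerically stable log-sum-exp over each row of q."""
    m = q.max(axis=1, keepdims=True)
    return (m + np.log(np.exp(q - m).sum(axis=1, keepdims=True))).ravel()

def soft_value_iteration(amenity, move_cost, beta, tol=1e-10, max_iter=10_000):
    """Return the soft value function and stochastic policy of a toy relocation problem.

    State s is the current location; action a is the location chosen for the
    next period (a == s means "stay"). Transitions are deterministic.
    """
    n = len(amenity)
    # reward[s, a]: amenity of the destination minus a moving cost when a != s
    reward = amenity[None, :] - move_cost * (1.0 - np.eye(n))
    v = np.zeros(n)
    for _ in range(max_iter):
        q = reward + beta * v[None, :]  # next state equals the chosen destination
        v_new = logsumexp_rows(q)
        converged = np.max(np.abs(v_new - v)) < tol
        v = v_new
        if converged:
            break
    q = reward + beta * v[None, :]
    policy = np.exp(q - logsumexp_rows(q)[:, None])  # softmax over actions
    return v, policy

# Hypothetical inputs: location 1 is more attractive, but moving is costly.
amenity = np.array([0.0, 1.0])
_, myopic = soft_value_iteration(amenity, move_cost=2.0, beta=0.0)
_, forward = soft_value_iteration(amenity, move_cost=2.0, beta=0.9)
print("P(move 0 -> 1), myopic:", myopic[0, 1])
print("P(move 0 -> 1), forward-looking:", forward[0, 1])
```

With `beta = 0` the model collapses to a static logit over one-period rewards, so the one-time moving cost dominates. With a positive discount factor, the household also counts the stream of future amenity gains, and the probability of moving to the better location rises. In an IRL or DDCM setting, the reward parameters (and possibly `beta`) would be estimated from observed trajectories rather than fixed by hand as they are here.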

The second component of this work addresses the limitations of backward-looking models. The HMM framework has gained increasing attention in the transportation arena (in applications from car ownership to mode choice) due to its latent hierarchical structure, favorable model performance, and intuitive interpretation. Highlights of this component are as follows.
- We extend the framework of the heterogeneous Hidden Markov Model (HMM) from single-dimensional discrete choice to multi-dimensional discrete and continuous choices, and derive its recursive parameter learning algorithm. Building on this framework, we propose a unified model that combines lifestyle, life events, external environment, and multi-dimensional travel behavior dynamics.
- We evaluate the feasibility and robustness of the proposed methodology via a case study: a retrospective survey of 830 households in the San Francisco Bay Area.
- To fully explore the potential of the proposed framework, we provide a trend analysis of car ownership and mode use based on the estimation results, and conduct a sensitivity analysis of changes in fuel price and unemployment rate.
- We identify four latent lifestyles: an auto-oriented 2-car group with rare use of other travel modes, an auto-oriented 1-car group with rare use of other travel modes, a multimodal group that owns at least one car, and an auto-free group that has the lowest car ownership and car usage. The results demonstrate how life events, policies, and the economic environment influence lifestyle transitions.
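
To make the latent-lifestyle idea concrete, the sketch below runs the standard scaled forward algorithm of a plain, homogeneous HMM on a short sequence of observed car-ownership levels. It is an illustration under simplifying assumptions, not the dissertation's model: the heterogeneous, multi-dimensional extension described above would let transition probabilities depend on life events and external conditions and would emit several discrete and continuous outcomes at once. The two lifestyle states, all probabilities, and the observation sequence are hypothetical.

```python
import numpy as np

def forward_filter(init, trans, emit, obs):
    """Scaled forward algorithm for a discrete-emission HMM.

    init[k]     : probability of starting in latent state k
    trans[j, k] : probability of moving from state j to state k
    emit[k, o]  : probability of observing outcome o in state k
    Returns the log-likelihood of obs and the filtered state probabilities.
    """
    loglik = 0.0
    filtered = []
    alpha = None
    for t, o in enumerate(obs):
        if t == 0:
            alpha = init * emit[:, o]
        else:
            alpha = (alpha @ trans) * emit[:, o]
        scale = alpha.sum()          # P(obs[t] | obs[:t])
        loglik += np.log(scale)
        alpha = alpha / scale        # rescale to avoid numerical underflow
        filtered.append(alpha)
    return loglik, np.array(filtered)

# Hypothetical two-lifestyle example: state 0 = auto-oriented, state 1 = car-light.
init = np.array([0.5, 0.5])
trans = np.array([[0.9, 0.1],
                  [0.2, 0.8]])
# Columns: household owns 0, 1, or 2 cars.
emit = np.array([[0.10, 0.30, 0.60],
                 [0.70, 0.25, 0.05]])
obs = [2, 2, 0, 0]  # cars owned in four consecutive periods

loglik, filtered = forward_filter(init, trans, emit, obs)
print("log-likelihood:", loglik)
print("filtered lifestyle probabilities per period:")
print(filtered)
```

In this toy sequence the filtered probabilities favor the auto-oriented state while the household owns two cars and shift toward the car-light state after two periods without a car. A full estimation procedure would wrap this recursion in a parameter learning step (for example, expectation-maximization), which is omitted here.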

In sum, this dissertation provides building blocks to evolve the field of dynamic behavior modeling by incorporating advances in artificial intelligence. Throughout this dissertation, when providing theoretical improvements building on each mathematical framework, we ground each methodology with case studies. Empirical results have shown our methodologies can effectively help quantify the triggers that prompt individuals and households to change their travel behavior, better predict the trend of future mobility, and help transportation planning and policy-making.
