From Simulation to the Real World: Deep Reinforcement Learning for Training Robust, Wave-Smoothing Policies for Autonomous Vehicles
- Jang, Kathy
- Advisor(s): Bayen, Alexandre
Abstract
With the advent of autonomous vehicles (AVs) comes a broad array of possibilities for control. Looking beyond the immediate wave of research focused on training models that can drive safely, this work looks further ahead and aims to develop models for AVs that achieve more than safe driving. Traffic dynamics are notoriously difficult to model and capture at the micro-level, with behaviors ranging from the human-observable to ones we are not even aware of unfolding every second. Reinforcement learning (RL) is a method that is effective at capturing structure from highly complex, high-volume behavioral data. In this work, we use RL and leverage its ability to capture complex human-driver and traffic dynamics to develop policies that not only drive, but drive in a way that smooths traffic.
With the goal of taking these traffic-smoothing algorithms to the real world, this work proceeds in three parts, from experiments conducted purely in simulation to an eventual 100-AV road test. We first explore RL as a means of wave-smoothing control, examining experiments across a variety of traffic scenarios that demonstrate its effectiveness. This portion takes place purely in simulation and explores various components of RL design, from environment design to reward shaping. With deployment always in mind, we then study how to develop RL policies that are robust enough to survive the transfer from simulation to the real world while sacrificing minimal performance. Lastly, these threads come together in the development and deployment of the MegaVanderTest, a deployment of 100 RL-enabled AVs and, to our knowledge, the largest test of AVs designed to smooth traffic.