Open Access Publications from the University of California

UC Berkeley

UC Berkeley Electronic Theses and Dissertations

A Deep Learning, Model-Predictive Approach to Neighborhood Congestion Prediction and Control


Traffic congestion is a major concern, especially in large cities. To relieve a city of congestion impacts, transportation authorities typically base controls such as toll or congestion prices on expected congestion patterns. However, such strategies can be sub-optimal and may also lead to unintended negative consequences for two main reasons: (i) estimates of congestion state may be inaccurate if the proposed model inputs are too limiting or the assumptions are too restrictive and (ii) predictions may be too late to help. This creates the need for accurate predictions of traffic congestion well ahead of time to avoid delays and gridlocks.

Traffic congestion is analyzed as a network-wide phenomenon. Large-scale spatial correlation and long-term temporal correlation govern traffic congestion propagation across the regional traffic network. Such correlations may be exploited to develop congestion prediction algorithms that are more effective than purely local predictions for the purpose of dynamic controls. Moreover, real-time information that is generated at locations across the network may signal the future congestion state, making it possible to take control measures in advance of worsening traffic situations. However, this approach also presents several new challenges, addressed in this research.

Microscopic models of congestion do not produce realistic representations of network-wide congestion generation and its propagation. This motivates the analysis of congestion at the neighborhood scale through macroscopic analysis. Recent literature has shown that Macroscopic Fundamental Diagrams (MFDs) are effective for developing neighborhood-wide congestion controls by controlling inflows from immediately surrounding areas or managing signal timings, in combination with flow conservation laws. To further enhance the use of MFDs for neighborhood-wide congestion management, traffic prediction over much larger geographic scales is incorporated. To that end, a numerically well-behaved score function, called Macroscopic Congestion Level (MCL), is proposed. The score is defined as the ratio of the neighborhood’s vehicle accumulation to its trip completion rate. Future values of this score are then predicted by a model that takes network state characteristics over the larger, region-wide network as input. These characteristics are represented as a vector of Origin-Destination (O-D) demands, link accumulations, link travel times and observed MCL values. The predicted score can be used to describe the likely congestion state in a neighborhood in the near future.
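As an illustration of the MCL definition, the ratio can be computed per time interval as in the minimal Python sketch below. The function name, the sample values, the units and the small `eps` guard against a zero completion rate are assumptions made for this example and are not taken from the dissertation.

```python
def macroscopic_congestion_level(accumulation, completion_rate, eps=1e-9):
    """MCL for one interval: vehicle accumulation / trip completion rate.

    accumulation: vehicles inside the neighborhood (veh)
    completion_rate: trips completed per unit time (veh/min in this example)
    eps: hypothetical guard against division by zero (an assumption)
    """
    return accumulation / max(completion_rate, eps)


# Hypothetical observations for three consecutive intervals.
accumulations = [120.0, 300.0, 450.0]    # vehicles
completion_rates = [40.0, 50.0, 30.0]    # vehicles per minute

mcl_series = [
    macroscopic_congestion_level(n, g)
    for n, g in zip(accumulations, completion_rates)
]
print(mcl_series)  # [3.0, 6.0, 15.0]
```

With these units the score is expressed in minutes, and it rises both when accumulation grows and when the completion rate drops, as in the third interval.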

It is challenging to develop congestion prediction algorithms that incorporate both spatial and temporal dependence at a network scale, adapt to sudden changes in demand patterns and forecast accurately over sufficiently long periods to implement controls. Deep learning is used to build sufficiently complex models for this task. Predictions are made using a deep learning model based on Long Short-Term Memory (LSTM) neural network architecture. The ideas are tested using simulation on a simple, hypothetical city-street network. A battery of simulation tests suggests that the model inputs are sensitive to different queue propagation scenarios, within and across days. MCL predictions made by the proposed deep learning model outperform a simple, yet competent, baseline model that assumes congestion on the next day mimics congestion today (referred to as the 1-Nearest Neighbor or 1-NN model).
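The 1-NN baseline described above can be sketched in a few lines of Python. This is an illustration only: the daily MCL profiles are made-up numbers, and mean absolute error is an assumed accuracy measure, since the abstract does not state which metric is used.

```python
def one_nn_forecast(mcl_by_day):
    """Forecast each day's MCL profile as a copy of the previous day's profile."""
    return [list(previous) for previous in mcl_by_day[:-1]]


def mean_absolute_error(predicted, observed):
    """Average absolute gap over all days and time intervals."""
    pairs = [
        (p, o)
        for pred_day, obs_day in zip(predicted, observed)
        for p, o in zip(pred_day, obs_day)
    ]
    return sum(abs(p - o) for p, o in pairs) / len(pairs)


# Hypothetical MCL profiles: three days, three intervals per day.
mcl_by_day = [
    [2.0, 5.0, 3.0],
    [2.5, 6.0, 3.0],
    [2.0, 4.0, 3.5],
]

forecasts = one_nn_forecast(mcl_by_day)            # forecasts for days 2 and 3
mae = mean_absolute_error(forecasts, mcl_by_day[1:])
print(mae)  # 0.75
```

A baseline of this kind is hard to beat when demand repeats from day to day, which is why correlated O-D demands across days favor it.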

The prediction accuracy of the deep learning model is compared to that of the baseline 1-NN model in three situations that either aid the baseline model or adversely impact the deep learning model: (i) correlated O-D demands across multiple days, (ii) noisy observations, and (iii) partially observed network state. The limitations to the prediction accuracy in these situations are studied in simulated scenarios. Methods are suggested to improve accuracy in these situations either by modifying certain model hyper-parameters or by understanding the importance of various inputs.

A Neural Attention Model-based framework is developed to extract the importance of various inputs and to better understand the inner workings of the deep learning model. Simulation experiments suggest that such a framework allows identification of major congestion-causing factors that may be targeted during control.
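The general idea of reading input importance from attention weights can be sketched as follows. This is a generic softmax-attention illustration, not the dissertation's framework: the input names, feature values and scores are hypothetical, and in a trained model the scores would be produced by learned parameters rather than fixed by hand.

```python
import math


def softmax(scores):
    """Numerically stable softmax: non-negative weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(features, scores):
    """Return attention weights and the weighted sum of the input features."""
    weights = softmax(scores)
    context = sum(w * x for w, x in zip(weights, features))
    return weights, context


# Hypothetical scalar summaries of four input groups and their attention scores.
input_names = ["od_demand", "link_accumulation", "link_travel_time", "observed_mcl"]
features = [0.8, 1.5, 0.4, 2.0]
scores = [0.1, 2.0, -1.0, 1.0]   # assumed values; learned in a real model

weights, context = attend(features, scores)
ranked = sorted(zip(input_names, weights), key=lambda kv: kv[1], reverse=True)
for name, weight in ranked:
    print(f"{name}: {weight:.3f}")
```

Inputs with larger weights contribute more to the prediction, which is the sense in which such weights can point to candidate congestion-causing factors.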

Model predictions and the importance of various inputs are then used to test a dynamic control strategy, namely an app-based dynamic congestion toll at the beginning of a trip. Individuals are charged a toll at the beginning of each trip based on their predicted congestion impact. This dissuades them from traveling during the times and along the routes where they are likely to cause major harm to the performance of the network. An optimization problem is formulated to minimize cumulative MCL across a day in a target neighborhood while maintaining overall travel demand. The conditions for optimal tolls are obtained, and approximations for these conditions are proposed using the learned parameters of the deep learning model. Simulations indicate that a deep learning model-based dynamic toll reduces delays and charges lower tolls than other tolling strategies that depend on predictions of demand.

Possible improvements to the architecture of the deep learning model are discussed for congestion prediction in large networks. Signals in large networks are assumed to propagate through hypothetical graphs, such as graphs representing the road network or graphs representing the similarities in route-choices made by various individuals. The inputs are transformed to impose a graphical structure on them. A Graph Convolutional Neural Network (Graph-CNN) architecture is implemented for extracting relevant spatial features from this graphical input. An LSTM model makes real-time predictions based on these extracted features. Simulations of commuter trips on a pared-down freeway/highway network as well as a full-scale network representing the San Francisco Bay Area suggest that the model accuracy is superior to that of the 1-NN model, the LSTM-only model, and a Graph-CNN + LSTM model without any road network or route-choice information.
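To make the spatial feature extraction step concrete, one graph-convolution layer can be sketched as neighbor averaging followed by a linear map and a ReLU. The row-normalized adjacency with self-loops is an assumed, commonly used formulation; the abstract does not specify the variant used in the dissertation. The three-link graph, feature values and weight matrix are hypothetical.

```python
def normalized_adjacency(adj):
    """Add self-loops, then row-normalize so that each row sums to 1."""
    n = len(adj)
    with_loops = [
        [adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    return [[value / sum(row) for value in row] for row in with_loops]


def graph_conv(adj, x, weight):
    """One layer: average features over each node's neighborhood, apply weights, ReLU."""
    a_hat = normalized_adjacency(adj)
    n, n_in, n_out = len(adj), len(weight), len(weight[0])
    aggregated = [
        [sum(a_hat[i][k] * x[k][f] for k in range(n)) for f in range(n_in)]
        for i in range(n)
    ]
    return [
        [max(0.0, sum(aggregated[i][f] * weight[f][h] for f in range(n_in)))
         for h in range(n_out)]
        for i in range(n)
    ]


# Hypothetical road graph: three links in a row (0 - 1 - 2).
adjacency = [
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 0.0],
]
link_features = [[3.0], [0.0], [6.0]]   # e.g. one accumulation value per link
weight = [[1.0]]                        # assumed; learned in a real model

spatial_features = graph_conv(adjacency, link_features, weight)
print(spatial_features)  # approximately [[1.5], [3.0], [3.0]]
```

In the architecture described above, features of this kind would be computed at each time step and passed to the LSTM, which handles the temporal dependence.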
