Open Access Publications from the University of California

## Information Theoretic and Statistical Models for Spatial Transportation Networks: Total Mixing Entropy on Optimal Fluid Flow Networks and Time Dependent Stochastic Block Models

Abstract

This thesis contains two studies of models on organized spatial transport networks. The first introduces a new objective function aimed at understanding the ability of networked organisms such as fungi and slime molds to mix and efficiently disperse nuclei and molecular cues via advective currents. The second develops a novel type of stochastic block model for multilayer networks that model human transportation. It is a statistically based model that aims to classify parts of the network based on the function they serve for commuters. Our specific application is to bicycle-share networks in different urban communities. Although these models are applied to disparate subjects, they are connected from a mathematical point of view and illuminate a central theme in general transport networks in two contrasting lights.

The first part of the thesis is inspired by the work of Murray [Mur26], who hypothesized that the geometries of blood flow networks are optimized to minimize the friction of flows through the network for a given total investment in the material that makes up the network. In the spirit of Murray, we hypothesize that biological networks maximize their performance of objectives that are beneficial to the organism's existence while respecting constraints. We are inspired by experimental observations showing that the filamentous fungus *Neurospora crassa* mixes nuclei and that the slime mold *Physarum polycephalum* mixes signals that it receives from its environment about the distribution of food sources [AAP17]. We use the concept of information entropy to describe how advection currents within the network carry information. We construct a probability space for the signals passing through the network and write down a novel objective function, called the total negative mixing entropy, which represents each node receiving the most mixed collection of signals. Put another way, to maximize its ability to adapt to stimuli, each node receives the most even distribution of signals from the other nodes within the network.
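As a rough illustration of the entropy idea above, the sketch below computes a negative Shannon entropy summed over nodes. It assumes a hypothetical matrix `P` in which row `i` is the probability distribution of signal origins arriving at node `i`; the abstract does not give the thesis's exact definition, so the function name and this form of the objective are assumptions made only for illustration.

```python
import numpy as np

def total_negative_mixing_entropy(P, eps=1e-12):
    """Sum over nodes i and origins j of P[i, j] * log P[i, j].

    ASSUMPTION: P[i, j] is the fraction of the signal at node i that
    originated at node j, so each row sums to 1. This is an illustrative
    form, not necessarily the definition used in the thesis.
    """
    P = np.asarray(P, dtype=float)
    # Use the convention 0 * log 0 = 0; clipping avoids log(0) warnings.
    terms = np.where(P > 0, P * np.log(np.clip(P, eps, None)), 0.0)
    return float(terms.sum())

# Every node receives an even mix of signals from all three nodes.
uniform = np.full((3, 3), 1.0 / 3.0)
# Every node receives only its own signal (no mixing).
unmixed = np.eye(3)

print(total_negative_mixing_entropy(uniform))
print(total_negative_mixing_entropy(unmixed))
```

Under this assumed form, the evenly mixed network attains the lower (more negative) value, which matches the description of an objective that is minimized when each node receives the most even distribution of signals.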

We then define optimal networks to be ones minimizing a cost function that is the sum of the total negative mixing entropy and the fluid dissipation. We impose a constraint that fixes the amount of energy used for network upkeep. Using original optimization techniques, which we describe in this thesis, we numerically calculate optimal networks under different assumptions on the driving force of the flow, the underlying topology, and the total material cost function. From our numerical results we identify key results about the structure of optimal networks, which we are then able to prove rigorously. The proofs involve constructions and computations that illuminate how energy-efficient fluid transport flows are connected to the mixing of signals, and the effects that Murray's law has on features such as whether the networks possess loops.
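The dissipation term and upkeep constraint mentioned above can be sketched on a toy network. This example assumes linear flow (the flow on an edge equals its conductance times the pressure difference across it) and a material cost of the form `sum(length * conductance**gamma)`. Both are common modeling choices for flow networks but are not stated in the abstract, and the names `dissipation`, `material_cost`, and `gamma` are hypothetical.

```python
import numpy as np

def dissipation(edges, conductance, sources):
    """Solve Kirchhoff's law for node pressures and return
    (total dissipation sum_e Q_e**2 / kappa_e, edge flows Q).

    ASSUMPTION: linear flow Q_e = kappa_e * (p_i - p_j) on edge e = (i, j),
    with net inflows `sources` that sum to zero.
    """
    kappa = np.asarray(conductance, dtype=float)
    s = np.asarray(sources, dtype=float)
    n = len(s)
    # Weighted graph Laplacian built from edge conductances.
    L = np.zeros((n, n))
    for (i, j), k in zip(edges, kappa):
        L[i, i] += k
        L[j, j] += k
        L[i, j] -= k
        L[j, i] -= k
    # The Laplacian is singular; the pseudo-inverse selects mean-zero pressures.
    p = np.linalg.pinv(L) @ s
    flows = np.array([k * (p[i] - p[j]) for (i, j), k in zip(edges, kappa)])
    return float(np.sum(flows**2 / kappa)), flows

def material_cost(conductance, lengths, gamma=0.5):
    """ASSUMED upkeep cost: sum of length * conductance**gamma."""
    kappa = np.asarray(conductance, dtype=float)
    return float(np.sum(np.asarray(lengths, dtype=float) * kappa**gamma))

# Toy path network 0 - 1 - 2: unit inflow at node 0, unit outflow at node 2.
edges = [(0, 1), (1, 2)]
kappa = [1.0, 1.0]
total, flows = dissipation(edges, kappa, sources=[1.0, 0.0, -1.0])
print(total, flows, material_cost(kappa, lengths=[1.0, 1.0]))
```

In an optimization of the kind described, one would vary the conductances while holding the material cost fixed and minimize the dissipation together with the entropy term; that outer loop is omitted here.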

In the second part of the thesis, we define two new types of time-dependent stochastic block models for bicycle-sharing networks. The models assume that the network can be modeled by a random process based on partitioning the possible origins and destinations into blocks. The blocks in our models express the roles that the stations play in relation to the entire network, and trips are assumed to be generated by a mixture of time-dependent commute patterns occurring between the blocks. The only parameter of our models that is chosen by the practitioner is the number of blocks, $K$.

The block-to-block commute patterns are represented in a $K\times K\times 24$ array. To account for the direction of flows, the commute pattern for a pair of blocks is not assumed to stay the same when the order of the pair is reversed. The model can be viewed as degree-corrected in that there are multiplicative terms for each station representing its importance within each block. The commute patterns and degree-correction terms are inferred parameters, optimized using gradient descent. We derive both a continuous and a discrete version of this model.
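One way to picture the parameterization described above is the following sketch. It assumes that the expected number of trips from station `i` to station `j` in hour `t` is a sum over block pairs of station weights multiplied by a directed block-to-block pattern, and that counts are Poisson distributed. The abstract does not state the exact rate or likelihood, so these forms and the names `theta`, `omega`, `expected_trips`, and `poisson_log_likelihood` are assumptions.

```python
import numpy as np

def expected_trips(theta, omega):
    """Expected trips lam[i, j, t] = sum_{g,h} theta[i, g] * theta[j, h] * omega[g, h, t].

    theta: (N, K) multiplicative station weights (degree-correction terms).
    omega: (K, K, T) directed block-to-block commute patterns, T = 24 hours.
    ASSUMPTION: this mixture form is illustrative, not the thesis's formula.
    """
    return np.einsum("ig,jh,ght->ijt", theta, theta, omega)

def poisson_log_likelihood(counts, lam, eps=1e-12):
    """Poisson log-likelihood, dropping the constant -log(counts!) term."""
    return float(np.sum(counts * np.log(lam + eps) - lam))

rng = np.random.default_rng(0)
N, K, T = 5, 2, 24
theta = rng.random((N, K))
# omega[g, h] need not equal omega[h, g], so flow direction matters.
omega = rng.random((K, K, T))
lam = expected_trips(theta, omega)
counts = rng.poisson(lam)  # synthetic trip counts for demonstration only
print(lam.shape, poisson_log_likelihood(counts, lam))
```

In a fit of this kind, `theta` and `omega` would be updated by gradient steps on the negative log-likelihood; only the forward computation is shown here.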

We apply our models to the Los Angeles, Manhattan, New York and San Francisco bike-share communities. With two blocks, the results reveal crisp divisions between home and work communities, as defined by the preference of commuters to use bikes. The models also reveal other relevant functional regions, such as parks, leisure commutes, and broad communities representing microcosms that riders mostly do not exit. Increasing the number of blocks reveals more roles, both detecting new functional roles and refining roles found with fewer blocks, such as breaking a geographical community into its home-work commute roles. We touch on how to choose the number of blocks, although we do not reach a definitive result on that question.