Rapid urbanization and growth in urban populations have forced community-scale infrastructures (e.g., water, power and natural gas distribution systems, and transportation networks) to operate at their limits. Aging (and failing) infrastructures around the world are becoming increasingly vulnerable to operational degradation, extreme weather, natural disasters and cyber attacks/failures. These trends have wide-ranging socioeconomic consequences and raise public safety concerns. In this thesis, we introduce the notion of cyber-physical-human infrastructures (CPHIs) - smart community-scale infrastructures that bridge technologies with physical infrastructures and people. CPHIs are highly dynamic stochastic systems characterized by complex physical models that exhibit regionwide variability and uncertainty under disruptions. Failures in these distributed settings tend to be difficult to predict and estimate, and expensive to repair. Real-time fault identification is crucial to ensure continuity of lifeline services to customers at adequate levels of quality. Emerging smart community technologies have the potential to transform our failing infrastructures into robust and resilient future CPHIs.
In this thesis, we explore one such CPHI - community water infrastructures. Current urban water infrastructures, that are decades (sometimes over a 100 years) old, encompass diverse geophysical regimes. Water stress concerns include the scarcity of supply and an increase in demand due to urbanization. Deterioration and damage to the infrastructure can disrupt water service; contamination events can result in economic and public health consequences. Unfortunately, little investment has gone into modernizing this key lifeline.
To enhance the resilience of water systems, we propose an integrated middleware framework for quick and accurate identification of failures in complex water networks that exhibit uncertain behavior. Our proposed approach integrates IoT-based sensing, domain-specific models and simulations with machine learning methods to identify failures (pipe breaks, contamination events). The composition of techniques results in cost-accuracy-latency tradeoffs in fault identification, inherent in CPHIs due to the constraints imposed by cyber components, physical mechanics and human operators. Three key resilience problems are addressed in this thesis; isolation of multiple faults under a small number of failures, state estimation of the water systems under extreme events such as earthquakes, and contaminant source identification in water networks using human-in-the-loop based sensing. By working with real world water agencies (WSSC, DC and LADWP, LA), we first develop an understanding of operations of water CPHI systems. We design and implement a sensor-simulation-data integration framework AquaSCALE, and apply it to localize multiple concurrent pipe failures. We use a mixture of infrastructure measurements (i.e., historical and live water pressure/flow), environmental data (i.e., weather) and human inputs (i.e., twitter feeds), combined and enhanced with the domain model and supervised learning techniques to locate multiple failures at fine levels of granularity (individual pipeline level) with detection time reduced by orders of magnitude (from hours/days to minutes). We next consider the resilience of water infrastructures under extreme events (i.e., earthquakes) - the challenge here is the lack of apriori knowledge and the increased number and severity of damages to infrastructures. We present a graphical model based approach for efficient online state estimation, where the offline graph factorization partitions a given network into disjoint subgraphs, and the belief propagation based inference is executed on-the-fly in a distributed manner on those subgraphs. Our proposed approach can isolate 80% broken pipes and 99% loss-of-service to end-users during an earthquake.
Finally, we address issues of water quality - today this is a human-in-the-loop process where operators need to gather water samples for lab tests. We incorporate the necessary abstractions with event processing methods into a workflow, which iteratively selects and refines the set of potential failure points via human-driven grab sampling. Our approach utilizes Hidden Markov Model based representations for event inference, along with reinforcement learning methods for further refining event locations and reducing the cost of human efforts.
The proposed techniques are integrated into a middleware architecture, which enables components to communicate/collaborate with one another. We validate our approaches through a prototype implementation with multiple real-world water networks, supply-demand patterns from water utilities and policies set by the U.S. EPA. While our focus here is on water infrastructures in a community, the developed end-to-end solution is applicable to other infrastructures and community services which operate in disruptive and resource-constrained environments.