UC San Diego
Automating cross-layer diagnosis of enterprise 802.11 wireless networks
- Author(s): Cheng, Yu-Chung
- et al.
The combination of unlicensed spectrum, cheap wireless interfaces and the inherent convenience of untethered computing has made 802.11-based networks ubiquitous in the enterprise. Modern universities, corporate campuses and government offices routinely deploy scores of access points to blanket their sites with wireless Internet access. However, while the fine-grained behavior of the 802.11 protocol itself has been well studied, our understanding of how large 802.11 networks behave in their full empirical complexity is surprisingly limited. Enterprise networks are of sufficient complexity that even simple faults can be difficult to diagnose, let alone transient outages or service degradations. Nowhere is this problem more apparent than in the 802.11-based wireless access networks now ubiquitous in the enterprise. In addition to the myriad complexities of the wired network, wireless networks face the additional challenges of shared spectrum, user mobility and authentication management. Not surprisingly, few organizations have the expertise, data or tools to decompose the underlying problems and interactions responsible for transient outages or performance degradations. In this dissertation, we describe the design and implementation of an automated 802.11 wireless network diagnostic system called Shaman. First, we have deployed an infrastructure of over 190 radio monitors that simultaneously capture all 802.11b and 802.11g activity in a UC San Diego Computer Science & Engineering building (1M+ cubic feet). We address the challenges posed by both the scale and ambiguity inherent in such an architecture, and explain the algorithms and inference techniques used to merge the distinct per-monitor traces into a single globally synchronized trace. Since the end-to-end performance of user traffic depends on a combination of factors across all network layers, Shaman incorporates comprehensive, cross-layer models of 802.11 network behavior and performance.
These models include broadband interference at the physical layer, per-packet link layer media access delays and losses, network layer device mobility and association management, and transport layer congestion and flow control. When users experience unsatisfactory performance at a particular time, they can query Shaman for a diagnosis. Shaman then profiles the user's traffic at that time, determines the network events that shaped the performance profile, infers the causal sources of those events, and reports the results to the user. Finally, we demonstrate the use of Shaman on the building-wide 802.11 network of the UCSD CSE department and illustrate the underlying analysis Shaman performs on real network trouble reports submitted by users of the network. We show that no single anomaly, failure, or interaction is solely responsible for all network problems, and that a holistic analysis is necessary to cover the range of problems experienced in real networks.
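The abstract mentions merging per-monitor traces into a single globally synchronized trace but does not give the algorithm. The following is only a minimal illustrative sketch of the general idea, not the dissertation's method. It assumes that each monitor's clock offset is already known, that a frame can be identified by a content key such as a hash, and that observations of the same key within a small time window are the same transmission. All names (`Observation`, `UnifiedFrame`, `merge_traces`, `window_us`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Observation:
    monitor: str        # hypothetical monitor identifier
    local_time_us: int  # timestamp from that monitor's own clock
    frame_key: str      # stand-in for frame contents (e.g. a hash)


@dataclass
class UnifiedFrame:
    time_us: int         # global time of the first observation
    frame_key: str
    monitors: List[str]  # every monitor that heard this transmission


def merge_traces(
    traces: Dict[str, List[Observation]],
    offsets_us: Dict[str, int],
    window_us: int = 100,
) -> List[UnifiedFrame]:
    """Merge per-monitor traces into one time-ordered trace.

    Assumes offsets_us[m] is how far monitor m's clock runs ahead of
    the global clock; estimating those offsets is outside this sketch.
    """
    # Shift every observation onto the shared global timeline.
    shifted = sorted(
        (
            (obs.local_time_us - offsets_us[obs.monitor], obs)
            for trace in traces.values()
            for obs in trace
        ),
        key=lambda pair: pair[0],
    )

    unified: List[UnifiedFrame] = []
    latest: Dict[str, UnifiedFrame] = {}  # most recent unified frame per key
    for t, obs in shifted:
        prev = latest.get(obs.frame_key)
        if prev is not None and t - prev.time_us <= window_us:
            # Same transmission heard by another monitor: fold it in.
            prev.monitors.append(obs.monitor)
        else:
            frame = UnifiedFrame(t, obs.frame_key, [obs.monitor])
            unified.append(frame)
            latest[obs.frame_key] = frame
    return unified
```

In a real deployment the offsets themselves drift and must be inferred, for example from frames heard by several monitors at once; this sketch takes them as given to keep the merging step visible.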