In this dissertation, we evaluate the potential of unsolicited Internet traffic, called Internet Background Radiation (IBR), to provide insights into address space usage and network conditions. IBR is primarily collected through darknets, which are blocks of IP addresses dedicated to collecting unsolicited traffic resulting from scans, backscatter, misconfigurations, and bugs. We expect these pervasively sourced components to yield visibility into networks that are hard to measure (e.g., hosts behind firewalls or not appearing in logs) with traditional active and passive techniques. Using the largest collections of IBR available to academic researchers, we test this hypothesis by: (1) identifying the phenomena that induce many hosts to send IBR, (2) characterizing the factors that influence our visibility, including aspects of both the traffic itself and the measurement infrastructure, and (3) extracting insights from 11 diverse case studies, after excluding obvious cases of sender inauthenticity.
Through IBR, we observe traffic from nearly every country, most ASes with routable prefixes, and millions of /24 blocks. Misconfigurations and bugs, often involving P2P networks, result in the widest coverage in terms of visible networks, though scanning traffic is better suited to in-depth and repeated analysis due to its large volume. Notwithstanding the extraordinary popularity of some IP addresses, we obtain similar observations from IBR collected in different darknets, and a predictable degradation when using smaller darknets. Although the mix of IBR components evolves, our observations are consistent over time.
Our case studies highlight the versatility of IBR and help establish guidelines for when researchers should consider using unsolicited traffic for opportunistic network analysis. Based on our experience, IBR may assist in: corroborating inferences made through other datasets (e.g., DHCP lease durations), supplementing current state-of-the-art techniques (e.g., IPv4 address space utilization), exposing weaknesses in other datasets (e.g., missing router interfaces), identifying abused resources (e.g., open resolvers), testing Internet tools by acting as a diverse traffic sample (e.g., uptime heuristics), and reducing the number of required active probes (e.g., path change inferences). In nearly every case study, IBR improves our analysis of an Internet-wide behavior. We expect future studies to reap similar benefits by including IBR.