Lawrence Berkeley National Laboratory
Accelerating Network Traffic Analytics Using Query-Driven Visualization
- Author(s): Bethel, E. Wes
- Campbell, Scott
- Dart, Eli
- Stockinger, Kurt
- Wu, Kesheng
- et al.
Realizing operational analytics solutions where large and complex data must be analyzed in a time-critical fashion entails integrating many different types of technology. This paper focuses on an interdisciplinary combination of scientific data management and visualization/analysis technologies targeted at reducing the time required for data filtering, querying, hypothesis testing and knowledge discovery in the domain of network connection data analysis. We show that use of compressed bitmap indexing can quickly answer queries in an interactive visual data analysis application, and compare its performance with two alternatives for serial and parallel filtering/querying on 2.5 billion records' worth of network connection data collected over a period of 42 weeks. Our approach to visual network connection data exploration centers on two primary factors: interactive ad-hoc and multiresolution query formulation and execution over n dimensions and visual display of then-dimensional histogram results. This combination is applied in a case study to detect a distributed network scan and to then identify the set of remote hosts participating in the attack. Our approach is sufficiently general to be applied to a diverse set of data understanding problems as well as used in conjunction with a diverse set of analysis and visualization tools.