eScholarship
Open Access Publications from the University of California

Scalable Analysis of Distributed Workflow Traces

Abstract

Large-scale workflows are becoming increasingly important in both the scientific research and business domains. Science and commerce have both experienced an explosion in the sheer amount of data that must be analyzed. An important tool for analyzing these huge data sets is a compute cluster of hundreds or thousands of machines. However, debugging and tuning clusters requires specialized tools, and current cluster performance tools are oriented more towards tightly coupled parallel applications. We describe how the NetLogger Toolkit methodology is more appropriate for this class of cluster computing, and describe our new automatic workflow anomaly detection component. We also describe how this methodology is being used in the Nearby Supernova Factory (SNfactory) project at Lawrence Berkeley National Laboratory.
