<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:atom="http://www.w3.org/2005/Atom" version="2.0">
  <channel>
    <docs>http://www.rssboard.org/rss-specification</docs>
    <atom:link rel="self" type="application/rss+xml" href="https://escholarship.org/uc/lbnl_cs_sn/rss"/>
    <ttl>720</ttl>
    <title>Recent lbnl_cs_sn items</title>
    <link>https://escholarship.org/uc/lbnl_cs_sn/rss</link>
    <description>Recent eScholarship items from Scientific Networking</description>
    <pubDate>Tue, 30 Jun 2026 01:09:04 +0000</pubDate>
    <item>
      <title>EPOC Deep Dive Retrospective: A Brief Overview of 7 years of Science Engagement Discussions</title>
      <link>https://escholarship.org/uc/item/998093wg</link>
      <description>Understanding the appropriate ways cyberinfrastructure can be designed, implemented, and executed for scientific use cases requires a deep understanding of the way that researchers and educators interact with technology, and how it may be best implemented to suit their needs. The Engagement and Performance Operations Center (EPOC) has conducted a series of scientific “Deep Dives” of use cases at partner institutions to better understand the requirements for modern scientific innovation across the United States research complex. The results of these activities have revealed gaps in the way that technology has been used to foster research activities. This gap in cyberinfrastructure support has impacts for the overall productivity and innovation possibilities for scientific users.</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/998093wg</guid>
      <pubDate>Thu, 4 Jun 2026 00:00:00 +0000</pubDate>
      <author>
        <name>Zurawski, Jason</name>
        <uri>https://orcid.org/0000-0001-8389-4705</uri>
      </author>
      <author>
        <name>Robb, George</name>
      </author>
      <author>
        <name>Eichelberger, Corey</name>
      </author>
      <author>
        <name>Mendoza, Nathaniel</name>
      </author>
      <author>
        <name>Schopf, Jennifer</name>
      </author>
      <author>
        <name>Southworth, Doug</name>
      </author>
    </item>
    <item>
      <title>A Brief Survey of Data Streaming Technologies</title>
      <link>https://escholarship.org/uc/item/5jh9253q</link>
      <description>Streaming data is data that is emitted at variable volumes in a continuous, incremental manner with the goal of low-latency processing, often at a different physical location. Network infrastructure is used to facilitate the connection between data sources and sinks, and must be robust enough to handle the requirements of the workflow. The U.S. Department of Energy Office of Science (DOE SC) is a federal agency supporting fundamental scientific research for energy and the Nation’s largest supporter of basic research in the physical sciences. DOE SC has the responsibility for operating 10 National Laboratories and 28 scientific user facilities supporting advanced supercomputers, particle accelerators, large x-ray light sources, neutron scattering sources, and other specialized facilities for nanoscience and genomics. This paper investigates the state of streaming data workflows and details some of the approaches to this challenging problem.</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/5jh9253q</guid>
      <pubDate>Thu, 4 Jun 2026 00:00:00 +0000</pubDate>
      <author>
        <name>Kissel, Ezra</name>
      </author>
      <author>
        <name>Zurawski, Jason</name>
        <uri>https://orcid.org/0000-0001-8389-4705</uri>
      </author>
    </item>
    <item>
      <title>A Two-Level Control Framework for Quantum Networks</title>
      <link>https://escholarship.org/uc/item/5wx7q92m</link>
      <description>Quantum network control is a major research area of the QUANT-NET project. We strive to build a quantum network control plane to orchestrate and manage all the physical-layer technologies, and to explore what a quantum network control plane should look like in the future, so as to automate high-rate and high-fidelity entanglement generation, distribution, and storage in an efficient, reliable, and cost-effective way. To these ends, we have designed a two-level control framework for quantum networks. Within such a framework, a two-level scheduler has been implemented to support synchronous time slot scheduling, network-wide non-real-time control, and node-wide real-time control. This two-level control framework and the scheduler are being deployed and evaluated in the QUANT-NET testbed. Enabled by this two-level control framework and the scheduler, several basic quantum network operations have been automated in the testbed, which include automated quantum node calibration and on-demand...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/5wx7q92m</guid>
      <pubDate>Wed, 25 Feb 2026 00:00:00 +0000</pubDate>
      <author>
        <name>Yu, Se-young</name>
      </author>
      <author>
        <name>Perego, Elia</name>
      </author>
      <author>
        <name>Phillips, Justin</name>
      </author>
      <author>
        <name>Cheah, You-Wei</name>
      </author>
      <author>
        <name>Umesh, Prathwiraj</name>
      </author>
      <author>
        <name>Gao, Guangqi</name>
      </author>
      <author>
        <name>Liu, Jiarui</name>
      </author>
      <author>
        <name>Kissel, Ezra</name>
      </author>
      <author>
        <name>Bregar, Michael</name>
      </author>
      <author>
        <name>Sun, Ke</name>
      </author>
      <author>
        <name>Wu, Qiming</name>
      </author>
      <author>
        <name>Valivarthi, Raju</name>
      </author>
      <author>
        <name>Saglamyurek, Erhan</name>
      </author>
      <author>
        <name>Wu, Wenji</name>
      </author>
      <author>
        <name>Spiropulu, Maria</name>
      </author>
      <author>
        <name>Häffner, Hartmut</name>
      </author>
      <author>
        <name>Monga, Inder</name>
        <uri>https://orcid.org/0000-0003-4524-0457</uri>
      </author>
    </item>
    <item>
      <title>An extensible control plane software architecture for quantum networking research</title>
      <link>https://escholarship.org/uc/item/1203r78b</link>
      <description>As quantum networking experiments move from laboratory experiments to larger-scale deployments, integrated control software becomes essential for managing complex interactions between the numerous distributed resources involved. While a number of laboratory-scale control systems have been developed for specific quantum platform demonstrations, an openly available and general solution for operating quantum networks has not emerged. With the QUANT-NET Control Plane (QNCP), we introduce a model-based, extensible control plane implementation that offers a framework for enabling network-wide orchestration in quantum information network environments. QNCP provides a quantum network data model, resource management, communication primitives, and a plugin interface for defining orchestration and protocol interactions across distributed quantum network devices and services. This paper describes the design and architecture of QNCP, its implementation, and opportunities for extensibility...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/1203r78b</guid>
      <pubDate>Wed, 25 Feb 2026 00:00:00 +0000</pubDate>
      <author>
        <name>Yu, Se-young</name>
      </author>
      <author>
        <name>Zhang, Liang</name>
      </author>
      <author>
        <name>Kissel, Ezra</name>
        <uri>https://orcid.org/0000-0003-3972-9651</uri>
      </author>
      <author>
        <name>Wu, Wenji</name>
      </author>
      <author>
        <name>Monga, Inder</name>
        <uri>https://orcid.org/0000-0003-4524-0457</uri>
      </author>
    </item>
    <item>
      <title>Comparing Cache Utilization Trends for Regional Data Caches</title>
      <link>https://escholarship.org/uc/item/5393w8g5</link>
      <description>The rapid growth of data volumes from large scientific collaborations, such as the Large Hadron Collider (LHC), presents significant challenges for the High Energy Physics (HEP) community. With annual data volumes projected to increase by a factor of thirty by 2028, efficient data management has become a critical concern. The HEP community’s reliance on wide-area networks for global data distribution often results in redundant long-distance transfers, leading to network congestion and degraded application performance. This study investigates the effectiveness of regional data caches in mitigating network congestion and enhancing application performance, using a large-scale dataset of millions of access records from regional caches in Southern California, Chicago, and Boston, which serve the LHC’s CMS experiment. Our analysis reveals the substantial potential of in-network caching to transform large-scale scientific data dissemination, enabling faster and more efficient data access...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/5393w8g5</guid>
      <pubDate>Tue, 2 Dec 2025 00:00:00 +0000</pubDate>
      <author>
        <name>Sim, Alex</name>
        <uri>https://orcid.org/0000-0002-6295-1982</uri>
      </author>
      <author>
        <name>Wang, Erica</name>
      </author>
      <author>
        <name>Monga, Ronak</name>
      </author>
      <author>
        <name>Wu, Kesheng</name>
      </author>
      <author>
        <name>Balcas, Justas</name>
      </author>
      <author>
        <name>White, Brendan</name>
      </author>
      <author>
        <name>Guok, Chin</name>
        <uri>https://orcid.org/0000-0003-4532-1222</uri>
      </author>
      <author>
        <name>Monga, Inder</name>
        <uri>https://orcid.org/0000-0003-4524-0457</uri>
      </author>
      <author>
        <name>Davila, Diego</name>
      </author>
      <author>
        <name>Würthwein, Frank</name>
      </author>
      <author>
        <name>Newman, Harvey</name>
      </author>
    </item>
    <item>
      <title>Superfacility: The Convergence of Data, Compute, Networking, Analytics and Software</title>
      <link>https://escholarship.org/uc/item/9x1858hh</link>
      <description>Superfacility: The Convergence of Data, Compute, Networking, Analytics and Software</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/9x1858hh</guid>
      <pubDate>Mon, 22 Sep 2025 00:00:00 +0000</pubDate>
      <author>
        <name>Antypas, Katie</name>
      </author>
      <author>
        <name>Canon, Shane</name>
      </author>
      <author>
        <name>Dart, Eli</name>
        <uri>https://orcid.org/0000-0002-8229-5433</uri>
      </author>
      <author>
        <name>Fagnan, Kjiersten</name>
      </author>
      <author>
        <name>Gerhardt, Lisa</name>
        <uri>https://orcid.org/0000-0003-0166-5162</uri>
      </author>
      <author>
        <name>Jacobsen, Doug</name>
      </author>
      <author>
        <name>Lockwood, Glenn K</name>
        <uri>https://orcid.org/0000-0002-9241-9372</uri>
      </author>
      <author>
        <name>Monga, Inder</name>
        <uri>https://orcid.org/0000-0003-4524-0457</uri>
      </author>
      <author>
        <name>Nugent, Peter</name>
        <uri>https://orcid.org/0000-0002-3389-0586</uri>
      </author>
      <author>
        <name>Ramakrishnan, Lavanya</name>
      </author>
      <author>
        <name>Snavely, Cory</name>
        <uri>https://orcid.org/0000-0003-2021-4746</uri>
      </author>
      <author>
        <name>Parkinson, Dilworth</name>
      </author>
      <author>
        <name>Hexemer, Alexander</name>
        <uri>https://orcid.org/0000-0002-5269-0125</uri>
      </author>
      <author>
        <name>Tull, Craig</name>
      </author>
    </item>
    <item>
      <title>Implementing the Palomar Transient Factory Real-Time Detection Pipeline in GLADE: Results and Observations</title>
      <link>https://escholarship.org/uc/item/7mw3f186</link>
      <description>Palomar Transient Factory is a comprehensive detection system for the identification and classification of transient astrophysical objects. The central piece in the identification pipeline is represented by an automated classifier that distinguishes between real and bogus objects with high accuracy. Given that the classifier has to identify the most significant transients out of a large number of candidates in near real-time, the response time it provides is of critical importance. In this paper, we present an experimental study that evaluates a novel implementation of the classifier in GLADE—a parallel data processing system that combines the efficiency of a database with the extensibility of Map-Reduce. We show how each stage in the classifier – candidate identification, pruning, and contextual realbogus – maps optimally into GLADE tasks by taking advantage of the unique features of the system—range-based data partitioning, columnar storage, multi-query execution, and in-database...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/7mw3f186</guid>
      <pubDate>Mon, 22 Sep 2025 00:00:00 +0000</pubDate>
      <author>
        <name>Rusu, Florin</name>
      </author>
      <author>
        <name>Nugent, Peter</name>
        <uri>https://orcid.org/0000-0002-3389-0586</uri>
      </author>
      <author>
        <name>Wu, Kesheng</name>
      </author>
    </item>
    <item>
      <title>Distributed caching for processing raw arrays</title>
      <link>https://escholarship.org/uc/item/5x19v952</link>
      <description>As applications continue to generate multi-dimensional data at exponentially increasing rates, fast analytics to extract meaningful results is becoming extremely important. The database community has developed array databases that alleviate this problem through a series of techniques. In-situ mechanisms provide direct access to raw data in the original format, without loading and partitioning. Parallel processing scales to the largest datasets. In-memory caching reduces latency when the same data are accessed across a workload of queries. However, we are not aware of any work on distributed caching of multi-dimensional raw arrays. In this paper, we introduce a distributed framework for cost-based caching of multi-dimensional arrays in native format. Given a set of files that contain portions of an array and an online query workload, the framework computes an effective caching plan in two stages. First, the plan identifies the cells to be cached locally from each of the input...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/5x19v952</guid>
      <pubDate>Mon, 22 Sep 2025 00:00:00 +0000</pubDate>
      <author>
        <name>Zhao, Weijie</name>
      </author>
      <author>
        <name>Rusu, Florin</name>
      </author>
      <author>
        <name>Dong, Bin</name>
      </author>
      <author>
        <name>Wu, Kesheng</name>
      </author>
      <author>
        <name>Ho, Anna YQ</name>
      </author>
      <author>
        <name>Nugent, Peter</name>
        <uri>https://orcid.org/0000-0002-3389-0586</uri>
      </author>
    </item>
    <item>
      <title>Incremental View Maintenance over Array Data</title>
      <link>https://escholarship.org/uc/item/09n0z100</link>
      <description>Science applications are producing an ever-increasing volume of multi-dimensional data that are mainly processed with distributed array databases. These raw arrays are “cooked” into derived data products using complex pipelines that are time-consuming. As a result, derived data products are released infrequently and become stale soon thereafter. In this paper, we introduce materialized array views as a database construct for scientific data products. We model the “cooking” process as incremental view maintenance with batch updates and give a three-stage heuristic that finds effective update plans. Moreover, the heuristic repartitions the array and the view continuously based on a window of past updates as a side-effect of view maintenance without overhead. We design an analytical cost model for integrating materialized array views in queries. A thorough experimental evaluation confirms that the proposed techniques are able to incrementally maintain a real astronomical data...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/09n0z100</guid>
      <pubDate>Mon, 22 Sep 2025 00:00:00 +0000</pubDate>
      <author>
        <name>Zhao, Weijie</name>
      </author>
      <author>
        <name>Rusu, Florin</name>
      </author>
      <author>
        <name>Dong, Bin</name>
      </author>
      <author>
        <name>Wu, Kesheng</name>
      </author>
      <author>
        <name>Nugent, Peter</name>
        <uri>https://orcid.org/0000-0002-3389-0586</uri>
      </author>
    </item>
    <item>
      <title>Swiftn: Accelerating Quantum Circuit Simulation Through Tensor Optimization</title>
      <link>https://escholarship.org/uc/item/8fk0r2sn</link>
      <description>Quantum computers are evolving at a rapid pace and are considered next-generation computers with high computational capabilities. However, due to the unique characteristics of qubits, state-of-the-art quantum computers are vulnerable to noise caused by qubit instability. To overcome this, high-performance computing (HPC) systems are utilized for quantum circuit simulations to evaluate complex quantum algorithms with great accuracy. However, quantum circuit simulations have high computational demands, and the data volume increases exponentially as the number of qubits increases. In this paper, we propose SWIFTN, a quantum circuit simulation optimization framework for HPC systems with scalability. To achieve this, it enhances parallelism by dividing the tensor networks and distributing them across multiple GPUs and nodes. Additionally, it reduces computational costs by bypassing tasks through intermittent tensor contraction. Finally, to mitigate the degradation in accuracy due to...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/8fk0r2sn</guid>
      <pubDate>Thu, 4 Sep 2025 00:00:00 +0000</pubDate>
      <author>
        <name>Kim, Seunghwan</name>
      </author>
      <author>
        <name>Kim, Changjong</name>
      </author>
      <author>
        <name>Sim, Alex</name>
        <uri>https://orcid.org/0000-0002-6295-1982</uri>
      </author>
      <author>
        <name>Wu, Kesheng</name>
      </author>
      <author>
        <name>Tang, Houjun</name>
        <uri>https://orcid.org/0000-0001-7038-8360</uri>
      </author>
      <author>
        <name>Kim, Sunggon</name>
      </author>
    </item>
    <item>
      <title>High Energy Physics Network Requirements Review (Final Report, July 2024–December 2024)</title>
      <link>https://escholarship.org/uc/item/1dc9j42v</link>
      <description>The world-class research infrastructure at the US Department of Energy (DOE) Office of Science (SC) provides the research community with premier observational, experimental, computational, and network capabilities. Each user facility is designed to provide unique capabilities to advance the core DOE mission in science and technology for its SC program to stimulate rich scientific discoveries and enhance its innovation ecosystem. Research communities gather and flourish around each user facility, bringing together new and enhanced perspectives. The continual reinvention of the practice of science — as users and staff forge novel approaches expressed in research workflows — unlocks new discoveries and propels scientific progress.

Within this research ecosystem, the high-performance computing (HPC) and networking user facilities stewarded by the SC’s Advanced Scientific Computing Research (ASCR) program play a dynamic cross-cutting role, enabling complex workflows demanding high-performance...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/1dc9j42v</guid>
      <pubDate>Mon, 25 Aug 2025 00:00:00 +0000</pubDate>
      <author>
        <name>Zurawski, Jason</name>
        <uri>https://orcid.org/0000-0001-8389-4705</uri>
      </author>
      <author>
        <name>Carder, Dale</name>
      </author>
      <author>
        <name>Chaniotakis, Evangelos</name>
      </author>
      <author>
        <name>Dawson, Cian</name>
        <uri>https://orcid.org/0000-0001-7659-2944</uri>
      </author>
      <author>
        <name>Dart, Eli</name>
        <uri>https://orcid.org/0000-0002-8229-5433</uri>
      </author>
      <author>
        <name>Hawk, Carol</name>
      </author>
      <author>
        <name>Love, Jeremy</name>
      </author>
      <author>
        <name>Paine, Drew</name>
        <uri>https://orcid.org/0000-0003-0711-9744</uri>
      </author>
      <author>
        <name>Patwa, Abid</name>
      </author>
      <author>
        <name>Robinson, Kate</name>
      </author>
      <author>
        <name>Tian, Jiachuan</name>
      </author>
      <author>
        <name>Tracy, Chris</name>
      </author>
      <author>
        <name>Wiedlea, Andrew</name>
      </author>
    </item>
    <item>
      <title>TensorSearch: Parallel Similarity Search on Tensors</title>
      <link>https://escholarship.org/uc/item/4j4664sr</link>
      <description>Existing similarity search methods, often limited to scalar or vector data, struggle to identify complex patterns found in scientific datasets, such as 2D seismic events or 3D magnetic flux ropes. We introduce TensorSearch, a novel parallel similarity search paradigm designed to identify known patterns in high-dimensional tensors. By directly employing tensor representations, TensorSearch captures intricate pattern structures more effectively than traditional vector-based approaches. Furthermore, its parallel architecture optimizes cache and I/O operations, enabling efficient processing of large-scale scientific data. Our performance evaluations demonstrate that TensorSearch outperforms state-of-the-art vector-based systems like Milvus by up to 10x and achieves up to a 55x advantage over a custom solution developed in MATLAB and used by the domain scientists. In these tests, TensorSearch exhibits linear scalability, supporting up to 2240 CPU cores.</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/4j4664sr</guid>
      <pubDate>Tue, 29 Jul 2025 00:00:00 +0000</pubDate>
      <author>
        <name>Dong, Bin</name>
      </author>
      <author>
        <name>Nayak, Avinash</name>
      </author>
      <author>
        <name>Tribaldos, Verónica Rodríguez</name>
      </author>
      <author>
        <name>Wu, Kesheng</name>
      </author>
      <author>
        <name>Ajo-Franklin, Jonathan</name>
      </author>
      <author>
        <name>Zhang, Qile</name>
      </author>
      <author>
        <name>Guo, Fan</name>
      </author>
      <author>
        <name>Byna, Suren</name>
        <uri>https://orcid.org/0000-0003-3048-3448</uri>
      </author>
      <author>
        <name>Dobson, Patrick</name>
        <uri>https://orcid.org/0000-0001-5031-8592</uri>
      </author>
      <author>
        <name>Sim, Alexander</name>
        <uri>https://orcid.org/0000-0002-6295-1982</uri>
      </author>
    </item>
    <item>
      <title>ESnet Data and AI Workshop Report</title>
      <link>https://escholarship.org/uc/item/1hx9b9ww</link>
      <description>In February 2025, the DOE user facility Energy Sciences Network (ESnet) held a three-day Data and AI Workshop in Berkeley, California. The objective of the workshop was to identify challenges within ESnet that could be addressed through data-driven methods, to help define ESnet’s data-analysis requirements, and to shape its AI strategy, guiding data-stewardship efforts and the direction of AI research and AIOps exploration for ESnet7, the next iteration of ESnet’s network. This report summarizes the multi-faceted discussions and findings and presents a set of recommendations for next steps.</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/1hx9b9ww</guid>
      <pubDate>Fri, 11 Jul 2025 00:00:00 +0000</pubDate>
      <author>
        <name>Guok, Chin</name>
        <uri>https://orcid.org/0000-0003-4532-1222</uri>
      </author>
      <author>
        <name>Balas, Ed</name>
      </author>
      <author>
        <name>Balasubramanian, Sowmya</name>
      </author>
      <author>
        <name>Balcas, Justas</name>
      </author>
      <author>
        <name>Daneshamooz, Jaber</name>
      </author>
      <author>
        <name>Gholba, Sukhada</name>
      </author>
      <author>
        <name>Haberman, M</name>
      </author>
      <author>
        <name>Kwang, Shawn</name>
      </author>
      <author>
        <name>MacAuley, John</name>
      </author>
      <author>
        <name>Moats, Sam</name>
      </author>
      <author>
        <name>Nikahd, Matthew</name>
      </author>
      <author>
        <name>Oehlert, Sam</name>
      </author>
      <author>
        <name>Rotermund, Cody</name>
      </author>
      <author>
        <name>Robb, Chris</name>
      </author>
      <author>
        <name>Stewart, Garrett</name>
      </author>
      <author>
        <name>Tian, Jiachuan</name>
      </author>
      <author>
        <name>Tracy, Chris</name>
      </author>
      <author>
        <name>Wiedlea, Andrew</name>
      </author>
      <author>
        <name>Wu, John</name>
        <uri>https://orcid.org/0000-0002-6907-3393</uri>
      </author>
      <author>
        <name>Yang, Xi</name>
      </author>
      <author>
        <name>Yu, Se-young</name>
      </author>
    </item>
    <item>
      <title>Improving Slow Transfer Predictions: Generative Methods Compared</title>
      <link>https://escholarship.org/uc/item/4vr6z6zt</link>
      <description>Monitoring data transfer performance is a crucial task in scientific computing networks. By predicting performance early in the communication phase, potentially sluggish transfers can be identified and selectively monitored, optimizing network usage and overall performance. A key bottleneck to improving the predictive power of machine learning (ML) models in this context is the issue of class imbalance. This project focuses on addressing the class imbalance problem to enhance the accuracy of performance predictions. In this study, we analyze and compare various augmentation strategies, including traditional oversampling methods and generative techniques. Additionally, we adjust the class imbalance ratios in training datasets to evaluate their impact on model performance. While augmentation may improve performance, as the imbalance ratio increases, the performance does not significantly improve. We conclude that even the most advanced technique, such as CTGAN, does not significantly...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/4vr6z6zt</guid>
      <pubDate>Tue, 1 Jul 2025 00:00:00 +0000</pubDate>
      <author>
        <name>Kim, Jacob Taegon</name>
      </author>
      <author>
        <name>Sim, Alex</name>
        <uri>https://orcid.org/0000-0002-6295-1982</uri>
      </author>
      <author>
        <name>Wu, Kesheng</name>
      </author>
      <author>
        <name>Kim, Jinoh</name>
      </author>
    </item>
    <item>
      <title>Mitigation of birefringence in cavity-based quantum networks using frequency-encoded photons</title>
      <link>https://escholarship.org/uc/item/2px1n07k</link>
      <description>Atom-cavity systems offer unique advantages for building large-scale distributed quantum computers by providing strong atom-photon coupling while allowing for high-fidelity local operations of atomic qubits. However, in prevalent schemes where the photonic state is encoded in polarization, cavity birefringence introduces an energy splitting of the cavity eigenmodes and alters the polarization states, thus limiting the fidelity of remote entanglement generation. To address this challenge, we propose a scheme that encodes the photonic qubit in the frequency degree of freedom. The scheme relies on resonant coupling of multiple transverse cavity modes to different atomic transitions that are well separated in frequency. We numerically investigate the temporal properties of the photonic wave packet, two-photon interference visibility, and atom-atom entanglement fidelity under various cavity polarization-mode splittings and find that our scheme is less affected by cavity birefringence....</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/2px1n07k</guid>
      <pubDate>Tue, 1 Jul 2025 00:00:00 +0000</pubDate>
      <author>
        <name>Zhang, Chengxi</name>
      </author>
      <author>
        <name>Phillips, Justin</name>
        <uri>https://orcid.org/0009-0008-0750-9519</uri>
      </author>
      <author>
        <name>Monga, Inder</name>
        <uri>https://orcid.org/0000-0003-4524-0457</uri>
      </author>
      <author>
        <name>Saglamyurek, Erhan</name>
      </author>
      <author>
        <name>Wu, Qiming</name>
      </author>
      <author>
        <name>Haeffner, Hartmut</name>
        <uri>https://orcid.org/0000-0002-5113-9622</uri>
      </author>
    </item>
    <item>
      <title>FabFed: Tool-Based Network Federation for Testbed of Testbeds - Paradigm and Practice</title>
      <link>https://escholarship.org/uc/item/0737p2dd</link>
      <description>Approaching the end of the FABRIC project construction phase, many experimenters expressed a need for integrating heterogeneous types of resources from external testbed and cloud providers. This prompted research in cross-testbed federation paradigms, practically in pursuit of the vision of 'testbed of testbeds'. With past experience and lessons learned, we propose to adopt a 'tool-based federation paradigm' with the hypothesis that a tool-based federation approach is viable and performant for automating large, complex cross-testbed experiments. In this paper, we discuss the challenges and solutions in developing the FABRIC Federation Extension (FabFed), a software framework that implements the tool-based federation approach and enables FABRIC users to run large experiments across multiple testbed and cloud providers. We validate our approach through extensive use of FabFed to build complex experiments across both the FABRIC and partner testbeds. We also share our observations...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/0737p2dd</guid>
      <pubDate>Tue, 1 Jul 2025 00:00:00 +0000</pubDate>
      <author>
        <name>Yang, Xi</name>
      </author>
      <author>
        <name>Kissel, Ezra</name>
        <uri>https://orcid.org/0000-0003-3972-9651</uri>
      </author>
      <author>
        <name>Essiari, Abdelilah</name>
      </author>
      <author>
        <name>Zhang, Liang</name>
      </author>
      <author>
        <name>Lehman, Tom</name>
      </author>
      <author>
        <name>Monga, Inder</name>
        <uri>https://orcid.org/0000-0003-4524-0457</uri>
      </author>
      <author>
        <name>Ruth, Paul</name>
      </author>
      <author>
        <name>Thareja, Komal</name>
      </author>
      <author>
        <name>Baldin, Ilya</name>
      </author>
    </item>
    <item>
      <title>Conditional Recurrent Neural Networks for Enhancing Throughput Prediction and Slow File Transfers Detection in Large Science Workflows</title>
      <link>https://escholarship.org/uc/item/6dj897vz</link>
      <description>Efficient data transfer across scientific computing facilities is critical for enabling timely scientific discoveries. In this work, we explore the options of anticipating extremely slow data transfers to enable preventive actions. However, the dynamic nature of the large distributed scientific workflows driving these data transfers presents significant challenges for predicting network throughput. This study introduces a Conditional Recurrent Neural Network (CondRNN) model, specifically utilizing Conditional Long Short-Term Memory (CondLSTM), to integrate both static and dynamic features for enhanced throughput prediction. By leveraging historical transfers as proxy features, more than 60% of predictions achieved an absolute percentage error (APE) of less than 20%, and slow transfers were detected with a precision of 91.7% and recall of 100%, outperforming traditional RNN models. Implementing CondLSTM in scientific computing environments can optimize network resource utilization,...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/6dj897vz</guid>
      <pubDate>Tue, 3 Jun 2025 00:00:00 +0000</pubDate>
      <author>
        <name>Fan, Boyu</name>
      </author>
      <author>
        <name>Sim, Alex</name>
        <uri>https://orcid.org/0000-0002-6295-1982</uri>
      </author>
      <author>
        <name>Wu, Kesheng</name>
      </author>
      <author>
        <name>Kim, Jinoh</name>
      </author>
    </item>
    <item>
      <title>Regen: An object layout regenerator on large-scale production HPC systems</title>
      <link>https://escholarship.org/uc/item/07t4k8ss</link>
      <description>This article proposes an object layout regenerator called Regen, which regenerates and removes the object layout dynamically to improve the read performance of applications. Regen first detects frequent access patterns from the I/O requests of the applications. Second, Regen reorganizes the objects and regenerates or preallocates new object layouts according to the identified access patterns. Finally, Regen removes or reuses the obsolete or regenerated object layouts as necessary. As a result, Regen accelerates access to objects by providing a flexible object layout. We implement Regen as a framework on top of Proactive Data Container (PDC) and evaluate it on the Cori supercomputer, a production-scale HPC system, by using realistic HPC I/O benchmarks. The experimental results show that Regen improves the I/O performance by up to 16.92× compared with an existing system.</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/07t4k8ss</guid>
      <pubDate>Tue, 3 Jun 2025 00:00:00 +0000</pubDate>
      <author>
        <name>Sung, Dong Kyu</name>
      </author>
      <author>
        <name>Kim, Sunggon</name>
      </author>
      <author>
        <name>Lee, Sangjin</name>
      </author>
      <author>
        <name>Tang, Houjun</name>
        <uri>https://orcid.org/0000-0001-7038-8360</uri>
      </author>
      <author>
        <name>Sim, Alex</name>
        <uri>https://orcid.org/0000-0002-6295-1982</uri>
      </author>
      <author>
        <name>Wu, Kesheng</name>
      </author>
      <author>
        <name>Byna, Suren</name>
        <uri>https://orcid.org/0000-0003-3048-3448</uri>
      </author>
      <author>
        <name>Son, Yongseok</name>
      </author>
    </item>
    <item>
      <title>A Study of a Deterministic Networking Framework for Latency Critical Large Scientific Data Transfers</title>
      <link>https://escholarship.org/uc/item/6sv9p0w7</link>
      <description>Scientific workflows often involve large data transfers, which increasingly require completion-time guarantees. To support these time-sensitive flows, the Energy Sciences Network (ESnet) has implemented on-demand circuits with packet priority, allowing the circuit to be utilized by other traffic when the deadline-sensitive flow is inactive. In this paper, we explore a deterministic networking framework designed to support large scientific data transfers with completion guarantees. We consider an ideal network where all nodes are time-synchronized and utilize Cyclic Queueing and Forwarding (CQF) to achieve reliable low-latency data transfers. Specifically, the CQF cycle time is configured to ensure that all data transfers between neighboring nodes are completed within the cycle time. The number of packets transferable between two neighboring nodes depends on the cycle time, propagation delay, and link bandwidth. We conduct simulations to compare the performance of the deterministic...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/6sv9p0w7</guid>
      <pubDate>Tue, 25 Feb 2025 00:00:00 +0000</pubDate>
      <author>
        <name>Lakshminarayana, Vijeth Kumbarahally</name>
      </author>
      <author>
        <name>Oguchi, Carolina Minami</name>
      </author>
      <author>
        <name>Sim, Alex</name>
        <uri>https://orcid.org/0000-0002-6295-1982</uri>
      </author>
      <author>
        <name>Wu, Kesheng</name>
      </author>
      <author>
        <name>Ghosal, Dipak</name>
      </author>
    </item>
    <item>
      <title>Understanding Data Access Patterns for dCache System</title>
      <link>https://escholarship.org/uc/item/9k42b0p5</link>
      <description>The storage management system dCache acts as a disk cache for high-energy physics (HEP) data from the US ATLAS community. Since its disk capacity is considerably smaller than the total volume of ATLAS data, a heuristic is needed to determine what data should be kept on disks. An effective heuristic would be to keep the data files that are expected to be heavily accessed in the near future. Through a careful study of access statistics, we find that a few of the most popular datasets are accessed much more frequently than others, even though these popular datasets change over time. If we could predict the near-term popularity of datasets, we could pin the most popular ones in the disk cache to prevent their accidental removal and guarantee their availability. To predict a dataset’s popularity, we present several methods for forecasting the number of times a dataset will be accessed in the next day. Test results show that these methods could predict the next-day access counts of popular datasets...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/9k42b0p5</guid>
      <pubDate>Tue, 28 Jan 2025 00:00:00 +0000</pubDate>
      <author>
        <name>Bellavita, Julian</name>
      </author>
      <author>
        <name>Sim, Caitlin</name>
      </author>
      <author>
        <name>Wu, Kesheng</name>
      </author>
      <author>
        <name>Sim, Alex</name>
        <uri>https://orcid.org/0000-0002-6295-1982</uri>
      </author>
      <author>
        <name>Yoo, Shinjae</name>
      </author>
      <author>
        <name>Ito, Hiro</name>
      </author>
      <author>
        <name>Garonne, Vincent</name>
      </author>
      <author>
        <name>Lancon, Eric</name>
      </author>
    </item>
    <item>
      <title>Experiences in deploying in-network data caches</title>
      <link>https://escholarship.org/uc/item/1r37t9wz</link>
      <description>Data caches of various forms have been widely deployed in the context of commercial and research and education networks, but their common positioning at the Edge limits their utility from a network operator perspective. When deployed outside the network core, providers lack visibility to make decisions or apply traffic engineering based on data access patterns and caching node location. As an alternative, in-network caching provides a different type of content delivery network for scientific data infrastructure, supporting on-demand temporary caching service. We will describe the status of in-network caching nodes deployed within ESnet in support of the US CMS data federation. We will describe the container and networking architecture used to deploy data caches within ESnet, and update on the evolving tooling around service management lifecycle. An analysis of cache usage will also be provided along with an outlook for expanding the in-network cache footprint.</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/1r37t9wz</guid>
      <pubDate>Mon, 13 Jan 2025 00:00:00 +0000</pubDate>
      <author>
        <name>Sim, Alex</name>
        <uri>https://orcid.org/0000-0002-6295-1982</uri>
      </author>
      <author>
        <name>Kissel, Ezra</name>
      </author>
      <author>
        <name>Hazen, Damian</name>
      </author>
      <author>
        <name>Guok, Chin</name>
        <uri>https://orcid.org/0000-0003-4532-1222</uri>
      </author>
    </item>
    <item>
      <title>Salk Institute for Biological Studies Requirements Analysis Report</title>
      <link>https://escholarship.org/uc/item/4rt3q4dx</link>
      <description>EPOC uses the Deep Dive process to discuss and analyze current and planned science, research, or education activities and the anticipated data output of a particular use case, site, or project to help inform the strategic planning of a campus or regional networking environment. This includes understanding future needs related to network operations, network capacity upgrades, and other technological service investments. A Deep Dive comprehensively surveys major research stakeholders’ plans and processes in order to investigate data management requirements over the next 5–10 years.

Between February and March 2024, staff members from the Engagement and Performance Operations Center (EPOC) met with researchers and staff from the Salk Institute for Biological Studies (Salk) for the purpose of a Deep Dive into scientific and research drivers. The goal of this activity was to help characterize the requirements for a number of campus use cases, and to enable cyberinfrastructure support...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/4rt3q4dx</guid>
      <pubDate>Wed, 11 Dec 2024 00:00:00 +0000</pubDate>
      <author>
        <name>Zurawski, Jason</name>
        <uri>https://orcid.org/0000-0001-8389-4705</uri>
      </author>
    </item>
    <item>
      <title>Identifying and Understanding Scientific Network Flows</title>
      <link>https://escholarship.org/uc/item/5k57m390</link>
      <description>The High-Energy Physics (HEP) and Worldwide LHC Computing Grid (WLCG) communities have faced significant challenges in understanding their global network flows across the world’s research and education (R&amp;E) networks. This article describes the status of the work carried out to tackle this challenge by the Research Networking Technical Working Group (RNTWG) and the Scientific Network Tags (Scitags) initiative, including the evolving framework and tools, as well as our plans to improve network visibility before the next WLCG Network Data Challenge in early 2024. The Scitags initiative is a long-term effort to improve the visibility and management of network traffic for data-intensive sciences. The efforts of the RNTWG and Scitags initiatives have created a set of tools, standards, and proof-of-concept demonstrators that show the feasibility of identifying the owner (community) and purpose (activity) of network traffic anywhere in the network.</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/5k57m390</guid>
      <pubDate>Tue, 3 Dec 2024 00:00:00 +0000</pubDate>
      <author>
        <name>Attebury, Garhan</name>
      </author>
      <author>
        <name>Babik, Marian</name>
      </author>
      <author>
        <name>Carder, Dale</name>
        <uri>https://orcid.org/0000-0001-8357-0997</uri>
      </author>
      <author>
        <name>Chown, Tim</name>
      </author>
      <author>
        <name>Hanushevsky, Andrew</name>
      </author>
      <author>
        <name>Hoeft, Bruno</name>
      </author>
      <author>
        <name>Lake, Andrew</name>
      </author>
      <author>
        <name>Lambert, Michael</name>
      </author>
      <author>
        <name>Letts, James</name>
      </author>
      <author>
        <name>McKee, Shawn</name>
      </author>
      <author>
        <name>Newell, Karl</name>
      </author>
      <author>
        <name>Sullivan, Tristan</name>
      </author>
    </item>
    <item>
      <title>New York-Presbyterian and Columbia University Irving Medical Center Requirements Analysis Report</title>
      <link>https://escholarship.org/uc/item/3gr7j44m</link>
      <description>EPOC uses the Deep Dive process to discuss and analyze current and planned science, research, or education activities and the anticipated data output of a particular use case, site, or project to help inform the strategic planning of a campus or regional networking environment. This includes understanding future needs related to network operations, network capacity upgrades, and other technological service investments. A Deep Dive comprehensively surveys major research stakeholders’ plans and processes in order to investigate data management requirements over the next 5–10 years.

Between February and June 2024, staff members from the Engagement and Performance Operations Center (EPOC) met with researchers and staff from New York-Presbyterian (NYP), Columbia University Irving Medical Center (CUIMC), and NYSERNet for the purpose of a Deep Dive into scientific and research drivers. The goal of this activity was to help characterize the requirements for a number of campus use cases,...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/3gr7j44m</guid>
      <pubDate>Tue, 3 Dec 2024 00:00:00 +0000</pubDate>
      <author>
        <name>Zurawski, Jason</name>
        <uri>https://orcid.org/0000-0001-8389-4705</uri>
      </author>
    </item>
    <item>
      <title>Data Driven Dimensionality Reduction to Improve Modeling Performance</title>
      <link>https://escholarship.org/uc/item/0555v6rb</link>
      <description>In a number of applications, data may be anonymized, obfuscated, or highly noisy. In such cases, it is difficult to use domain knowledge or low-dimensional visualizations to engineer the features for tasks such as machine learning. Instead, we explore dimensionality reduction (DR) as a data-driven approach for engineering these low-dimensional representations. Through a careful examination of available feature selection and feature extraction techniques, we propose a new class named feature clustering. These new methods could utilize different forms of clustering to help evaluate the relative importance of features and take on properties different from the well-known DR algorithms. To evaluate these algorithms, we develop a parallel computing framework that optimizes their hyperparameters on a sample of application datasets. This framework harnesses the parallel computing power to examine a large number of parameter combinations and enables hyperparameter tuning and model tuning...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/0555v6rb</guid>
      <pubDate>Tue, 3 Dec 2024 00:00:00 +0000</pubDate>
      <author>
        <name>Chung, Joshua</name>
      </author>
      <author>
        <name>De Prado, Marcos Lopez</name>
      </author>
      <author>
        <name>Simon, Horst</name>
        <uri>https://orcid.org/0000-0003-0832-3720</uri>
      </author>
      <author>
        <name>Wu, Kesheng</name>
      </author>
    </item>
    <item>
      <title>Imb-FinDiff: Conditional Diffusion Models for Class Imbalance Synthesis of Financial Tabular Data</title>
      <link>https://escholarship.org/uc/item/8vj430z1</link>
      <description>Handling imbalanced datasets remains a critical challenge in financial machine-learning applications such as loan approval, credit scoring, and fraud detection. We present Imbalanced Financial Diffusion (Imb-FinDiff), a novel denoising diffusion framework designed to address class imbalance in financial tabular data. Our framework leverages embedding encodings for categorical and numerical attributes, effectively managing the complexities of mixed-type financial datasets. By incorporating a dual learning objective, (i) diffusion timestep noise and (ii) class label prediction, we synthesize minority class samples. Extensive experiments on diverse and real-world financial datasets demonstrate that Imb-FinDiff maintains the statistical properties of the original data while reducing bias caused by class imbalance. The minority class samples generated by Imb-FinDiff enhance the utility and fidelity of downstream machine learning classifiers.</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/8vj430z1</guid>
      <pubDate>Tue, 19 Nov 2024 00:00:00 +0000</pubDate>
      <author>
        <name>Schreyer, Marco</name>
      </author>
      <author>
        <name>Sattarov, Timur</name>
      </author>
      <author>
        <name>Sim, Alexander</name>
        <uri>https://orcid.org/0000-0002-6295-1982</uri>
      </author>
      <author>
        <name>Wu, Kesheng</name>
      </author>
    </item>
    <item>
      <title>An Architecture For Edge Networking Services</title>
      <link>https://escholarship.org/uc/item/8md8q9kq</link>
      <description>An Architecture For Edge Networking Services</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/8md8q9kq</guid>
      <pubDate>Wed, 28 Aug 2024 00:00:00 +0000</pubDate>
      <author>
        <name>Brown, Lloyd</name>
      </author>
      <author>
        <name>Marx, Emily</name>
      </author>
      <author>
        <name>Bali, Dev</name>
      </author>
      <author>
        <name>Amaro, Emmanuel</name>
      </author>
      <author>
        <name>Sur, Debnil</name>
      </author>
      <author>
        <name>Kissel, Ezra</name>
        <uri>https://orcid.org/0000-0003-3972-9651</uri>
      </author>
      <author>
        <name>Monga, Inder</name>
        <uri>https://orcid.org/0000-0003-4524-0457</uri>
      </author>
      <author>
        <name>Katz-Bassett, Ethan</name>
      </author>
      <author>
        <name>Krishnamurthy, Arvind</name>
      </author>
      <author>
        <name>McCauley, James</name>
      </author>
      <author>
        <name>Narechania, Tejas</name>
        <uri>https://orcid.org/0000-0001-6495-6413</uri>
      </author>
      <author>
        <name>Panda, Aurojit</name>
      </author>
      <author>
        <name>Shenker, Scott</name>
      </author>
    </item>
    <item>
      <title>Designing, Constructing, and Operating an IPv6 Network at SC23: A case study in implementing the IPv6 protocol on a heterogeneous network that supports the SC23 conference</title>
      <link>https://escholarship.org/uc/item/3pp098sf</link>
      <description>IPv6 is the current version of IP, the protocol that is used to route traffic across internet connections. This standard was originally developed as a new approach to mitigate concerns about address exhaustion and allow for near infinite scalability. While this protocol has gained significant support in mobile and broadband networks, as well as being the default for networks in emerging economies, it has yet to be fully adopted as a standard deployment model. Complications include legacy devices unable to support the proposed changes, as well as potential challenges that exist between devices that may not be able to fully implement current standards or configuration norms. The SCinet volunteers who deliver advanced networking to support the SC Conference set an ambitious goal of deploying an IPv6-only network at SC23. While the necessary technology is widely available and understood, the implications of deployment to support more than 15,000 users, each with multiple devices of...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/3pp098sf</guid>
      <pubDate>Wed, 28 Aug 2024 00:00:00 +0000</pubDate>
      <author>
        <name>Robinson, Kate</name>
      </author>
      <author>
        <name>Zurawski, Jason</name>
        <uri>https://orcid.org/0000-0001-8389-4705</uri>
      </author>
      <author>
        <name>Costello, Tom</name>
      </author>
    </item>
    <item>
      <title>A2FL: Autonomous and Adaptive File Layout in HPC through Real-time Access Pattern Analysis</title>
      <link>https://escholarship.org/uc/item/1sg3q1w6</link>
      <description>Various scientific applications with different I/O characteristics are executed in HPC systems. However, underlying parallel file systems are unaware of these characteristics of applications, and using a single fixed file layout for all applications can degrade the performance of HPC systems. In this paper, we propose A2FL, an autonomous and adaptive file layout adjustment scheme that optimizes parallel file system configurations by analyzing the access pattern of the applications. The key steps of A2FL are as follows: (1) A2FL initially intercepts the I/O operations of the application, recording their access patterns in real-time. (2) The access patterns are then transformed into a graphical representation used for predicting I/O performance and providing adjustment recommendations. (3) A2FL autonomously adjusts the file layout based on the prediction results, delivering an optimal file layout within the parallel file system. Moreover, we propose A2FL-Compound which analyzes...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/1sg3q1w6</guid>
      <pubDate>Wed, 28 Aug 2024 00:00:00 +0000</pubDate>
      <author>
        <name>Sung, Dong Kyu</name>
      </author>
      <author>
        <name>Son, Yongseok</name>
      </author>
      <author>
        <name>Sim, Alex</name>
        <uri>https://orcid.org/0000-0002-6295-1982</uri>
      </author>
      <author>
        <name>Wu, Kesheng</name>
      </author>
      <author>
        <name>Byna, Suren</name>
        <uri>https://orcid.org/0000-0003-3048-3448</uri>
      </author>
      <author>
        <name>Tang, Houjun</name>
        <uri>https://orcid.org/0000-0001-7038-8360</uri>
      </author>
      <author>
        <name>Eom, Hyeonsang</name>
      </author>
      <author>
        <name>Kim, Changjong</name>
      </author>
      <author>
        <name>Kim, Sunggon</name>
      </author>
    </item>
    <item>
      <title>Complete genome sequence of Anabaena variabilis ATCC 29413</title>
      <link>https://escholarship.org/uc/item/4kg7f5sp</link>
      <description>Anabaena variabilis ATCC 29413 is a filamentous, heterocyst-forming cyanobacterium that has served as a model organism, with an extensive literature extending over 40 years. The strain has three distinct nitrogenases that function under different environmental conditions and is capable of photoautotrophic growth in the light and true heterotrophic growth in the dark using fructose as both carbon and energy source. While this strain was first isolated in 1964 in Mississippi and named Anabaena flos-aquae MSU A-37, it clusters phylogenetically with cyanobacteria of the genus Nostoc. The strain is a moderate thermophile, growing well at approximately 40 °C. Here we provide some additional characteristics of the strain, and an analysis of the complete genome sequence.</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/4kg7f5sp</guid>
      <pubDate>Mon, 12 Aug 2024 00:00:00 +0000</pubDate>
      <author>
        <name>Thiel, Teresa</name>
      </author>
      <author>
        <name>Pratte, Brenda S</name>
      </author>
      <author>
        <name>Zhong, Jinshun</name>
      </author>
      <author>
        <name>Goodwin, Lynne</name>
      </author>
      <author>
        <name>Copeland, Alex</name>
        <uri>https://orcid.org/0000-0002-3971-5439</uri>
      </author>
      <author>
        <name>Lucas, Susan</name>
      </author>
      <author>
        <name>Han, Cliff</name>
      </author>
      <author>
        <name>Pitluck, Sam</name>
      </author>
      <author>
        <name>Land, Miriam L</name>
      </author>
      <author>
        <name>Kyrpides, Nikos C</name>
        <uri>https://orcid.org/0000-0002-6131-0462</uri>
      </author>
      <author>
        <name>Woyke, Tanja</name>
        <uri>https://orcid.org/0000-0002-9485-5637</uri>
      </author>
    </item>
    <item>
      <title>Complete genome sequence of the phenanthrene-degrading soil bacterium Delftia acidovorans Cs1-4</title>
      <link>https://escholarship.org/uc/item/4375f1r2</link>
      <description>Polycyclic aromatic hydrocarbons (PAH) are ubiquitous environmental pollutants and microbial biodegradation is an important means of remediation of PAH-contaminated soil. Delftia acidovorans Cs1-4 (formerly Delftia sp. Cs1-4) was isolated by using phenanthrene as the sole carbon source from PAH-contaminated soil in Wisconsin. Its full genome sequence was determined to gain insights into the mechanisms underlying biodegradation of PAH. Three genomic libraries were constructed and sequenced: an Illumina GAii shotgun library (916,416,493 reads), a 454 Titanium standard library (770,171 reads) and one paired-end 454 library (average insert size of 8 kb, 508,092 reads). The initial assembly contained 40 contigs in two scaffolds. The 454 Titanium standard data and the 454 paired-end data were assembled together and the consensus sequences were computationally shredded into 2 kb overlapping shreds. Illumina sequencing data was assembled, and the consensus sequence was computationally shredded...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/4375f1r2</guid>
      <pubDate>Mon, 12 Aug 2024 00:00:00 +0000</pubDate>
      <author>
        <name>Shetty, Ameesha R</name>
      </author>
      <author>
        <name>de Gannes, Vidya</name>
      </author>
      <author>
        <name>Obi, Chioma C</name>
      </author>
      <author>
        <name>Lucas, Susan</name>
      </author>
      <author>
        <name>Lapidus, Alla</name>
      </author>
      <author>
        <name>Cheng, Jan-Fang</name>
        <uri>https://orcid.org/0000-0001-7315-7613</uri>
      </author>
      <author>
        <name>Goodwin, Lynne A</name>
      </author>
      <author>
        <name>Pitluck, Samuel</name>
      </author>
      <author>
        <name>Peters, Linda</name>
      </author>
      <author>
        <name>Mikhailova, Natalia</name>
      </author>
      <author>
        <name>Teshima, Hazuki</name>
      </author>
      <author>
        <name>Han, Cliff</name>
      </author>
      <author>
        <name>Tapia, Roxanne</name>
      </author>
      <author>
        <name>Land, Miriam</name>
      </author>
      <author>
        <name>Hauser, Loren J</name>
      </author>
      <author>
        <name>Kyrpides, Nikos</name>
        <uri>https://orcid.org/0000-0002-6131-0462</uri>
      </author>
      <author>
        <name>Ivanova, Natalia</name>
      </author>
      <author>
        <name>Pagani, Ioanna</name>
      </author>
      <author>
        <name>Chain, Patrick SG</name>
      </author>
      <author>
        <name>Denef, Vincent J</name>
      </author>
      <author>
        <name>Woyke, Tanja</name>
        <uri>https://orcid.org/0000-0002-9485-5637</uri>
      </author>
      <author>
        <name>Hickey, William J</name>
      </author>
    </item>
    <item>
      <title>High Energy Physics Network Requirements Review: Two-Year Update</title>
      <link>https://escholarship.org/uc/item/00w301f1</link>
      <description>The Energy Sciences Network (ESnet) is the high-performance network user facility for the US Department of Energy (DOE) Office of Science (SC) and delivers highly reliable data transport capabilities optimized for the requirements of data-intensive science. In essence, ESnet is the circulatory system that enables the DOE science mission by connecting all its laboratories and facilities in the US and abroad. ESnet is funded and stewarded by the Advanced Scientific Computing Research (ASCR) program and managed and operated by the Scientific Networking Division at Lawrence Berkeley National Laboratory (LBNL). ESnet is widely regarded as a global leader in the research and education networking community.

ESnet interconnects DOE national laboratories, user facilities, and major experiments so that scientists can use remote instruments and computing resources as well as share data with collaborators, transfer large datasets, and access distributed data repositories. ESnet is specifically...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/00w301f1</guid>
      <pubDate>Tue, 23 Jul 2024 00:00:00 +0000</pubDate>
      <author>
        <name>Zurawski, Jason</name>
        <uri>https://orcid.org/0000-0001-8389-4705</uri>
      </author>
      <author>
        <name>Carder, Dale</name>
        <uri>https://orcid.org/0000-0001-8357-0997</uri>
      </author>
      <author>
        <name>Colby, Eric</name>
      </author>
      <author>
        <name>Dart, Eli</name>
      </author>
      <author>
        <name>Hawk, Carol</name>
      </author>
      <author>
        <name>Miller, Kenneth</name>
      </author>
      <author>
        <name>Patwa, Abid</name>
      </author>
      <author>
        <name>Robinson, Kate</name>
      </author>
      <author>
        <name>Wiedlea, Andrew</name>
      </author>
    </item>
    <item>
      <title>Fusion Energy Sciences Network Requirements Review: Mid-cycle Update</title>
      <link>https://escholarship.org/uc/item/4w2151rp</link>
      <description>The Energy Sciences Network (ESnet) is the high-performance network user facility for the US Department of Energy (DOE) Office of Science (SC) and delivers highly reliable data transport capabilities optimized for the requirements of data-intensive science. In essence, ESnet is the circulatory system that enables the DOE science mission by connecting all its laboratories and facilities in the US and abroad. ESnet is funded and stewarded by the Advanced Scientific Computing Research (ASCR) program and managed and operated by the Scientific Networking Division at Lawrence Berkeley National Laboratory (LBNL). ESnet is widely regarded as a global leader in the research and education networking community.

ESnet interconnects DOE national laboratories, user facilities, and major experiments so that scientists can use remote instruments and computing resources as well as share data with collaborators, transfer large datasets, and access distributed data repositories. ESnet is specifically...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/4w2151rp</guid>
      <pubDate>Wed, 3 Jul 2024 00:00:00 +0000</pubDate>
      <author>
        <name>Zurawski, Jason</name>
        <uri>https://orcid.org/0000-0001-8389-4705</uri>
      </author>
      <author>
        <name>Dart, Eli</name>
      </author>
      <author>
        <name>Halfmoon, Michael</name>
      </author>
      <author>
        <name>Hawk, Carol</name>
      </author>
      <author>
        <name>King, Josh</name>
      </author>
      <author>
        <name>Mandrekas, John</name>
      </author>
      <author>
        <name>Miller, Kenneth</name>
      </author>
      <author>
        <name>Wiedlea, Andrew</name>
      </author>
    </item>
    <item>
      <title>Nuclear Physics Network Requirements Review Final Report</title>
      <link>https://escholarship.org/uc/item/4qx1b4x8</link>
      <description>The Energy Sciences Network (ESnet) is the high-performance network user facility for the US Department of Energy (DOE) Office of Science (SC) and delivers highly reliable data transport capabilities optimized for the requirements of data-intensive science. In essence, ESnet is the circulatory system that enables the DOE science mission by connecting all its laboratories and facilities in the US and abroad. ESnet is funded and stewarded by the Advanced Scientific Computing Research (ASCR) program and managed and operated by the Scientific Networking Division at Lawrence Berkeley National Laboratory (LBNL). ESnet is widely regarded as a global leader in the research and education networking community.

ESnet interconnects DOE national laboratories, user facilities, and major experiments so that scientists can use remote instruments and computing resources as well as share data with collaborators, transfer large datasets, and access distributed data repositories. ESnet is specifically...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/4qx1b4x8</guid>
      <pubDate>Wed, 3 Jul 2024 00:00:00 +0000</pubDate>
      <author>
        <name>Zurawski, Jason</name>
        <uri>https://orcid.org/0000-0001-8389-4705</uri>
      </author>
      <author>
        <name>Brown, Ben</name>
      </author>
      <author>
        <name>Rai, Gulshan</name>
      </author>
      <author>
        <name>Dart, Eli</name>
      </author>
      <author>
        <name>Dawson, Cian</name>
        <uri>https://orcid.org/0000-0001-7659-2944</uri>
      </author>
      <author>
        <name>Hawk, Carol</name>
      </author>
      <author>
        <name>Mantica, Paul</name>
      </author>
      <author>
        <name>Margetis, Spyridon</name>
      </author>
      <author>
        <name>Miller, Kenneth</name>
      </author>
      <author>
        <name>Miller, Nathan</name>
      </author>
      <author>
        <name>Wiedlea, Andrew</name>
      </author>
    </item>
    <item>
      <title>Integrating network and transfer metrics to optimize transfer efficiency and experiment workflows</title>
      <link>https://escholarship.org/uc/item/5fd209jp</link>
      <description>The Worldwide LHC Computing Grid relies on the network as a critical part of its infrastructure and therefore needs to guarantee effective network usage and prompt detection and resolution of any network issues, including connection failures, congestion, traffic routing, etc. The WLCG Network and Transfer Metrics project aims to integrate and combine all network-related monitoring data collected by the WLCG infrastructure. This includes FTS monitoring information, monitoring data from the XRootD federation, as well as results of the perfSONAR tests. The main challenge consists of further integrating and analyzing this information in order to allow the optimizing of data transfers and workload management systems of the LHC experiments. In this contribution, we present our activity in commissioning WLCG perfSONAR network and integrating network and transfer metrics: We motivate the need for the network performance monitoring, describe the main use cases of the LHC experiments as...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/5fd209jp</guid>
      <pubDate>Wed, 19 Jun 2024 00:00:00 +0000</pubDate>
      <author>
        <name>McKee, S</name>
      </author>
      <author>
        <name>Babik, M</name>
      </author>
      <author>
        <name>Campana, S</name>
      </author>
      <author>
        <name>Di Girolamo, A</name>
      </author>
      <author>
        <name>Wildish, T</name>
      </author>
      <author>
        <name>Closier, J</name>
      </author>
      <author>
        <name>Roiser, S</name>
      </author>
      <author>
        <name>Grigoras, C</name>
      </author>
      <author>
        <name>Vukotic, I</name>
      </author>
      <author>
        <name>Salichos, M</name>
      </author>
      <author>
        <name>De, Kaushik</name>
      </author>
      <author>
        <name>Garonne, V</name>
      </author>
      <author>
        <name>Cruz, JAD</name>
      </author>
      <author>
        <name>Forti, A</name>
      </author>
      <author>
        <name>Walker, CJ</name>
      </author>
      <author>
        <name>Rand, D</name>
      </author>
      <author>
        <name>de Salvo, A</name>
      </author>
      <author>
        <name>Mazzoni, E</name>
      </author>
      <author>
        <name>Gable, I</name>
      </author>
      <author>
        <name>Chollet, F</name>
      </author>
      <author>
        <name>Caillat, L</name>
      </author>
      <author>
        <name>Schaer, F</name>
      </author>
      <author>
        <name>Chen, Hsin-Yen</name>
      </author>
      <author>
        <name>Tigerstedt, U</name>
      </author>
      <author>
        <name>Duckeck, G</name>
      </author>
      <author>
        <name>Hoeft, B</name>
      </author>
      <author>
        <name>Petzold, A</name>
      </author>
      <author>
        <name>Lopez, F</name>
      </author>
      <author>
        <name>Flix, J</name>
      </author>
      <author>
        <name>Stancu, S</name>
      </author>
      <author>
        <name>Shade, J</name>
      </author>
      <author>
        <name>O'Connor, M</name>
      </author>
      <author>
        <name>Kotlyar, V</name>
      </author>
      <author>
        <name>Zurawski, J</name>
        <uri>https://orcid.org/0000-0001-8389-4705</uri>
      </author>
    </item>
    <item>
      <title>Predicting Resource Utilization Trends with Southern California Petabyte Scale Cache</title>
      <link>https://escholarship.org/uc/item/8vv9390p</link>
      <description>A large community of high-energy physicists shares its data around the world, making it necessary to ship a large number of files over wide-area networks. Regional disk caches such as the Southern California Petabyte Scale Cache have been deployed to reduce the data access latency. We observe that about 94% of the requested data volume was served from this cache, without remote transfers, between Sep. 2022 and July 2023. In this paper, we show the predictability of the resource utilization by exploring the trends of recent cache usage. The time series based prediction is made with a machine learning approach and the prediction errors are small relative to the variation in the input data. This work would help in understanding the characteristics of the resource utilization and in planning additional deployments of caches in the future.</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/8vv9390p</guid>
      <pubDate>Tue, 18 Jun 2024 00:00:00 +0000</pubDate>
      <author>
        <name>Sim, Caitlin</name>
      </author>
      <author>
        <name>Wu, Kesheng</name>
      </author>
      <author>
        <name>Sim, Alex</name>
        <uri>https://orcid.org/0000-0002-6295-1982</uri>
      </author>
      <author>
        <name>Monga, Inder</name>
        <uri>https://orcid.org/0000-0003-4524-0457</uri>
      </author>
      <author>
        <name>Guok, Chin</name>
        <uri>https://orcid.org/0000-0003-4532-1222</uri>
      </author>
      <author>
        <name>Hazen, Damian</name>
      </author>
      <author>
        <name>Würthwein, Frank</name>
      </author>
      <author>
        <name>Davila, Diego</name>
      </author>
      <author>
        <name>Newman, Harvey</name>
      </author>
      <author>
        <name>Balcas, Justas</name>
      </author>
    </item>
    <item>
      <title>Proximity Portability and in Transit, M-to-N Data Partitioning and Movement in SENSEI</title>
      <link>https://escholarship.org/uc/item/7878c2x6</link>
      <description>In high-performance parallel in situ processing, the term in transit processing refers to those configurations where data must move from a producer to a consumer that runs on separate resources. In the context of parallel and distributed computing on an HPC platform one of the central challenges is to determine a mapping of data from producer ranks to consumer ranks. This problem is complicated by the heterogeneity that arises in producer-consumer pairs, such as when producer and consumer codes have different levels of concurrency, different scaling characteristics, or different data models. The resulting mapping and movement of data from M producer to N consumer ranks can have a significant impact on aggregate application performance, particularly when the data consumer requires only a subset of the overall data for its task. This chapter focuses on the design considerations that underlie SENSEI’s implementation to this challenging problem. These design considerations extend...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/7878c2x6</guid>
      <pubDate>Fri, 7 Jun 2024 00:00:00 +0000</pubDate>
      <author>
        <name>Bethel, E Wes</name>
        <uri>https://orcid.org/0000-0003-0790-7716</uri>
      </author>
      <author>
        <name>Loring, Burlen</name>
        <uri>https://orcid.org/0000-0002-4678-8142</uri>
      </author>
      <author>
        <name>Ayachit, Utkarsh</name>
      </author>
      <author>
        <name>Duque, Earl PN</name>
      </author>
      <author>
        <name>Ferrier, Nicola</name>
      </author>
      <author>
        <name>Insley, Joseph</name>
      </author>
      <author>
        <name>Gu, Junmin</name>
        <uri>https://orcid.org/0000-0002-1521-8534</uri>
      </author>
      <author>
        <name>Kress, James</name>
      </author>
      <author>
        <name>O’Leary, Patrick</name>
      </author>
      <author>
        <name>Pugmire, Dave</name>
      </author>
      <author>
        <name>Rizzi, Silvio</name>
      </author>
      <author>
        <name>Thompson, David</name>
      </author>
      <author>
        <name>Usher, Will</name>
      </author>
      <author>
        <name>Weber, Gunther H</name>
        <uri>https://orcid.org/0000-0002-1794-1398</uri>
      </author>
      <author>
        <name>Whitlock, Brad</name>
      </author>
      <author>
        <name>Wolf, Matthew</name>
      </author>
      <author>
        <name>Wu, Kesheng</name>
      </author>
    </item>
    <item>
      <title>The SENSEI Generic In Situ Interface: Tool and Processing Portability at Scale</title>
      <link>https://escholarship.org/uc/item/56f7p5vq</link>
      <description>One key challenge when doing in situ processing is the investment required to add code to numerical simulations needed to take advantage of in situ processing. Such instrumentation code is often specialized, and tailored to a specific in situ method or infrastructure. Then, if a simulation wants to use other in situ tools, each of which has its own bespoke API [4], the simulation code team will quickly become overwhelmed with having a different set of instrumentation APIs, one per in situ tool or method. In an ideal situation, such instrumentation need happen only once, and then the instrumentation API provides access to a large diversity of tools. In this way, a data producer’s instrumentation need not be modified if the user desires to take advantage of a different set of in situ tools. The SENSEI generic in situ interface addresses this challenge, which means that SENSEI-instrumented codes enjoy the benefit of being able to use a diversity of tools at scale, tools...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/56f7p5vq</guid>
      <pubDate>Fri, 7 Jun 2024 00:00:00 +0000</pubDate>
      <author>
        <name>Bethel, E Wes</name>
        <uri>https://orcid.org/0000-0003-0790-7716</uri>
      </author>
      <author>
        <name>Loring, Burlen</name>
        <uri>https://orcid.org/0000-0002-4678-8142</uri>
      </author>
      <author>
        <name>Ayachit, Utkarsh</name>
      </author>
      <author>
        <name>Camp, David</name>
      </author>
      <author>
        <name>Duque, Earl PN</name>
      </author>
      <author>
        <name>Ferrier, Nicola</name>
      </author>
      <author>
        <name>Insley, Joseph</name>
      </author>
      <author>
        <name>Gu, Junmin</name>
        <uri>https://orcid.org/0000-0002-1521-8534</uri>
      </author>
      <author>
        <name>Kress, James</name>
      </author>
      <author>
        <name>O’Leary, Patrick</name>
      </author>
      <author>
        <name>Pugmire, David</name>
      </author>
      <author>
        <name>Rizzi, Silvio</name>
      </author>
      <author>
        <name>Thompson, David</name>
      </author>
      <author>
        <name>Weber, Gunther H</name>
        <uri>https://orcid.org/0000-0002-1794-1398</uri>
      </author>
      <author>
        <name>Whitlock, Brad</name>
      </author>
      <author>
        <name>Wolf, Matthew</name>
      </author>
      <author>
        <name>Wu, Kesheng</name>
      </author>
    </item>
    <item>
      <title>Serving Deep Learning Models from Relational Databases</title>
      <link>https://escholarship.org/uc/item/9jq2r4k1</link>
      <description>Serving deep learning (DL) models on relational data has become a critical requirement across diverse commercial and scientific domains, sparking growing interest recently. In this visionary paper, we embark on a comprehensive exploration of representative architectures to address the requirement. We highlight three pivotal paradigms: The state-of-the-art DL-centric architecture offloads DL computations to dedicated DL frameworks. The potential UDF-centric architecture encapsulates one or more tensor computations into User Defined Functions (UDFs) within the relational database management system (RDBMS). The potential relation-centric architecture aims to represent a large-scale tensor computation through relational operators. While each of these architectures demonstrates promise in specific use scenarios, we identify urgent requirements for seamless integration of these architectures and the middle ground in-between these architectures. We delve into the gaps that impede the...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/9jq2r4k1</guid>
      <pubDate>Tue, 4 Jun 2024 00:00:00 +0000</pubDate>
      <author>
        <name>Zhou, L</name>
      </author>
      <author>
        <name>Lin, Q</name>
      </author>
      <author>
        <name>Chowdhury, K</name>
      </author>
      <author>
        <name>Masood, S</name>
      </author>
      <author>
        <name>Eichenberger, A</name>
      </author>
      <author>
        <name>Min, H</name>
      </author>
      <author>
        <name>Sim, A</name>
        <uri>https://orcid.org/0000-0002-6295-1982</uri>
      </author>
      <author>
        <name>Wang, J</name>
      </author>
      <author>
        <name>Wang, Y</name>
      </author>
      <author>
        <name>Wu, K</name>
      </author>
      <author>
        <name>Yuan, B</name>
      </author>
      <author>
        <name>Zou, J</name>
      </author>
    </item>
    <item>
      <title>Unsupervised anomaly detection in daily WAN traffic patterns</title>
      <link>https://escholarship.org/uc/item/4gh0m8n2</link>
      <description>Growth in large-scale experiments using high capacity reliable networking as part of their design is creating a need for better monitoring and analysis of observed traffic. Network providers need intelligent solutions that can help quickly identify and understand anomalous behaviors at the network edge, allowing reactions to unexpected traffic or attacks on facilities and their peerings. However, due to lack of labeled data in network traffic analysis and user diversity, we introduce novel methods that process very large network datasets quickly for outlier identification. In this paper, we leverage artificial intelligence (AI), network research, and edge computing to collect and train unsupervised classification algorithms using streaming data pipelines from multiple months of network flow records. Once trained, individual classifiers quickly observe and flag alerts in hourly behaviors. Our work describes building the data pipeline as well as addressing issues of false positives...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/4gh0m8n2</guid>
      <pubDate>Tue, 26 Mar 2024 00:00:00 +0000</pubDate>
      <author>
        <name>Campbell, S</name>
        <uri>https://orcid.org/0000-0002-6542-7473</uri>
      </author>
      <author>
        <name>Kiran, M</name>
      </author>
      <author>
        <name>Wala, FB</name>
      </author>
    </item>
    <item>
      <title>Unsupervised Anomaly Detection in Daily WAN Traffic Patterns</title>
      <link>https://escholarship.org/uc/item/3n2021jm</link>
      <description>Unsupervised Anomaly Detection in Daily WAN Traffic Patterns</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/3n2021jm</guid>
      <pubDate>Tue, 26 Mar 2024 00:00:00 +0000</pubDate>
      <author>
        <name>Campbell, Scott</name>
        <uri>https://orcid.org/0000-0002-6542-7473</uri>
      </author>
      <author>
        <name>Kiran, Mariam</name>
      </author>
      <author>
        <name>Wala, Fatema Bannat</name>
      </author>
    </item>
    <item>
      <title>Detecting Anomalies in Time Series Using Kernel Density Approaches</title>
      <link>https://escholarship.org/uc/item/78m512td</link>
      <description>This paper introduces a novel anomaly detection approach tailored for time series data with exclusive reliance on normal events during training. Our key innovation lies in the application of kernel-density estimation (KDE) to scrutinize reconstruction errors, providing an empirically derived probability distribution for normal events post-reconstruction. This non-parametric density estimation technique offers a nuanced understanding of anomaly detection, differentiating it from prevalent threshold-based mechanisms in existing methodologies. In post-training, events are encoded, decoded, and evaluated against the estimated density, providing a comprehensive notion of normality. In addition, we propose a data augmentation strategy involving variational autoencoder-generated events and a smoothing step for enhanced model robustness. The significance of our autoencoder-based approach is evident in its capacity to learn normal representation without prior anomaly knowledge. Through...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/78m512td</guid>
      <pubDate>Tue, 12 Mar 2024 00:00:00 +0000</pubDate>
      <author>
        <name>Frehner, Robin</name>
      </author>
      <author>
        <name>Wu, Kesheng</name>
      </author>
      <author>
        <name>Sim, Alexander</name>
        <uri>https://orcid.org/0000-0002-6295-1982</uri>
      </author>
      <author>
        <name>Kim, Jinoh</name>
      </author>
      <author>
        <name>Stockinger, Kurt</name>
      </author>
    </item>
    <item>
      <title>Insights into DoH: Traffic Classification for DNS over HTTPS in an Encrypted Network</title>
      <link>https://escholarship.org/uc/item/35c27812</link>
      <description>In the past few years there has been a growing desire to provide more built-in functionality to protect user communications from eavesdropping. An example of this is DNS over HTTPS (DoH), which can be used to protect user privacy and confidentiality and to guard against spoofing attacks. Since it first gained popularity in browsers in 2018, much further study has been devoted to testing the effectiveness of DoH in protection schemes and to whether the protocol can be detected over the web. Detecting DoH traffic among normal web traffic is also a major challenge for network admins who need to filter malicious traffic flows. In this paper, we investigate machine learning classification to study the detection of DoH traffic and further analyze the key feature characteristics in the protocol behavior to help researchers build credibility in the DoH protocol detection. Our study reveals key features and statistical relationships among DoH test runs on the Alexa-recommended 100 most-used websites...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/35c27812</guid>
      <pubDate>Tue, 12 Mar 2024 00:00:00 +0000</pubDate>
      <author>
        <name>Wala, Fatema Bannat</name>
      </author>
      <author>
        <name>Campbell, Scott</name>
        <uri>https://orcid.org/0000-0002-6542-7473</uri>
      </author>
      <author>
        <name>Kiran, Mariam</name>
      </author>
    </item>
    <item>
      <title>Automatic Data Transformation Using Large Language Model - An Experimental Study on Building Energy Data</title>
      <link>https://escholarship.org/uc/item/0tp1886v</link>
      <description>Existing approaches to automatic data transformation are insufficient to meet the requirements in many real-world scenarios, such as the building sector. First, there is no convenient interface for domain experts to provide domain knowledge easily. Second, they require significant training data collection overheads. Third, the accuracy suffers from complicated schema changes. To address these shortcomings, we present a novel approach that leverages the unique capabilities of large language models (LLMs) in coding, complex reasoning, and zero-shot learning to generate SQL code that transforms the source datasets into the target datasets. We demonstrate the viability of this approach by designing an LLM-based framework, termed SQLMorpher, which comprises a prompt generator that integrates the initial prompt with optional domain knowledge and historical patterns in external databases. It also implements an iterative prompt optimization mechanism that automatically improves the prompt...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/0tp1886v</guid>
      <pubDate>Tue, 12 Mar 2024 00:00:00 +0000</pubDate>
      <author>
        <name>Sharma, Ankita</name>
      </author>
      <author>
        <name>Li, Xuanmao</name>
      </author>
      <author>
        <name>Guan, Hong</name>
      </author>
      <author>
        <name>Sun, Guoxin</name>
      </author>
      <author>
        <name>Zhang, Liang</name>
      </author>
      <author>
        <name>Wang, Lanjun</name>
      </author>
      <author>
        <name>Wu, Kesheng</name>
      </author>
      <author>
        <name>Cao, Lei</name>
      </author>
      <author>
        <name>Zhu, Erkang</name>
      </author>
      <author>
        <name>Sim, Alexander</name>
        <uri>https://orcid.org/0000-0002-6295-1982</uri>
      </author>
      <author>
        <name>Wu, Teresa</name>
      </author>
      <author>
        <name>Zou, Jia</name>
      </author>
    </item>
    <item>
      <title>The Adaptable IO System (ADIOS)</title>
      <link>https://escholarship.org/uc/item/0bv5d7dj</link>
      <description>The Adaptable I/O System (ADIOS) provides a publish/subscribe abstraction for data access and storage. The framework provides various engines for producing and consuming data through different mediums (storage, memory, network) for various application scenarios. ADIOS engines exist to write/read files on a storage system, to couple independent simulations together or to stream data from a simulation to analysis and visualization tools via the computer’s network infrastructure, and to stream experimental/observational data from the producer to data processors via the wide-area-network. Both lossy and lossless compression are supported by ADIOS to provide for seamless exchange of data between producer and consumer. In this work we provide a description for the ADIOS framework and the abstractions provided. We demonstrate the capabilities of the ADIOS framework using a number of examples, including strong coupling of simulation codes, in situ visualization running on a separate computing...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/0bv5d7dj</guid>
      <pubDate>Thu, 29 Feb 2024 00:00:00 +0000</pubDate>
      <author>
        <name>Pugmire, David</name>
      </author>
      <author>
        <name>Podhorszki, Norbert</name>
      </author>
      <author>
        <name>Klasky, Scott</name>
      </author>
      <author>
        <name>Wolf, Matthew</name>
      </author>
      <author>
        <name>Kress, James</name>
      </author>
      <author>
        <name>Kim, Mark</name>
      </author>
      <author>
        <name>Thompson, Nicholas</name>
      </author>
      <author>
        <name>Logan, Jeremy</name>
      </author>
      <author>
        <name>Wang, Ruonan</name>
      </author>
      <author>
        <name>Mehta, Kshitij</name>
      </author>
      <author>
        <name>Suchyta, Eric</name>
      </author>
      <author>
        <name>Godoy, William</name>
      </author>
      <author>
        <name>Choi, Jong</name>
      </author>
      <author>
        <name>Ostrouchov, George</name>
      </author>
      <author>
        <name>Wan, Lipeng</name>
      </author>
      <author>
        <name>Chen, Jieyang</name>
      </author>
      <author>
        <name>Geveci, Berk</name>
      </author>
      <author>
        <name>Atkins, Chuck</name>
      </author>
      <author>
        <name>Ross, Caitlin</name>
      </author>
      <author>
        <name>Eisenhauer, Greg</name>
      </author>
      <author>
        <name>Gu, Junmin</name>
        <uri>https://orcid.org/0000-0002-1521-8534</uri>
      </author>
      <author>
        <name>Wu, John</name>
        <uri>https://orcid.org/0000-0002-6907-3393</uri>
      </author>
      <author>
        <name>Huebl, Axel</name>
        <uri>https://orcid.org/0000-0003-1943-7141</uri>
      </author>
      <author>
        <name>Tsutsumi, Seiji</name>
      </author>
    </item>
    <item>
      <title>A primer on artificial intelligence in plant digital phenomics: embarking on the data to insights journey</title>
      <link>https://escholarship.org/uc/item/6kt1d51f</link>
      <description>Artificial intelligence (AI) has emerged as a fundamental component of global agricultural research that is poised to impact on many aspects of plant science. In digital phenomics, AI is capable of learning intricate structure and patterns in large datasets. We provide a perspective and primer on AI applications to phenome research. We propose a novel human-centric explainable AI (X-AI) system architecture consisting of data architecture, technology infrastructure, and AI architecture design. We clarify the difference between post hoc models and 'interpretable by design' models. We include guidance for effectively using an interpretable by design model in phenomic analysis. We also provide directions to sources of tools and resources for making data analytics increasingly accessible. This primer is accompanied by an interactive online tutorial.</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/6kt1d51f</guid>
      <pubDate>Tue, 23 Jan 2024 00:00:00 +0000</pubDate>
      <author>
        <name>Harfouche, Antoine L</name>
      </author>
      <author>
        <name>Nakhle, Farid</name>
      </author>
      <author>
        <name>Harfouche, Antoine H</name>
      </author>
      <author>
        <name>Sardella, Orlando G</name>
      </author>
      <author>
        <name>Dart, Eli</name>
        <uri>https://orcid.org/0000-0002-8229-5433</uri>
      </author>
      <author>
        <name>Jacobson, Daniel</name>
      </author>
    </item>
    <item>
      <title>Managed Network Services for Exascale Data Movement Across Large Global Scientific Collaborations</title>
      <link>https://escholarship.org/uc/item/2179g9wk</link>
      <description>Unique scientific instruments designed and operated by large global collaborations are expected to produce exabyte-scale data volumes per year by 2030. These collaborations depend on globally distributed storage and compute to turn raw data into science. While all of these infrastructures have batch scheduling capabilities to share compute, Research and Education networks lack those capabilities. There is thus uncontrolled competition for bandwidth between and within collaborations. As a result, data 'hogs' disk space at processing facilities for much longer than it takes to process, leading to vastly over-provisioned storage infrastructures. Integrated co-scheduling of networks as part of high-level managed workflows might reduce these storage needs by more than an order of magnitude. This paper describes such a solution, demonstrates its functionality in the context of the Large Hadron Collider (LHC) at CERN, and presents the next steps towards its use in production.</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/2179g9wk</guid>
      <pubDate>Tue, 5 Dec 2023 00:00:00 +0000</pubDate>
      <author>
        <name>Würthwein, Frank</name>
      </author>
      <author>
        <name>Guiang, Jonathan</name>
      </author>
      <author>
        <name>Arora, Aashay</name>
      </author>
      <author>
        <name>Davila, Diego</name>
      </author>
      <author>
        <name>Graham, John</name>
      </author>
      <author>
        <name>Mishin, Dima</name>
      </author>
      <author>
        <name>Hutton, Thomas</name>
      </author>
      <author>
        <name>Sfiligoi, Igor</name>
      </author>
      <author>
        <name>Newman, Harvey</name>
      </author>
      <author>
        <name>Balcas, Justas</name>
      </author>
      <author>
        <name>Lehman, Tom</name>
      </author>
      <author>
        <name>Yang, Xi</name>
      </author>
      <author>
        <name>Guok, Chin</name>
        <uri>https://orcid.org/0000-0003-4532-1222</uri>
      </author>
    </item>
    <item>
      <title>Integrating End-to-End Exascale SDN into the LHC Data Distribution Cyberinfrastructure</title>
      <link>https://escholarship.org/uc/item/18f3m43g</link>
      <description>The Compact Muon Solenoid (CMS) experiment at the CERN Large Hadron Collider (LHC) distributes its data by leveraging a diverse array of National Research and Education Networks (NRENs), which CMS is forced to treat as an opaque resource. Consequently, CMS sees highly variable performance that already poses a challenge for operators coordinating the movement of petabytes around the globe. This kind of unpredictability, however, threatens CMS with a logistical nightmare as it barrels towards the High Luminosity LHC (HL-LHC) era in 2030, which is expected to produce roughly 0.5 exabytes of data per year. This paper explores one potential solution to this issue: software-defined networking (SDN). In particular, the prototypical interoperation of SENSE, an SDN product developed by the Energy Sciences Network, with Rucio, the data management software used by the LHC, is outlined. In addition, this paper presents the current progress in bringing these technologies together.</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/18f3m43g</guid>
      <pubDate>Tue, 5 Dec 2023 00:00:00 +0000</pubDate>
      <author>
        <name>Guiang, Jonathan</name>
      </author>
      <author>
        <name>Arora, Aashay</name>
      </author>
      <author>
        <name>Davila, Diego</name>
      </author>
      <author>
        <name>Graham, John</name>
      </author>
      <author>
        <name>Mishin, Dima</name>
      </author>
      <author>
        <name>Sfiligoi, Igor</name>
      </author>
      <author>
        <name>Wuerthwein, Frank</name>
        <uri>https://orcid.org/0000-0001-5912-6124</uri>
      </author>
      <author>
        <name>Lehman, Tom</name>
      </author>
      <author>
        <name>Yang, Xi</name>
      </author>
      <author>
        <name>Guok, Chin</name>
        <uri>https://orcid.org/0000-0003-4532-1222</uri>
      </author>
      <author>
        <name>Newman, Harvey</name>
      </author>
      <author>
        <name>Balcas, Justas</name>
      </author>
      <author>
        <name>Hutton, Thomas</name>
      </author>
    </item>
    <item>
      <title>ESnet Requirements Review Program Through the IRI Lens: A Meta-Analysis of Workflow Patterns Across DOE Office of Science Programs (Final Report)</title>
      <link>https://escholarship.org/uc/item/9fg8k5xh</link>
      <description>ESnet Requirements Review Program Through the IRI Lens: A Meta-Analysis of Workflow Patterns Across DOE Office of Science Programs (Final Report)</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/9fg8k5xh</guid>
      <pubDate>Tue, 7 Nov 2023 00:00:00 +0000</pubDate>
      <author>
        <name>Dart, Eli</name>
        <uri>https://orcid.org/0000-0002-8229-5433</uri>
      </author>
      <author>
        <name>Zurawski, Jason</name>
        <uri>https://orcid.org/0000-0001-8389-4705</uri>
      </author>
      <author>
        <name>Hawk, Carol</name>
      </author>
      <author>
        <name>Brown, Benjamin</name>
      </author>
      <author>
        <name>Monga, Inder</name>
        <uri>https://orcid.org/0000-0003-4524-0457</uri>
      </author>
    </item>
    <item>
      <title>Analyzing Transatlantic Network Traffic over Scientific Data Caches</title>
      <link>https://escholarship.org/uc/item/65s4w5fv</link>
      <description>Large scientific collaborations often share huge volumes of data around the world. Consequently a significant amount of network bandwidth is needed for data replication and data access. Users in the same region may possibly share resources as well as data, especially when they are working on related topics with similar datasets. In this work, we study the network traffic patterns and resource utilization for scientific data caches connecting European networks to the US. We explore the efficiency of resource utilization, especially for network traffic which consists mostly of transatlantic data transfers, and the potential for having more caching node deployments. Our study shows that these data caches reduced network traffic volume by 97% during the study period. This demonstrates that such caching nodes are effective in reducing wide-area network traffic.</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/65s4w5fv</guid>
      <pubDate>Tue, 29 Aug 2023 00:00:00 +0000</pubDate>
      <author>
        <name>Deng, Ziyue</name>
      </author>
      <author>
        <name>Sim, Alex</name>
        <uri>https://orcid.org/0000-0002-6295-1982</uri>
      </author>
      <author>
        <name>Wu, Kesheng</name>
      </author>
      <author>
        <name>Guok, Chin</name>
        <uri>https://orcid.org/0000-0003-4532-1222</uri>
      </author>
      <author>
        <name>Hazen, Damian</name>
      </author>
      <author>
        <name>Monga, Inder</name>
        <uri>https://orcid.org/0000-0003-4524-0457</uri>
      </author>
      <author>
        <name>Andrijauskas, Fabio</name>
        <uri>https://orcid.org/0000-0002-1254-8570</uri>
      </author>
      <author>
        <name>Würthwein, Frank</name>
      </author>
      <author>
        <name>Weitzel, Derek</name>
      </author>
    </item>
    <item>
      <title>Biological and Environmental Research Network Requirements Review (Final Report)</title>
      <link>https://escholarship.org/uc/item/3mz7h3mm</link>
      <description>The Energy Sciences Network (ESnet) is the high-performance network user facility for the US Department of Energy (DOE) Office of Science (SC) and delivers highly reliable data transport capabilities optimized for the requirements of data-intensive science. In essence, ESnet is the circulatory system that enables the DOE science mission by connecting all its laboratories and facilities in the US and abroad. ESnet is funded and stewarded by the Advanced Scientific Computing Research (ASCR) program and managed and operated by the Scientific Networking Division at Lawrence Berkeley National Laboratory (LBNL). ESnet is widely regarded as a global leader in the research and education (R&amp;E) networking community.

Between August 2022 and April 2023, ESnet and the Office of Biological and Environmental Research (BER) of the DOE SC organized an ESnet requirements review of BER-supported activities. Preparation for these events included identification of key stakeholders: program and...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/3mz7h3mm</guid>
      <pubDate>Thu, 24 Aug 2023 00:00:00 +0000</pubDate>
      <author>
        <name>Zurawski, Jason</name>
        <uri>https://orcid.org/0000-0001-8389-4705</uri>
      </author>
      <author>
        <name>Dart, Eli</name>
      </author>
      <author>
        <name>Harlan, Zach</name>
      </author>
      <author>
        <name>Hawk, Carol</name>
      </author>
      <author>
        <name>Hess, John</name>
      </author>
      <author>
        <name>Hnilo, Justin</name>
      </author>
      <author>
        <name>MacAuley, John</name>
      </author>
      <author>
        <name>Madupu, Ramana</name>
      </author>
      <author>
        <name>Miller, Ken</name>
      </author>
      <author>
        <name>Tracy, Chris</name>
      </author>
      <author>
        <name>Wiedlea, Andrew</name>
      </author>
    </item>
    <item>
      <title>AIIO: Using Artificial Intelligence for Job-Level and Automatic I/O Performance Bottleneck Diagnosis</title>
      <link>https://escholarship.org/uc/item/0dd0v40x</link>
      <description>Manually diagnosing the I/O performance bottleneck for a single application (hereinafter referred to as the "job level") is a tedious and error-prone procedure requiring domain scientists to have deep knowledge of complex storage systems. However, existing automatic methods for I/O performance bottleneck diagnosis have one major issue: the granularity of the analysis is at the platform or group level and the diagnosis results cannot be applied to the individual application. To address this issue, we designed and developed a method named "Artificial Intelligence for I/O" (AIIO), which uses AI and its interpretation technology to diagnose I/O performance bottlenecks at the job level automatically. By considering the sparsity of I/O log files, employing multiple AI models for performance prediction, merging diagnosis results across multiple models, and generalizing its performance prediction and diagnosis functions, AIIO can accurately and robustly identify the bottleneck of an...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/0dd0v40x</guid>
      <pubDate>Mon, 21 Aug 2023 00:00:00 +0000</pubDate>
      <author>
        <name>Dong, Bin</name>
      </author>
      <author>
        <name>Bez, Jean Luca</name>
        <uri>https://orcid.org/0000-0002-3915-1135</uri>
      </author>
      <author>
        <name>Byna, Suren</name>
        <uri>https://orcid.org/0000-0003-3048-3448</uri>
      </author>
    </item>
    <item>
      <title>Leveraging History to Predict Infrequent Abnormal Transfers in Distributed Workflows</title>
      <link>https://escholarship.org/uc/item/5tt947v2</link>
      <description>Scientific computing heavily relies on data shared by the community, especially in distributed data-intensive applications. This research focuses on predicting slow connections that create bottlenecks in distributed workflows. In this study, we analyze network traffic logs collected between January 2021 and August 2022 at the National Energy Research Scientific Computing Center (NERSC). Based on the observed patterns, we define a set of features primarily based on history for identifying low-performing data transfers. Typically, there are far fewer slow connections on well-maintained networks, which creates difficulty in learning to identify these abnormally slow connections from the normal ones. We devise several stratified sampling techniques to address the class-imbalance challenge and study how they affect the machine learning approaches. Our tests show that a relatively simple technique that undersamples the normal cases to balance the number of samples in two classes (normal...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/5tt947v2</guid>
      <pubDate>Tue, 20 Jun 2023 00:00:00 +0000</pubDate>
      <author>
        <name>Shao, Robin</name>
      </author>
      <author>
        <name>Sim, Alex</name>
        <uri>https://orcid.org/0000-0002-6295-1982</uri>
      </author>
      <author>
        <name>Wu, Kesheng</name>
      </author>
      <author>
        <name>Kim, Jinoh</name>
      </author>
    </item>
    <item>
      <title>Design and implementation of I/O performance prediction scheme on HPC systems through large-scale log analysis</title>
      <link>https://escholarship.org/uc/item/4b4567xs</link>
      <description>Large-scale high performance computing (HPC) systems typically consist of many thousands of CPUs and storage units used by hundreds to thousands of users simultaneously. Applications from large numbers of users have diverse characteristics, such as varying computation, communication, memory, and I/O intensity. A good understanding of the performance characteristics of each user application is important for job scheduling and resource provisioning. Among these performance characteristics, I/O performance is becoming increasingly important as data sizes rapidly increase and large-scale applications, such as simulation and model training, are widely adopted. However, predicting I/O performance is difficult because I/O systems are shared among all users and involve many layers of software and hardware stack, including the application, network interconnect, operating system, file system, and storage devices. Furthermore, updates to these layers and changes in system management policy...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/4b4567xs</guid>
      <pubDate>Tue, 20 Jun 2023 00:00:00 +0000</pubDate>
      <author>
        <name>Kim, Sunggon</name>
      </author>
      <author>
        <name>Sim, Alex</name>
        <uri>https://orcid.org/0000-0002-6295-1982</uri>
      </author>
      <author>
        <name>Wu, Kesheng</name>
      </author>
      <author>
        <name>Byna, Suren</name>
        <uri>https://orcid.org/0000-0003-3048-3448</uri>
      </author>
      <author>
        <name>Son, Yongseok</name>
      </author>
    </item>
    <item>
      <title>Effectiveness and predictability of in-network storage cache for Scientific Workflows</title>
      <link>https://escholarship.org/uc/item/1507s9df</link>
      <description>Large scientific collaborations often have multiple scientists accessing the same set of files while doing different analyses, which creates repeated accesses to the large amounts of shared data located far away. These data accesses have long latency due to distance and occupy the limited bandwidth available over the wide-area network. To reduce the wide-area network traffic and the data access latency, regional data storage caches have been installed as a new networking service. To study the effectiveness of such a cache system in scientific applications, we examine the Southern California Petabyte Scale Cache for a high-energy physics experiment. By examining about 3 TB of operational logs, we show that this cache removed 67.6% of file requests from the wide-area network and reduced the traffic volume on the wide-area network by 12.3 TB (or 35.4%) on an average day. The reduction in the traffic volume (35.4%) is less than the reduction in file counts (67.6%) because the larger files...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/1507s9df</guid>
      <pubDate>Tue, 20 Jun 2023 00:00:00 +0000</pubDate>
      <author>
        <name>Sim, Caitlin</name>
      </author>
      <author>
        <name>Wu, Kesheng</name>
      </author>
      <author>
        <name>Sim, Alex</name>
        <uri>https://orcid.org/0000-0002-6295-1982</uri>
      </author>
      <author>
        <name>Monga, Inder</name>
        <uri>https://orcid.org/0000-0003-4524-0457</uri>
      </author>
      <author>
        <name>Guok, Chin</name>
        <uri>https://orcid.org/0000-0003-4532-1222</uri>
      </author>
      <author>
        <name>Würthwein, Frank</name>
      </author>
      <author>
        <name>Davila, Diego</name>
      </author>
      <author>
        <name>Newman, Harvey</name>
      </author>
      <author>
        <name>Balcas, Justas</name>
      </author>
    </item>
    <item>
      <title>Identifying Time Series Similarity in Large-Scale Earth System Datasets</title>
      <link>https://escholarship.org/uc/item/01h5s1jc</link>
      <description>Identifying Time Series Similarity in Large-Scale Earth System Datasets</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/01h5s1jc</guid>
      <pubDate>Tue, 6 Jun 2023 00:00:00 +0000</pubDate>
      <author>
        <name>Linton, Payton</name>
      </author>
      <author>
        <name>Melodia, William</name>
      </author>
      <author>
        <name>Lazar, Alina</name>
      </author>
      <author>
        <name>Agarwal, Deborah</name>
        <uri>https://orcid.org/0000-0001-5045-2396</uri>
      </author>
      <author>
        <name>Bianchi, Ludovico</name>
      </author>
      <author>
        <name>Ghoshal, Devarshi</name>
        <uri>https://orcid.org/0000-0002-6819-6949</uri>
      </author>
      <author>
        <name>Wu, Kesheng</name>
      </author>
      <author>
        <name>Pastorello, Gilberto</name>
        <uri>https://orcid.org/0000-0002-9387-3702</uri>
      </author>
      <author>
        <name>Ramakrishnan, Lavanya</name>
      </author>
    </item>
    <item>
      <title>Snowmass 2021 Computational Frontier CompF4 Topical Group Report Storage and Processing Resource Access</title>
      <link>https://escholarship.org/uc/item/47b4737w</link>
      <description>Snowmass 2021 Computational Frontier CompF4 Topical Group Report Storage and Processing Resource Access</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/47b4737w</guid>
      <pubDate>Tue, 23 May 2023 00:00:00 +0000</pubDate>
      <author>
        <name>Bhimji, W</name>
      </author>
      <author>
        <name>Carder, D</name>
        <uri>https://orcid.org/0000-0001-8357-0997</uri>
      </author>
      <author>
        <name>Dart, E</name>
        <uri>https://orcid.org/0000-0002-8229-5433</uri>
      </author>
      <author>
        <name>Duarte, J</name>
        <uri>https://orcid.org/0000-0002-5076-7096</uri>
      </author>
      <author>
        <name>Fisk, I</name>
      </author>
      <author>
        <name>Gardner, R</name>
      </author>
      <author>
        <name>Guok, C</name>
        <uri>https://orcid.org/0000-0003-4532-1222</uri>
      </author>
      <author>
        <name>Jayatilaka, B</name>
      </author>
      <author>
        <name>Lehman, T</name>
      </author>
      <author>
        <name>Lin, M</name>
      </author>
      <author>
        <name>Maltzahn, C</name>
        <uri>https://orcid.org/0000-0001-8305-0748</uri>
      </author>
      <author>
        <name>McKee, S</name>
      </author>
      <author>
        <name>Neubauer, MS</name>
      </author>
      <author>
        <name>Rind, O</name>
      </author>
      <author>
        <name>Shadura, O</name>
      </author>
      <author>
        <name>Tran, NV</name>
      </author>
      <author>
        <name>van Gemmeren, P</name>
      </author>
      <author>
        <name>Watts, G</name>
      </author>
      <author>
        <name>Weaver, BA</name>
      </author>
      <author>
        <name>Würthwein, F</name>
      </author>
    </item>
    <item>
      <title>Use It or Lose It: Cheap Compute Everywhere</title>
      <link>https://escholarship.org/uc/item/60w672j2</link>
      <description>Moore’s Law is tapering off, but FLOPS per dollar continues to grow. Inexpensive CPUs are emerging everywhere from network to storage as an effective way of managing and deploying hardware and firmware as well as providing services close to the data path. Examples of this include ARM cores within Mellanox Bluefield, Broadcom Stingray DPUs, switches, and compute in storage. This additional processing power can be useful for (1) enabling higher throughput, (2) decreasing or hiding latency, (3) increasing power/cost efficiency, (4) alleviating contention for oversubscribed resources. In order to make these additional resources available to a wide range of services and applications we must first develop: (1) an understanding of the strengths and weaknesses of the hardware, (2) an understanding of how portions of a workload might be decomposed into tasks for offload, (3) abstractions to allow code portability on the heterogeneous components. We take a look at existing hardware trends...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/60w672j2</guid>
      <pubDate>Wed, 26 Apr 2023 00:00:00 +0000</pubDate>
      <author>
        <name>Groves, Taylor</name>
      </author>
      <author>
        <name>Hazen, Damian</name>
      </author>
      <author>
        <name>Lockwood, Glenn</name>
        <uri>https://orcid.org/0000-0002-9241-9372</uri>
      </author>
      <author>
        <name>Wright, Nicholas J</name>
        <uri>https://orcid.org/0000-0003-1883-6108</uri>
      </author>
    </item>
    <item>
      <title>National Institute of Standards and Technology Requirements (Analysis Report)</title>
      <link>https://escholarship.org/uc/item/704411nv</link>
      <description>EPOC uses the Deep Dive process to discuss and analyze current and planned science, research, or education activities and the anticipated data output of a particular use case, site, or project to help inform the strategic planning of a campus or regional networking environment. This includes understanding future needs related to network operations, network capacity upgrades, and other technological service investments. A Deep Dive comprehensively surveys major research stakeholders’ plans and processes in order to investigate data management requirements over the next 5–10 years. 

In October of 2022, staff members from the Engagement and Performance Operations Center (EPOC) met with researchers and staff from the National Institute of Standards and Technology (NIST) for the purpose of a Deep Dive into scientific and research drivers. The goal of this activity was to help characterize the requirements for a number of campus use cases, and to enable cyberinfrastructure support...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/704411nv</guid>
      <pubDate>Fri, 21 Apr 2023 00:00:00 +0000</pubDate>
      <author>
        <name>Zurawski, Jason</name>
        <uri>https://orcid.org/0000-0001-8389-4705</uri>
      </author>
      <author>
        <name>Schopf, Jennifer</name>
      </author>
    </item>
    <item>
      <title>Real-time and post-hoc compression for data from Distributed Acoustic Sensing</title>
      <link>https://escholarship.org/uc/item/9f38j545</link>
      <description>Distributed Acoustic Sensing (DAS) is an emerging sensing technology that records the strain-rate along fiber optic cables at high spatial and temporal resolution. This technique is becoming a popular tool in seismology, hydrology, and other subsurface monitoring applications. However, due to the large coverage (tens of km) and high density of measurements (1 m spacing at hundreds of Hz), a DAS installation could produce terabytes of data records per day. Because many DAS instruments are deployed in remote locations, this large data size poses significant challenges to its transfer and storage. In this paper, we explore lossless compression methods to reduce the storage requirement in both real-time and post-hoc scenarios. We propose a two-stage compression method to improve the compression ratio and compression speed. This two-stage compression method could reduce the storage requirement by 40%, which is 20% more than other lossless methods, such as ZSTD. We demonstrate that the...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/9f38j545</guid>
      <pubDate>Mon, 6 Mar 2023 00:00:00 +0000</pubDate>
      <author>
        <name>Dong, Bin</name>
      </author>
      <author>
        <name>Popescu, Alex</name>
      </author>
      <author>
        <name>Tribaldos, Verónica Rodríguez</name>
      </author>
      <author>
        <name>Byna, Suren</name>
        <uri>https://orcid.org/0000-0003-3048-3448</uri>
      </author>
      <author>
        <name>Ajo-Franklin, Jonathan</name>
      </author>
      <author>
        <name>Wu, Kesheng</name>
      </author>
      <author>
        <name>Imperial Valley Dark Fiber Team</name>
      </author>
    </item>
    <item>
      <title>Studying Scientific Data Lifecycle in On-demand Distributed Storage Caches</title>
      <link>https://escholarship.org/uc/item/3ks4r91k</link>
      <description>The XRootD system is used to transfer, store, and cache large datasets from high-energy physics (HEP). In this study we focus on its capability as a distributed on-demand storage cache. Through exploring a large set of daily log files between 2020 and 2021, we seek to understand the data access patterns that might inform future cache design. Our study begins with a set of summary statistics regarding file read operations, file lifetimes, and file transfers. We observe that the number of read operations on each file remains nearly constant, while the average size of a read operation grows over time. Furthermore, files tend to have a consistent length of time during which they remain open and are in use. Based on this comprehensive study of the cache access statistics, we developed a cache simulator to explore the behavior of caches of different sizes. Within a certain size range, we find that increasing the XRootD cache size improves the cache hit rate, yielding faster overall file...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/3ks4r91k</guid>
      <pubDate>Tue, 28 Feb 2023 00:00:00 +0000</pubDate>
      <author>
        <name>Bellavita, Julian</name>
      </author>
      <author>
        <name>Sim, Alex</name>
        <uri>https://orcid.org/0000-0002-6295-1982</uri>
      </author>
      <author>
        <name>Wu, Kesheng</name>
      </author>
      <author>
        <name>Monga, Inder</name>
        <uri>https://orcid.org/0000-0003-4524-0457</uri>
      </author>
      <author>
        <name>Guok, Chin</name>
        <uri>https://orcid.org/0000-0003-4532-1222</uri>
      </author>
      <author>
        <name>Würthwein, Frank</name>
      </author>
      <author>
        <name>Davila, Diego</name>
      </author>
    </item>
    <item>
      <title>Locating Partial Discharges in Power Transformers with Convolutional Iterative Filtering</title>
      <link>https://escholarship.org/uc/item/2k81z2xx</link>
      <description>The most common source of transformer failure is in the insulation, and the most prevalent warning signal for insulation weakness is partial discharge (PD). Locating the positions of these partial discharges would help repair the transformer to prevent failures. This work investigates algorithms that could be deployed to locate the position of a PD event using data from ultra-high frequency (UHF) sensors inside the transformer. These algorithms typically proceed in two steps: first determining the signal arrival time, and then locating the position based on time differences. This paper reviews available methods for each task and then proposes new algorithms: a convolutional iterative filter with thresholding (CIFT) to determine the signal arrival time and a reference table of travel times to resolve the source location. The effectiveness of these algorithms is tested with a set of laboratory-triggered PD events and two sets of simulated PD events inside transformers in production...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/2k81z2xx</guid>
      <pubDate>Tue, 28 Feb 2023 00:00:00 +0000</pubDate>
      <author>
        <name>Wang, Jonathan</name>
      </author>
      <author>
        <name>Wu, Kesheng</name>
      </author>
      <author>
        <name>Sim, Alex</name>
        <uri>https://orcid.org/0000-0002-6295-1982</uri>
      </author>
      <author>
        <name>Hwangbo, Seongwook</name>
      </author>
    </item>
    <item>
      <title>Access Trends of In-network Cache for Scientific Data</title>
      <link>https://escholarship.org/uc/item/03r8w0sb</link>
      <description>Scientific collaborations are increasingly relying on large volumes of data for their work and many of them employ tiered systems to replicate the data to their worldwide user communities. Each user in the community often selects a different subset of data for their analysis tasks; however, members of a research group often are working on related research topics that require similar data objects. Thus, there is a significant amount of data sharing possible. In this work, we study the access traces of a federated storage cache known as the Southern California Petabyte Scale Cache. By studying the access patterns and potential for network traffic reduction by this caching system, we aim to explore the predictability of cache use and the potential for more general in-network data caching. Our study shows that this distributed storage cache is able to reduce the network traffic volume by a factor of 2.35 during a part of the study period. We further show that machine learning...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/03r8w0sb</guid>
      <pubDate>Tue, 28 Feb 2023 00:00:00 +0000</pubDate>
      <author>
        <name>Han, Ruize</name>
      </author>
      <author>
        <name>Sim, Alex</name>
        <uri>https://orcid.org/0000-0002-6295-1982</uri>
      </author>
      <author>
        <name>Wu, Kesheng</name>
      </author>
      <author>
        <name>Monga, Inder</name>
        <uri>https://orcid.org/0000-0003-4524-0457</uri>
      </author>
      <author>
        <name>Guok, Chin</name>
        <uri>https://orcid.org/0000-0003-4532-1222</uri>
      </author>
      <author>
        <name>Würthwein, Frank</name>
      </author>
      <author>
        <name>Davila, Diego</name>
      </author>
      <author>
        <name>Balcas, Justas</name>
      </author>
      <author>
        <name>Newman, Harvey</name>
      </author>
    </item>
    <item>
      <title>St. Mary’s University Requirements Analysis Report</title>
      <link>https://escholarship.org/uc/item/5d65w10j</link>
      <description>EPOC uses the Deep Dive process to discuss and analyze current and planned science, research, or education activities and the anticipated data output of a particular use case, site, or project to help inform the strategic planning of a campus or regional networking environment. This includes understanding future needs related to network operations, network capacity upgrades, and other technological service investments. A Deep Dive comprehensively surveys major research stakeholders’ plans and processes in order to investigate data management requirements over the next 5–10 years. 

Between September and December 2022, staff members from the Engagement and Performance Operations Center (EPOC) met with researchers and staff from LEARN and St. Mary’s University for the purpose of a Deep Dive into scientific and research drivers. The goal of this activity was to help characterize the requirements for a number of campus use cases, and to enable cyberinfrastructure support staff to...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/5d65w10j</guid>
      <pubDate>Mon, 16 Jan 2023 00:00:00 +0000</pubDate>
      <author>
        <name>Zurawski, Jason</name>
        <uri>https://orcid.org/0000-0001-8389-4705</uri>
      </author>
      <author>
        <name>Schopf, Jennifer</name>
      </author>
      <author>
        <name>Southworth, Douglas</name>
      </author>
      <author>
        <name>Gamble, Austin</name>
      </author>
      <author>
        <name>Hicks, Byron</name>
      </author>
      <author>
        <name>Schultz, Amy</name>
      </author>
    </item>
    <item>
      <title>Arizona State University Requirements Analysis Report</title>
      <link>https://escholarship.org/uc/item/03r3h2d5</link>
      <description>EPOC uses the Deep Dive process to discuss and analyze current and planned science, research, or education activities and the anticipated data output of a particular use case, site, or project to help inform the strategic planning of a campus or regional networking environment. This includes understanding future needs related to network operations, network capacity upgrades, and other technological service investments. A Deep Dive comprehensively surveys major research stakeholders’ plans and processes in order to investigate data management requirements over the next 5–10 years.

Between October 2021 and February 2022, staff members from the Engagement and Performance Operations Center (EPOC) met with researchers and staff from Arizona State University (ASU) for the purpose of a Deep Dive into scientific and research drivers. The goal of this activity was to help characterize the requirements for a number of campus use cases, and to enable cyberinfrastructure support staff...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/03r3h2d5</guid>
      <pubDate>Wed, 11 Jan 2023 00:00:00 +0000</pubDate>
      <author>
        <name>Zurawski, Jason</name>
        <uri>https://orcid.org/0000-0001-8389-4705</uri>
      </author>
      <author>
        <name>Southworth, Douglas</name>
      </author>
      <author>
        <name>Meade, Brenna</name>
      </author>
    </item>
    <item>
      <title>Transport Layer Networking</title>
      <link>https://escholarship.org/uc/item/1jz6k6kk</link>
      <description>In this paper we focus on the invention of new network forwarding behaviors between Layer 4 and Layer 7 of the OSI network model. Our design goal is to propose no changes to L3, the IP network layer, thus maintaining 100% compatibility with the existing internet. Small changes are made to L4, the transport layer, and a new design for a session layer (L5) is proposed. This new capability is intended to have minimal or no impact on the application layer, except for exposing the ability for L7 to select this new mode of data transfer or not. The invention of new networking technologies is frequently done in an academic setting; however, the design needs to be constrained by practical considerations for cost, operational feasibility, robustness, and scale. Our goal is to improve the production data infrastructure for HEP 24/7 on a global scale.</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/1jz6k6kk</guid>
      <pubDate>Tue, 13 Dec 2022 00:00:00 +0000</pubDate>
      <author>
        <name>Kumar, Yatish</name>
      </author>
      <author>
        <name>Sheldon, Stacey</name>
      </author>
      <author>
        <name>Carder, Dale</name>
        <uri>https://orcid.org/0000-0001-8357-0997</uri>
      </author>
    </item>
    <item>
      <title>Levenberg–Marquardt multi-classification using hinge loss function</title>
      <link>https://escholarship.org/uc/item/1vj2492x</link>
      <description>Incorporating higher-order optimization functions, such as Levenberg-Marquardt (LM), has yielded more generalizable solutions for deep learning problems. However, these higher-order optimization functions suffer from very long processing times and high training complexity, especially as training datasets become large, such as in multi-view classification problems, where finding global optima is very costly. To solve this issue, we develop a solution for LM-enabled classification with, to the best of our knowledge, the first implementation of hinge loss for multi-view classification. Hinge loss allows the neural network to converge faster and perform better than other loss functions such as logistic or square loss. We validate our method by experimenting with various multiclass classification challenges of varying complexity and training data size. The empirical results show the training time and accuracy rates achieved, highlighting how our method outperforms in all cases,...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/1vj2492x</guid>
      <pubDate>Thu, 8 Dec 2022 00:00:00 +0000</pubDate>
      <author>
        <name>Ozyildirim, Buse Melis</name>
      </author>
      <author>
        <name>Kiran, Mariam</name>
      </author>
    </item>
    <item>
      <title>High Energy Physics Network Requirements Review: One-Year Update</title>
      <link>https://escholarship.org/uc/item/65h4t5zc</link>
      <description>The Energy Sciences Network (ESnet) is the high-performance network user facility for the US Department of Energy (DOE) Office of Science (SC) and delivers highly reliable data transport capabilities optimized for the requirements of data-intensive science. In essence, ESnet is the circulatory system that enables the DOE science mission by connecting all its laboratories and facilities in the US and abroad. ESnet is funded and stewarded by the Advanced Scientific Computing Research (ASCR) program and managed and operated by the Scientific Networking Division at Lawrence Berkeley National Laboratory (LBNL). ESnet is widely regarded as a global leader in the research and education (R&amp;E) networking community.

In April 2022, ESnet and the Office of High Energy Physics (HEP) of the DOE SC organized an ESnet requirements review of HEP-supported activities. Preparation for the review included identification of key stakeholders: program and facility management, research groups, and...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/65h4t5zc</guid>
      <pubDate>Wed, 7 Dec 2022 00:00:00 +0000</pubDate>
      <author>
        <name>Zurawski, Jason</name>
        <uri>https://orcid.org/0000-0001-8389-4705</uri>
      </author>
      <author>
        <name>Carder, Dale</name>
        <uri>https://orcid.org/0000-0001-8357-0997</uri>
      </author>
      <author>
        <name>Colby, Eric</name>
      </author>
      <author>
        <name>Dart, Eli</name>
        <uri>https://orcid.org/0000-0002-8229-5433</uri>
      </author>
      <author>
        <name>Hawk, Carol</name>
      </author>
      <author>
        <name>Hier-Majumder, Saswata</name>
      </author>
      <author>
        <name>Miller, Bill</name>
      </author>
      <author>
        <name>Miller, Ken</name>
      </author>
      <author>
        <name>Patwa, Abid</name>
      </author>
      <author>
        <name>Robinson, Kate</name>
      </author>
      <author>
        <name>Rotman, Lauren</name>
      </author>
      <author>
        <name>Wiedlea, Andrew</name>
      </author>
    </item>
    <item>
      <title>CRADA Final Report: RouteViews project</title>
      <link>https://escholarship.org/uc/item/6cz2s19c</link>
      <description>Final report of the RouteViews project via CRADA # FP00009959. This was an exchange of FTE resources.</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/6cz2s19c</guid>
      <pubDate>Tue, 6 Dec 2022 00:00:00 +0000</pubDate>
      <author>
        <name>Mace, Kathryn</name>
        <uri>https://orcid.org/0000-0003-0307-8738</uri>
      </author>
    </item>
    <item>
      <title>Basic Energy Sciences Network Requirements Review (Final Report)</title>
      <link>https://escholarship.org/uc/item/3jj0h54n</link>
      <description>The Energy Sciences Network (ESnet) is the high-performance network user facility for the US Department of Energy (DOE) Office of Science (SC) and delivers highly reliable data transport capabilities optimized for the requirements of data-intensive science. In essence, ESnet is the circulatory system that enables the DOE science mission by connecting all of its laboratories and facilities in the US and abroad. ESnet is funded and stewarded by the Advanced Scientific Computing Research (ASCR) program and managed and operated by the Scientific Networking Division at Lawrence Berkeley National Laboratory (LBNL). ESnet is widely regarded as a global leader in the research and education networking community.


Between March and September 2022, ESnet and the Office of Basic Energy Sciences (BES) of the DOE SC organized an ESnet requirements review of BES-supported activities. Preparation for these events included identification of key stakeholders: program and facility management, research...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/3jj0h54n</guid>
      <pubDate>Tue, 22 Nov 2022 00:00:00 +0000</pubDate>
      <author>
        <name>Carder, Dale</name>
        <uri>https://orcid.org/0000-0001-8357-0997</uri>
      </author>
      <author>
        <name>Dart, Eli</name>
        <uri>https://orcid.org/0000-0002-8229-5433</uri>
      </author>
      <author>
        <name>Graf, Matthias</name>
      </author>
      <author>
        <name>Hawk, Carol</name>
      </author>
      <author>
        <name>Holder, Aaron</name>
      </author>
      <author>
        <name>Jacob, Dylan</name>
      </author>
      <author>
        <name>Lessner, Eliane</name>
      </author>
      <author>
        <name>Miller, Kenneth</name>
      </author>
      <author>
        <name>Rotermund, Cody</name>
      </author>
      <author>
        <name>Russell, Thomas</name>
      </author>
      <author>
        <name>Sefat, Athena</name>
      </author>
      <author>
        <name>Wiedlea, Andrew</name>
      </author>
      <author>
        <name>Zurawski, Jason</name>
        <uri>https://orcid.org/0000-0001-8389-4705</uri>
      </author>
    </item>
    <item>
      <title>Design and implementation of dynamic I/O control scheme for large scale distributed file systems</title>
      <link>https://escholarship.org/uc/item/0cc4357f</link>
      <description>In this work, we have analyzed the input/output (I/O) activities of Cori, which is a high-performance computing system at the National Energy Research Scientific Computing Center at Lawrence Berkeley National Laboratory. Our analysis results indicate that most users do not adjust storage configurations but rather use the default settings. In addition, owing to the interference from many applications running simultaneously, the performance varies based on the system status. To configure file systems autonomously in complex environments, we developed DCA-IO, a dynamic distributed file system configuration adjustment algorithm that utilizes the system log information to adjust storage configurations automatically. Our scheme aims to improve the application performance and avoid interference from other applications without user intervention. Moreover, DCA-IO uses the existing system logs and does not require code modifications, an additional library, or user intervention. To demonstrate...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/0cc4357f</guid>
      <pubDate>Wed, 31 Aug 2022 00:00:00 +0000</pubDate>
      <author>
        <name>Kim, Sunggon</name>
      </author>
      <author>
        <name>Sim, Alex</name>
        <uri>https://orcid.org/0000-0002-6295-1982</uri>
      </author>
      <author>
        <name>Wu, Kesheng</name>
      </author>
      <author>
        <name>Byna, Suren</name>
        <uri>https://orcid.org/0000-0003-3048-3448</uri>
      </author>
      <author>
        <name>Son, Yongseok</name>
      </author>
    </item>
    <item>
      <title>San Diego State University Requirements Analysis Report</title>
      <link>https://escholarship.org/uc/item/82d7b7b5</link>
      <description>EPOC uses the Deep Dive process to discuss and analyze current and planned science, research, or education activities and the anticipated data output of a particular use case, site, or project to help inform the strategic planning of a campus or regional networking environment. This includes understanding future needs related to network operations, network capacity upgrades, and other technological service investments. A Deep Dive comprehensively surveys major research stakeholders’ plans and processes in order to investigate data management requirements over the next 5–10 years. 

Between December 2021 and April 2022, staff members from the Engagement and Performance Operations Center (EPOC) met with researchers and staff from San Diego State University (SDSU) for the purpose of a Deep Dive into scientific and research drivers. The goal of this activity was to help characterize the requirements for a number of campus use cases, and to enable cyberinfrastructure support staff to better...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/82d7b7b5</guid>
      <pubDate>Tue, 23 Aug 2022 00:00:00 +0000</pubDate>
      <author>
        <name>Zurawski, Jason</name>
        <uri>https://orcid.org/0000-0001-8389-4705</uri>
      </author>
    </item>
    <item>
      <title>ARIES Network Requirements Review</title>
      <link>https://escholarship.org/uc/item/5x5563kx</link>
      <description>The Energy Sciences Network (ESnet) is the high-performance network user facility for the US Department of Energy (DOE) Office of Science (SC) and delivers highly reliable data transport capabilities optimized for the requirements of data-intensive science. In essence, ESnet is the circulatory system that enables the DOE science mission by connecting all of its laboratories and facilities in the US and abroad. ESnet is funded and stewarded by the Advanced Scientific Computing Research (ASCR) program and managed and operated by the Scientific Networking Division at Lawrence Berkeley National Laboratory (LBNL). ESnet is widely regarded as a global leader in the research and education networking community.

On May 1, 2021, ESnet and the DOE Office of Energy Efficiency and Renewable Energy (EERE) organized an ESnet requirements review of the ARIES (Advanced Research on Integrated Energy Systems) platform. Preparation for this event included identification of key stakeholders to the...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/5x5563kx</guid>
      <pubDate>Tue, 23 Aug 2022 00:00:00 +0000</pubDate>
      <author>
        <name>Zurawski, Jason</name>
        <uri>https://orcid.org/0000-0001-8389-4705</uri>
      </author>
      <author>
        <name>Dart, Eli</name>
        <uri>https://orcid.org/0000-0002-8229-5433</uri>
      </author>
      <author>
        <name>Rotman, Lauren</name>
      </author>
      <author>
        <name>Wiedlea, Andrew</name>
      </author>
      <author>
        <name>Miller, Ken</name>
      </author>
    </item>
    <item>
      <title>Sinclair Community College and OARnet Requirements Analysis Report</title>
      <link>https://escholarship.org/uc/item/0th6628f</link>
      <description>EPOC uses the Deep Dive process to discuss and analyze current and planned science, research, or education activities and the anticipated data output of a particular use case, site, or project to help inform the strategic planning of a campus or regional networking environment. This includes understanding future needs related to network operations, network capacity upgrades, and other technological service investments. A Deep Dive comprehensively surveys major research stakeholders’ plans and processes in order to investigate data management requirements over the next 5–10 years. 

Between October 2021 and March 2022, staff members from the Engagement and Performance Operations Center (EPOC) met with researchers and staff from Sinclair Community College (SCC) and OARnet with the purpose of performing a Deep Dive into scientific and research drivers. The goal of this activity was to help characterize the requirements for a number of campus use cases, and to enable cyberinfrastructure...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/0th6628f</guid>
      <pubDate>Tue, 23 Aug 2022 00:00:00 +0000</pubDate>
      <author>
        <name>Zurawski, Jason</name>
        <uri>https://orcid.org/0000-0001-8389-4705</uri>
      </author>
    </item>
    <item>
      <title>The NetSage measurement and analysis framework in practice</title>
      <link>https://escholarship.org/uc/item/5vt9q733</link>
      <description>Data sharing is required for research collaborations, but effective data transfer performance continues to be difficult to achieve. The NetSage Measurement and Analysis Framework can assist in understanding research data movement. It collects a broad set of monitoring data and builds performance Dashboards to visualize the data. Each Dashboard is specifically designed to address a well-defined analysis need of the stakeholders. This paper describes the design methodology, the resulting architecture, the development approach and lessons learned, and a set of discoveries that NetSage Dashboards made possible.</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/5vt9q733</guid>
      <pubDate>Mon, 1 Aug 2022 00:00:00 +0000</pubDate>
      <author>
        <name>Schopf, Jennifer M</name>
      </author>
      <author>
        <name>Turner, Katrina</name>
      </author>
      <author>
        <name>Doyle, Dan</name>
      </author>
      <author>
        <name>Lake, Andrew</name>
      </author>
      <author>
        <name>Leigh, Jason</name>
      </author>
      <author>
        <name>Tierney, Brian L</name>
      </author>
    </item>
    <item>
      <title>South Plains College Requirements Analysis Report</title>
      <link>https://escholarship.org/uc/item/9k29338w</link>
      <description>EPOC uses the Deep Dive process to discuss and analyze current and planned science, research, or education activities and the anticipated data output of a particular use case, site, or project to help inform the strategic planning of a campus or regional networking environment. This includes understanding future needs related to network operations, network capacity upgrades, and other technological service investments. A Deep Dive comprehensively surveys major research stakeholders’ plans and processes in order to investigate data management requirements over the next 5–10 years. 

Between October 2021 and January 2022, staff members from the Engagement and Performance Operations Center (EPOC) met with researchers and staff from LEARN and South Plains College (SPC) for the purpose of a Deep Dive into scientific and research drivers. The goal of this activity was to help characterize the requirements for a number of campus use cases, and to enable cyberinfrastructure support staff...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/9k29338w</guid>
      <pubDate>Tue, 12 Jul 2022 00:00:00 +0000</pubDate>
      <author>
        <name>Zurawski, Jason</name>
        <uri>https://orcid.org/0000-0001-8389-4705</uri>
      </author>
    </item>
    <item>
      <title>McLennan Community College Requirements Analysis Report</title>
      <link>https://escholarship.org/uc/item/3j40h3cs</link>
      <description>EPOC uses the Deep Dive process to discuss and analyze current and planned science, research, or education activities and the anticipated data output of a particular use case, site, or project to help inform the strategic planning of a campus or regional networking environment. This includes understanding future needs related to network operations, network capacity upgrades, and other technological service investments. A Deep Dive comprehensively surveys major research stakeholders’ plans and processes in order to investigate data management requirements over the next 5–10 years. 

In March of 2022, staff members from the Engagement and Performance Operations Center (EPOC) met with researchers and staff from LEARN and McLennan Community College (MCC) for the purpose of a Deep Dive into scientific and research drivers. The goal of this activity was to help characterize the requirements for a number of campus use cases, and to enable cyberinfrastructure support staff to better understand...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/3j40h3cs</guid>
      <pubDate>Tue, 12 Jul 2022 00:00:00 +0000</pubDate>
      <author>
        <name>Zurawski, Jason</name>
        <uri>https://orcid.org/0000-0001-8389-4705</uri>
      </author>
      <author>
        <name>Meade, Brenna</name>
      </author>
    </item>
    <item>
      <title>Midland College Requirements Analysis Report</title>
      <link>https://escholarship.org/uc/item/0q85n3mm</link>
      <description>EPOC uses the Deep Dive process to discuss and analyze current and planned science, research, or education activities and the anticipated data output of a particular use case, site, or project to help inform the strategic planning of a campus or regional networking environment. This includes understanding future needs related to network operations, network capacity upgrades, and other technological service investments. A Deep Dive comprehensively surveys major research stakeholders’ plans and processes in order to investigate data management requirements over the next 5–10 years. 

In April of 2022, staff members from the Engagement and Performance Operations Center (EPOC) met with researchers and staff from LEARN and Midland College for the purpose of a Deep Dive into scientific and research drivers. The goal of this activity was to help characterize the requirements for a number of campus use cases, and to enable cyberinfrastructure support staff to better understand the needs...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/0q85n3mm</guid>
      <pubDate>Tue, 12 Jul 2022 00:00:00 +0000</pubDate>
      <author>
        <name>Zurawski, Jason</name>
        <uri>https://orcid.org/0000-0001-8389-4705</uri>
      </author>
      <author>
        <name>Meade, Brenna</name>
      </author>
    </item>
    <item>
      <title>MEICAN: Simplifying DCN Life-Cycle Management from End-User and Operator Perspectives in Inter-Domain Environments</title>
      <link>https://escholarship.org/uc/item/7xm5z338</link>
      <description>National research and education networks (NRENs), such as ESnet, GéANT, and RNP, currently promote the employment of dynamic circuit networks (DCNs) to improve scientific communications beyond the capabilities of today's Internet. In spite of their alleged benefits to materialize user-initiated, ad hoc dedicated end-to-end circuits for high-demanding applications, DCNs also pose important challenges. Two examples are dealing with the end-user's lack of ability or willingness to understand low-level technicalities of virtual circuit establishment, and accommodating NRENs' local policies throughout the life-cycle of DCN. In this article, we seek to improve the usability of DCN services with a focus on the inexperienced end-user and the skilled network operator alike. We introduce MEICAN, a platform to manage the life-cycle of DCN from definition to provisioning of virtual circuits. We also present a case study of inter-domain circuit reservation with mixed manual and automated policy...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/7xm5z338</guid>
      <pubDate>Tue, 24 May 2022 00:00:00 +0000</pubDate>
      <author>
        <name>Wickboldt, Juliano Araujo</name>
      </author>
      <author>
        <name>Guerreiro, Maurício Quatrin</name>
      </author>
      <author>
        <name>Granville, Lisandro Zambenedetti</name>
      </author>
      <author>
        <name>Gaspary, Luciano Paschoal</name>
      </author>
      <author>
        <name>Schwarz, Marcos Felipe</name>
      </author>
      <author>
        <name>Guok, Chin</name>
        <uri>https://orcid.org/0000-0003-4532-1222</uri>
      </author>
      <author>
        <name>Chaniotakis, Vangelis</name>
      </author>
      <author>
        <name>Lake, Andrew</name>
        <uri>https://orcid.org/0000-0002-2228-6260</uri>
      </author>
      <author>
        <name>MacAuley, John</name>
      </author>
    </item>
    <item>
      <title>Fusion Energy Sciences Network Requirements Review (Final Report)</title>
      <link>https://escholarship.org/uc/item/8d5213bv</link>
      <description>The Energy Sciences Network (ESnet) is the high-performance network user facility for the US Department of Energy (DOE) Office of Science (SC) and delivers highly reliable data transport capabilities optimized for the requirements of data-intensive science. In essence, ESnet is the circulatory system that enables the DOE science mission by connecting all of its laboratories and facilities in the US and abroad. ESnet is funded and stewarded by the Advanced Scientific Computing Research (ASCR) program and managed and operated by the Scientific Networking Division at Lawrence Berkeley National Laboratory (LBNL). ESnet is widely regarded as a global leader in the research and education networking community.

Throughout 2021, ESnet and the Office of Fusion Energy Sciences (FES) of the DOE SC organized an ESnet requirements review of FES-supported activities. Preparation for these events included identification of key stakeholders: program and facility management, research groups, and...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/8d5213bv</guid>
      <pubDate>Mon, 23 May 2022 00:00:00 +0000</pubDate>
      <author>
        <name>Wala, Fatema</name>
      </author>
      <author>
        <name>Bonoli, Paul</name>
      </author>
      <author>
        <name>Brown, Ben</name>
      </author>
      <author>
        <name>Chang, CS</name>
      </author>
      <author>
        <name>Churchill, Michael</name>
      </author>
      <author>
        <name>Cook, Brandon</name>
      </author>
      <author>
        <name>Curry, Doug</name>
      </author>
      <author>
        <name>Dart, Eli</name>
      </author>
      <author>
        <name>Diallo, Ahmed</name>
      </author>
      <author>
        <name>Dorland, Bill</name>
      </author>
      <author>
        <name>Fiuza, Frederico</name>
      </author>
      <author>
        <name>Foster, Mark</name>
      </author>
      <author>
        <name>Gerber, Richard</name>
      </author>
      <author>
        <name>Green, David</name>
      </author>
      <author>
        <name>Guok, Chin</name>
      </author>
      <author>
        <name>Guttenfelder, Walter</name>
      </author>
      <author>
        <name>Hicks, Susan</name>
      </author>
      <author>
        <name>Hier-Majumder, Saswata</name>
      </author>
      <author>
        <name>Hughes, Jerry</name>
      </author>
      <author>
        <name>Kafader, James</name>
      </author>
      <author>
        <name>Kampel, Scott</name>
      </author>
      <author>
        <name>Kaye, Stan</name>
      </author>
      <author>
        <name>King, Josh</name>
      </author>
      <author>
        <name>Lau, Cornwell</name>
      </author>
      <author>
        <name>Mandrekas, John</name>
      </author>
      <author>
        <name>Meneghini, Orso</name>
      </author>
      <author>
        <name>Miller, Bill</name>
      </author>
      <author>
        <name>Miller, Ken</name>
      </author>
      <author>
        <name>Monga, Inder</name>
      </author>
      <author>
        <name>Nazikian, Raffi</name>
      </author>
      <author>
        <name>Nguyen, Jeff</name>
      </author>
      <author>
        <name>Poli, Francesca</name>
      </author>
      <author>
        <name>Rapp, Jurgen</name>
      </author>
      <author>
        <name>Riley, Katherine</name>
      </author>
      <author>
        <name>Rotman, Lauren</name>
      </author>
      <author>
        <name>Sabbagh, Steve</name>
      </author>
      <author>
        <name>Savage, Brandon</name>
      </author>
      <author>
        <name>Schissel, David</name>
      </author>
      <author>
        <name>Schumacher, Douglass</name>
      </author>
      <author>
        <name>Stephey, Laurie</name>
      </author>
      <author>
        <name>Stillerman, Josh</name>
      </author>
      <author>
        <name>Wiedlea, Andrew</name>
      </author>
      <author>
        <name>Youchison, Dennis</name>
      </author>
      <author>
        <name>Zurawski, Jason</name>
        <uri>https://orcid.org/0000-0001-8389-4705</uri>
      </author>
    </item>
    <item>
      <title>Enhancing IoT anomaly detection performance for federated learning</title>
      <link>https://escholarship.org/uc/item/81z5r3qm</link>
      <description>Federated Learning (FL) with mobile computing and the Internet of Things (IoT) is an effective cooperative learning approach. However, several technical challenges still need to be addressed. For instance, dividing the training process among several devices may impact the performance of Machine Learning (ML) algorithms, often significantly degrading prediction accuracy compared to centralized learning. One of the primary reasons for such performance degradation is that each device can access only a small fraction of data (that it generates), which limits the efficacy of the local ML model constructed on that device. The performance degradation could be exacerbated when the participating devices produce different classes of events, which is known as the class balance problem. Moreover, if the participating devices are of different types, each device may never observe the same types of events, which leads to the device heterogeneity problem. In this study, we investigate how data...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/81z5r3qm</guid>
      <pubDate>Tue, 12 Apr 2022 00:00:00 +0000</pubDate>
      <author>
        <name>Weinger, Brett</name>
      </author>
      <author>
        <name>Kim, Jinoh</name>
      </author>
      <author>
        <name>Sim, Alex</name>
        <uri>https://orcid.org/0000-0002-6295-1982</uri>
      </author>
      <author>
        <name>Nakashima, Makiya</name>
      </author>
      <author>
        <name>Moustafa, Nour</name>
      </author>
      <author>
        <name>Wu, K John</name>
      </author>
    </item>
    <item>
      <title>Near real-time streaming analysis of big fusion data</title>
      <link>https://escholarship.org/uc/item/8n79h7fs</link>
      <description>Experiments on fusion plasmas produce high-dimensional data time series with ever-increasing magnitude and velocity, but turn-around times for analysis of this data have not kept up. For example, many data analysis tasks are often performed in a manual, ad-hoc manner some time after an experiment. In this article, we introduce the Delta framework that facilitates near real-time streaming analysis of big and fast fusion data. By streaming measurement data from fusion experiments to a high-performance compute center, Delta allows computationally expensive data analysis tasks to be performed in between plasma pulses. This article describes the modular and expandable software architecture of Delta and presents performance benchmarks of individual components as well as of an example workflow. Focusing on a streaming analysis workflow where electron cyclotron emission imaging (ECEi) data is measured at KSTAR on the National Energy Research Scientific Computing Center’s (NERSC’s) supercomputer...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/8n79h7fs</guid>
      <pubDate>Tue, 15 Mar 2022 00:00:00 +0000</pubDate>
      <author>
        <name>Kube, R</name>
      </author>
      <author>
        <name>Churchill, RM</name>
      </author>
      <author>
        <name>Chang, CS</name>
      </author>
      <author>
        <name>Choi, J</name>
      </author>
      <author>
        <name>Wang, R</name>
      </author>
      <author>
        <name>Klasky, S</name>
      </author>
      <author>
        <name>Stephey, L</name>
      </author>
      <author>
        <name>Dart, E</name>
        <uri>https://orcid.org/0000-0002-8229-5433</uri>
      </author>
      <author>
        <name>Choi, MJ</name>
      </author>
    </item>
    <item>
      <title>Improving I/O Performance for Exascale Applications Through Online Data Layout Reorganization</title>
      <link>https://escholarship.org/uc/item/0s74p189</link>
      <description>The applications being developed within the U.S. Exascale Computing Project (ECP) to run on imminent Exascale computers will generate scientific results with unprecedented fidelity and record turn-around time. Many of these codes are based on particle-mesh methods and use advanced algorithms, especially dynamic load-balancing and mesh-refinement, to achieve high performance on Exascale machines. Yet, as such algorithms improve parallel application efficiency, they raise new challenges for I/O logic due to their irregular and dynamic data distributions. Thus, while the enormous data rates of Exascale simulations already challenge existing file system write strategies, the need for efficient read and processing of generated data introduces additional constraints on the data layout strategies that can be used when writing data to secondary storage. We review these I/O challenges and introduce two online data layout reorganization approaches for achieving good tradeoffs between read...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/0s74p189</guid>
      <pubDate>Tue, 15 Mar 2022 00:00:00 +0000</pubDate>
      <author>
        <name>Wan, Lipeng</name>
      </author>
      <author>
        <name>Huebl, Axel</name>
        <uri>https://orcid.org/0000-0003-1943-7141</uri>
      </author>
      <author>
        <name>Gu, Junmin</name>
        <uri>https://orcid.org/0000-0002-1521-8534</uri>
      </author>
      <author>
        <name>Poeschel, Franz</name>
      </author>
      <author>
        <name>Gainaru, Ana</name>
      </author>
      <author>
        <name>Wang, Ruonan</name>
      </author>
      <author>
        <name>Chen, Jieyang</name>
      </author>
      <author>
        <name>Liang, Xin</name>
      </author>
      <author>
        <name>Ganyushin, Dmitry</name>
      </author>
      <author>
        <name>Munson, Todd</name>
      </author>
      <author>
        <name>Foster, Ian</name>
      </author>
      <author>
        <name>Vay, Jean-Luc</name>
      </author>
      <author>
        <name>Podhorszki, Norbert</name>
      </author>
      <author>
        <name>Wu, Kesheng</name>
      </author>
      <author>
        <name>Klasky, Scott</name>
      </author>
    </item>
    <item>
      <title>Machine learning-based Analysis of COVID-19 Pandemic Impact on US Research Networks</title>
      <link>https://escholarship.org/uc/item/97d284ps</link>
      <description>Machine learning-based Analysis of COVID-19 Pandemic Impact on US Research Networks</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/97d284ps</guid>
      <pubDate>Fri, 18 Feb 2022 00:00:00 +0000</pubDate>
      <author>
        <name>Kiran, Mariam</name>
      </author>
      <author>
        <name>Campbell, Scott</name>
        <uri>https://orcid.org/0000-0002-6542-7473</uri>
      </author>
      <author>
        <name>Wala, Fatema Bannat</name>
      </author>
      <author>
        <name>Buraglio, Nick</name>
      </author>
      <author>
        <name>Monga, Inder</name>
        <uri>https://orcid.org/0000-0003-4524-0457</uri>
      </author>
    </item>
    <item>
      <title>Machine learning-based analysis of COVID-19 pandemic impact on US research networks</title>
      <link>https://escholarship.org/uc/item/36n6f5xk</link>
      <description>This study explores how fallout from the changing public health policy around COVID-19 has changed how researchers access and process their science experiments. Using a combination of techniques from statistical analysis and machine learning, we conduct a retrospective analysis of historical network data for a period around the stay-at-home orders that took place in March 2020. Our analysis takes data from the entire ESnet infrastructure to explore DOE high-performance computing (HPC) resources at OLCF, ALCF, and NERSC, as well as user sites such as PNNL and JLAB. We look at detecting and quantifying changes in site activity using a combination of t-Distributed Stochastic Neighbor Embedding (t-SNE) and decision tree analysis. Our findings bring insights into the working patterns and impact on data volume movements, particularly during late-night hours and weekends.</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/36n6f5xk</guid>
      <pubDate>Fri, 18 Feb 2022 00:00:00 +0000</pubDate>
      <author>
        <name>Kiran, Mariam</name>
      </author>
      <author>
        <name>Campbell, Scott</name>
        <uri>https://orcid.org/0000-0002-6542-7473</uri>
      </author>
      <author>
        <name>Wala, Fatema Bannat</name>
      </author>
      <author>
        <name>Buraglio, Nick</name>
      </author>
      <author>
        <name>Monga, Inder</name>
        <uri>https://orcid.org/0000-0003-4524-0457</uri>
      </author>
    </item>
    <item>
      <title>Kennesaw State University Scientific Deep Dive</title>
      <link>https://escholarship.org/uc/item/77k892k5</link>
      <description>EPOC uses the Deep Dive process to discuss and analyze current and planned science, research, or education activities and the anticipated data output of a particular use case, site, or project to help inform the strategic planning of a campus or regional networking environment. This includes understanding future needs related to network operations, network capacity upgrades, and other technological service investments. A Deep Dive comprehensively surveys major research stakeholders’ plans and processes in order to investigate data management requirements over the next 5–10 years. 

Between August 2021 and October 2021, staff members from the Engagement and Performance Operations Center (EPOC) met with researchers and staff from Kennesaw State University (KSU) for the purpose of a Deep Dive into scientific and research drivers. The goal of this activity was to help characterize the requirements for a number of campus use cases, and to enable cyberinfrastructure support staff to...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/77k892k5</guid>
      <pubDate>Wed, 3 Nov 2021 00:00:00 +0000</pubDate>
      <author>
        <name>Zurawski, Jason</name>
        <uri>https://orcid.org/0000-0001-8389-4705</uri>
      </author>
      <author>
        <name>Addleman, Hans</name>
      </author>
    </item>
    <item>
      <title>Analyzing Scientific Data Sharing Patterns for In-network Data Caching.</title>
      <link>https://escholarship.org/uc/item/91f9q777</link>
      <description>Analyzing Scientific Data Sharing Patterns for In-network Data Caching.</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/91f9q777</guid>
      <pubDate>Tue, 26 Oct 2021 00:00:00 +0000</pubDate>
      <author>
        <name>Copps, Elizabeth</name>
      </author>
      <author>
        <name>Zhang, Huiyi</name>
      </author>
      <author>
        <name>Sim, Alex</name>
        <uri>https://orcid.org/0000-0002-6295-1982</uri>
      </author>
      <author>
        <name>Wu, Kesheng</name>
      </author>
      <author>
        <name>Monga, Inder</name>
        <uri>https://orcid.org/0000-0003-4524-0457</uri>
      </author>
      <author>
        <name>Guok, Chin</name>
        <uri>https://orcid.org/0000-0003-4532-1222</uri>
      </author>
      <author>
        <name>Würthwein, Frank</name>
      </author>
      <author>
        <name>Davila, Diego</name>
      </author>
      <author>
        <name>Hernandez, Edgar Fajardo</name>
      </author>
    </item>
    <item>
      <title>2019 Computing Sciences Strategic Plan</title>
      <link>https://escholarship.org/uc/item/61j6m742</link>
      <description>2019 Computing Sciences Strategic Plan</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/61j6m742</guid>
      <pubDate>Tue, 26 Oct 2021 00:00:00 +0000</pubDate>
      <author>
        <name>Yelick, Kathy</name>
      </author>
      <author>
        <name>Agarwal, Deb</name>
        <uri>https://orcid.org/0000-0001-5045-2396</uri>
      </author>
      <author>
        <name>Bard, Debbie</name>
        <uri>https://orcid.org/0000-0002-5162-5153</uri>
      </author>
      <author>
        <name>Shalf, John</name>
        <uri>https://orcid.org/0000-0002-0608-3690</uri>
      </author>
      <author>
        <name>Almgren, Ann</name>
      </author>
      <author>
        <name>Bhimji, Wahid</name>
      </author>
      <author>
        <name>Brown, Ben</name>
      </author>
      <author>
        <name>Carter, Jonathan</name>
        <uri>https://orcid.org/0000-0001-9006-7636</uri>
      </author>
      <author>
        <name>Jong, Bert</name>
      </author>
      <author>
        <name>Doerfler, Doug</name>
        <uri>https://orcid.org/0000-0001-5016-8854</uri>
      </author>
      <author>
        <name>Donofrio, David</name>
      </author>
      <author>
        <name>Guok, Chin</name>
        <uri>https://orcid.org/0000-0003-4532-1222</uri>
      </author>
      <author>
        <name>Iancu, Costin</name>
      </author>
      <author>
        <name>Kiran, Mariam</name>
      </author>
      <author>
        <name>Li, Sherry</name>
      </author>
      <author>
        <name>Nugent, Peter</name>
        <uri>https://orcid.org/0000-0002-3389-0586</uri>
      </author>
      <author>
        <name>Prabhat, M</name>
      </author>
      <author>
        <name>Ramakrishnan, Lavanya</name>
      </author>
      <author>
        <name>Vasudevan, Dilip</name>
      </author>
      <author>
        <name>Wright, Nick</name>
        <uri>https://orcid.org/0000-0003-1883-6108</uri>
      </author>
      <author>
        <name>Cademartori, Helen</name>
      </author>
      <author>
        <name>Antypas, Katie</name>
      </author>
      <author>
        <name>Kincade, Kathy</name>
      </author>
    </item>
    <item>
      <title>SDN for End-to-End Networked Science at the Exascale (SENSE)</title>
      <link>https://escholarship.org/uc/item/5dk3195q</link>
      <description>The Software-defined network for End-to-end Networked Science at Exascale (SENSE) research project is building smart network services to accelerate scientific discovery in the era of 'big data' driven by Exascale, cloud computing, machine learning and AI. The project's architecture, models, and demonstrated prototype define the mechanisms needed to dynamically build end-to-end virtual guaranteed networks across administrative domains, with no manual intervention. In addition, a highly intuitive 'intent' based interface, as defined by the project, allows applications to express their high-level service requirements, and an intelligent, scalable model-based software orchestrator converts that intent into appropriate network services, configured across multiple types of devices. The significance of these capabilities is the ability for science applications to manage the network as a first-class schedulable resource akin to instruments, compute, and storage, to enable well defined...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/5dk3195q</guid>
      <pubDate>Tue, 26 Oct 2021 00:00:00 +0000</pubDate>
      <author>
        <name>Monga, Inder</name>
        <uri>https://orcid.org/0000-0003-4524-0457</uri>
      </author>
      <author>
        <name>Yang, Xi</name>
      </author>
      <author>
        <name>Guok, Chin</name>
        <uri>https://orcid.org/0000-0003-4532-1222</uri>
      </author>
      <author>
        <name>MacAuley, John</name>
      </author>
      <author>
        <name>Sim, Alex</name>
        <uri>https://orcid.org/0000-0002-6295-1982</uri>
      </author>
      <author>
        <name>Newman, Harvey</name>
      </author>
      <author>
        <name>Balcas, Justas</name>
      </author>
      <author>
        <name>DeMar, Phil</name>
      </author>
      <author>
        <name>Winkler, Linda</name>
      </author>
      <author>
        <name>Lehman, Tom</name>
      </author>
    </item>
    <item>
      <title>A Quantitative Approach to Architecting All-Flash Lustre File Systems</title>
      <link>https://escholarship.org/uc/item/4wn1341d</link>
      <description>New experimental and AI-driven workloads are moving into the realm of extreme-scale HPC systems at the same time that high-performance flash is becoming cost-effective to deploy at scale. This confluence poses a number of new technical and economic challenges and opportunities in designing the next generation of HPC storage and I/O subsystems to achieve the right balance of bandwidth, latency, endurance, and cost. In this work, we present quantitative models that use workload data from existing, disk-based file systems to project the architectural requirements of all-flash Lustre file systems. Using data from NERSC’s Cori I/O subsystem, we then demonstrate the minimum required capacity for data, capacity for metadata and data-on-MDT, and SSD endurance for a future all-flash Lustre file system.</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/4wn1341d</guid>
      <pubDate>Tue, 26 Oct 2021 00:00:00 +0000</pubDate>
      <author>
        <name>Lockwood, Glenn K</name>
        <uri>https://orcid.org/0000-0002-9241-9372</uri>
      </author>
      <author>
        <name>Lozinskiy, Kirill</name>
        <uri>https://orcid.org/0000-0003-1153-8474</uri>
      </author>
      <author>
        <name>Gerhardt, Lisa</name>
        <uri>https://orcid.org/0000-0003-0166-5162</uri>
      </author>
      <author>
        <name>Cheema, Ravi</name>
      </author>
      <author>
        <name>Hazen, Damian</name>
      </author>
      <author>
        <name>Wright, Nicholas J</name>
        <uri>https://orcid.org/0000-0003-1883-6108</uri>
      </author>
    </item>
    <item>
      <title>Performance Analysis Tool for HPC and Big Data Applications on Scientific Clusters</title>
      <link>https://escholarship.org/uc/item/4sg377w2</link>
      <description>Big data is prevalent in HPC computing. Many HPC projects rely on complex workflows to analyze terabytes or petabytes of data. These workflows often require running over thousands of CPU cores and performing simultaneous data accesses, data movements, and computation. It is challenging to analyze the performance involving terabytes or petabytes of workflow data or measurement data of the executions, from complex workflows over a large number of nodes and multiple parallel task executions. To help identify performance bottlenecks or debug the performance issues in large-scale scientific applications and scientific clusters, we have developed a performance analysis framework, using state-of-the-art open-source big data processing tools. Our tool can ingest system logs and application performance measurements to extract key performance features, and apply the most sophisticated statistical tools and data mining methods on the performance data. It utilizes an efficient data processing...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/4sg377w2</guid>
      <pubDate>Tue, 26 Oct 2021 00:00:00 +0000</pubDate>
      <author>
        <name>Yoo, Wucherl</name>
      </author>
      <author>
        <name>Koo, Michelle</name>
      </author>
      <author>
        <name>Cao, Yi</name>
      </author>
      <author>
        <name>Sim, Alex</name>
        <uri>https://orcid.org/0000-0002-6295-1982</uri>
      </author>
      <author>
        <name>Nugent, Peter</name>
        <uri>https://orcid.org/0000-0002-3389-0586</uri>
      </author>
      <author>
        <name>Wu, Kesheng</name>
      </author>
    </item>
    <item>
      <title>Parallel variable selection for effective performance prediction</title>
      <link>https://escholarship.org/uc/item/40d5t2hp</link>
      <description>© 2017 IEEE. Large data analysis problems often involve a large number of variables, and the corresponding analysis algorithms may examine all variable combinations to find the optimal solution. For example, to model the time required to complete a scientific workflow, we need to consider the impact of dozens of parameters. To reduce the model building time and reduce the likelihood of overfitting, we look to variable selection methods to identify the critical variables for the performance model. In this work, we create a combination of variable selection and performance prediction methods that is as effective as the exhaustive search approach when the exhaustive search could be completed in a reasonable amount of time. To handle the cases where the exhaustive search is too time consuming, we develop the parallelized variable selection algorithm. Additionally, we develop a parallel grouping mechanism that further reduces the variable selection time by 70%. As a case study, we exercise...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/40d5t2hp</guid>
      <pubDate>Tue, 26 Oct 2021 00:00:00 +0000</pubDate>
      <author>
        <name>Wang, J</name>
      </author>
      <author>
        <name>Yoo, W</name>
      </author>
      <author>
        <name>Sim, A</name>
        <uri>https://orcid.org/0000-0002-6295-1982</uri>
      </author>
      <author>
        <name>Nugent, P</name>
        <uri>https://orcid.org/0000-0002-3389-0586</uri>
      </author>
      <author>
        <name>Wu, K</name>
      </author>
    </item>
    <item>
      <title>High Energy Physics Network Requirements Review (Final Report, July-October 2020)</title>
      <link>https://escholarship.org/uc/item/2vt3f7r6</link>
      <description>High Energy Physics Network Requirements Review (Final Report, July-October 2020)</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/2vt3f7r6</guid>
      <pubDate>Tue, 26 Oct 2021 00:00:00 +0000</pubDate>
      <author>
        <name>Zurawski, Jason</name>
        <uri>https://orcid.org/0000-0001-8389-4705</uri>
      </author>
      <author>
        <name>Brown, Benjamin</name>
      </author>
      <author>
        <name>Carder, Dale</name>
        <uri>https://orcid.org/0000-0001-8357-0997</uri>
      </author>
      <author>
        <name>Colby, Eric</name>
      </author>
      <author>
        <name>Dart, Eli</name>
        <uri>https://orcid.org/0000-0002-8229-5433</uri>
      </author>
      <author>
        <name>Miller, Ken</name>
      </author>
      <author>
        <name>Patwa, Abid</name>
      </author>
      <author>
        <name>Robinson, Kate</name>
      </author>
      <author>
        <name>Rotman, Lauren</name>
      </author>
      <author>
        <name>Wiedlea, Andrew</name>
      </author>
    </item>
    <item>
      <title>Detecting Anomalies in the LCLS Workflow</title>
      <link>https://escholarship.org/uc/item/1pp0p81k</link>
      <description>The Linac Coherent Light Source (LCLS) located at SLAC National Accelerator Laboratory has been essential to over 1023 publications since 2009. The LCLS produces vast quantities of data - thousands of gigabytes per experiment. The data must be analyzed and stored at large data centers to be available to the world-wide user community. Due to the vast quantities of data flowing through the network, many abnormal data transfers remain unnoticed. This work focuses on identifying network failures that could slow down the data transfer process. This work aims to develop a diagnostic tool to detect when network transfers become anomalously slow. The tool uses an algorithm based on the Hampel filter to detect poor performance and alert SLAC administrators to bottlenecks in each phase of the workflow. We will describe our experience of preparing the data and modifying the Hampel filter to enhance its effectiveness. We found that applying a heuristic to the algorithm in conjunction with...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/1pp0p81k</guid>
      <pubDate>Tue, 26 Oct 2021 00:00:00 +0000</pubDate>
      <author>
        <name>Shachaf, Tal</name>
      </author>
      <author>
        <name>Sim, Alexander</name>
        <uri>https://orcid.org/0000-0002-6295-1982</uri>
      </author>
      <author>
        <name>Wu, Kesheng</name>
      </author>
      <author>
        <name>Kroeger, Wilko</name>
      </author>
    </item>
    <item>
      <title>Auto-tuned Publisher in a Pub/Sub System: Design and Performance Evaluation</title>
      <link>https://escholarship.org/uc/item/1hs2012d</link>
      <description>Pub/sub systems form the underlying framework for many distributed applications including large social networking applications. In this paper, we consider the optimization of the end-to-end latency of a pub/sub system in which the publisher, the broker, and the subscriber are in different administrative domains. While general pub/sub systems provide reliability of message delivery, good end-to-end latency in a multi-domain environment requires that the pub/sub system adapts to workload changes and bottlenecks in the different sub-systems. This study is motivated by two applications. First, a pub/sub based Simple Lookup Service (sLS) that is used in perfSONAR to provide information about network performance in Research and Education (R&amp;E) networks. Second, the pub/sub system that is used to distribute alerts generated in the data pipeline in the Zwicky Transient Facility (ZTF). In this multi-domain pub/sub performance study, we consider a publisher with a multi-threaded architecture...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/1hs2012d</guid>
      <pubDate>Tue, 26 Oct 2021 00:00:00 +0000</pubDate>
      <author>
        <name>Balasubramanian, Sowmya</name>
      </author>
      <author>
        <name>Ghosal, Dipak</name>
      </author>
      <author>
        <name>Sharath, Kamala Narayan Balasubramanian</name>
      </author>
      <author>
        <name>Pouyoul, Eric</name>
      </author>
      <author>
        <name>Sim, Alex</name>
        <uri>https://orcid.org/0000-0002-6295-1982</uri>
      </author>
      <author>
        <name>Wu, John</name>
        <uri>https://orcid.org/0000-0002-6907-3393</uri>
      </author>
      <author>
        <name>Tierney, Brian</name>
      </author>
    </item>
    <item>
      <title>The Engagement and Performance Operations Center: EPOC</title>
      <link>https://escholarship.org/uc/item/16w3788j</link>
      <description>In 2018, the US National Science Foundation (NSF) funded the Engagement and Performance Operations Center (EPOC), a joint project between Indiana University (IU) and the Department of Energy’s Energy Sciences Network (ESnet), to work with domain scientists to accelerate the ability of distributed collaborations to share data in order to reach broader science goals. The goal of this funding was to create an operations center for engagement - including definition of formal processes, tracking of engagements, and funded staff, not simply best effort by volunteers, with a goal of enabling digital societies to better share scientific data.</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/16w3788j</guid>
      <pubDate>Tue, 26 Oct 2021 00:00:00 +0000</pubDate>
      <author>
        <name>Moynihan, Edward</name>
      </author>
      <author>
        <name>Schopf, Jennifer</name>
      </author>
      <author>
        <name>Zurawski, Jason</name>
        <uri>https://orcid.org/0000-0001-8389-4705</uri>
      </author>
    </item>
    <item>
      <title>Designing an all-flash Lustre file system for the 2020 NERSC Perlmutter system</title>
      <link>https://escholarship.org/uc/item/25k024cr</link>
      <description>Designing an all-flash Lustre file system for the 2020 NERSC Perlmutter system</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/25k024cr</guid>
      <pubDate>Wed, 6 Oct 2021 00:00:00 +0000</pubDate>
      <author>
        <name>Lockwood, Glenn K</name>
        <uri>https://orcid.org/0000-0002-9241-9372</uri>
      </author>
      <author>
        <name>Lozinskiy, Kirill</name>
        <uri>https://orcid.org/0000-0003-1153-8474</uri>
      </author>
      <author>
        <name>Gerhardt, Lisa</name>
      </author>
      <author>
        <name>Cheema, Ravi</name>
      </author>
      <author>
        <name>Hazen, Damian</name>
      </author>
      <author>
        <name>Wright, Nicholas J</name>
        <uri>https://orcid.org/0000-0003-1883-6108</uri>
      </author>
    </item>
    <item>
      <title>South Dakota Region Scientific Deep Dive</title>
      <link>https://escholarship.org/uc/item/3s07q0d8</link>
      <description>South Dakota Region Scientific Deep Dive</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/3s07q0d8</guid>
      <pubDate>Thu, 30 Sep 2021 00:00:00 +0000</pubDate>
      <author>
        <name>Zurawski, Jason</name>
        <uri>https://orcid.org/0000-0001-8389-4705</uri>
      </author>
    </item>
    <item>
      <title>A Framework for International Collaboration on ITER Using Large-Scale Data Transfer to Enable Near-Real-Time Analysis</title>
      <link>https://escholarship.org/uc/item/3297b74f</link>
      <description>The global nature of the ITER project along with its projected approximately petabyte-per-day data generation presents not only a unique challenge but also an opportunity for the fusion community to rethink, optimize, and enhance our scientific discovery process. Recognizing this, collaborative research with computational scientists was undertaken over the past several years to create a framework for large-scale data movement across wide-area networks to enable global near-real-time analysis of fusion data. This would broaden the available computational resources for analysis/simulation and increase the number of researchers actively participating in experiments. An official demonstration of this framework for fast, large data transfer and real-time analysis was carried out between the KSTAR tokamak in Daejeon, Korea, and Princeton Plasma Physics Laboratory (PPPL) in Princeton, New Jersey. Streaming large data transfer, with near-real-time movie creation and analysis of the KSTAR...</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/3297b74f</guid>
      <pubDate>Tue, 31 Aug 2021 00:00:00 +0000</pubDate>
      <author>
        <name>Churchill, RM</name>
      </author>
      <author>
        <name>Chang, CS</name>
      </author>
      <author>
        <name>Choi, J</name>
      </author>
      <author>
        <name>Wang, R</name>
      </author>
      <author>
        <name>Klasky, S</name>
      </author>
      <author>
        <name>Kube, R</name>
      </author>
      <author>
        <name>Park, H</name>
      </author>
      <author>
        <name>Choi, MJ</name>
      </author>
      <author>
        <name>Park, JS</name>
      </author>
      <author>
        <name>Wolf, M</name>
      </author>
      <author>
        <name>Hager, R</name>
      </author>
      <author>
        <name>Ku, S</name>
      </author>
      <author>
        <name>Kampel, S</name>
      </author>
      <author>
        <name>Carroll, T</name>
      </author>
      <author>
        <name>Silber, K</name>
      </author>
      <author>
        <name>Dart, E</name>
        <uri>https://orcid.org/0000-0002-8229-5433</uri>
      </author>
      <author>
        <name>Cho, BS</name>
      </author>
    </item>
    <item>
      <title>Complete genome sequence of Xylanimonas cellulosilytica type strain (XIL07T)</title>
      <link>https://escholarship.org/uc/item/81s5v1qj</link>
      <description>Xylanimonas cellulosilytica Rivas et al. 2003 is the type species of the genus Xylanimonas of the actinobacterial family Promicromonosporaceae. The species X. cellulosilytica is of interest because of its ability to hydrolyze cellulose and xylan. Here we describe the features of this organism, together with the complete genome sequence, and annotation. This is the first complete genome sequence of a member of the large family Promicromonosporaceae, and the 3,831,380 bp long genome (one chromosome plus an 88,604 bp long plasmid) with its 3485 protein-coding and 61 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.</description>
      <guid isPermaLink="true">https://escholarship.org/uc/item/81s5v1qj</guid>
      <pubDate>Fri, 20 Aug 2021 00:00:00 +0000</pubDate>
      <author>
        <name>Foster, Brian</name>
      </author>
      <author>
        <name>Pukall, Rüdiger</name>
      </author>
      <author>
        <name>Abt, Birte</name>
      </author>
      <author>
        <name>Nolan, Matt</name>
      </author>
      <author>
        <name>Glavina Del Rio, Tijana</name>
      </author>
      <author>
        <name>Chen, Feng</name>
      </author>
      <author>
        <name>Lucas, Susan</name>
      </author>
      <author>
        <name>Tice, Hope</name>
      </author>
      <author>
        <name>Pitluck, Sam</name>
      </author>
      <author>
        <name>Cheng, Jan-Fang</name>
        <uri>https://orcid.org/0000-0001-7315-7613</uri>
      </author>
      <author>
        <name>Chertkov, Olga</name>
      </author>
      <author>
        <name>Brettin, Thomas</name>
      </author>
      <author>
        <name>Han, Cliff</name>
      </author>
      <author>
        <name>Detter, John C</name>
      </author>
      <author>
        <name>Bruce, David</name>
      </author>
      <author>
        <name>Goodwin, Lynne</name>
      </author>
      <author>
        <name>Ivanova, Natalia</name>
      </author>
      <author>
        <name>Mavromatis, Konstantinos</name>
      </author>
      <author>
        <name>Pati, Amrita</name>
      </author>
      <author>
        <name>Mikhailova, Natalia</name>
      </author>
      <author>
        <name>Chen, Amy</name>
      </author>
      <author>
        <name>Palaniappan, Krishna</name>
        <uri>https://orcid.org/0000-0003-4484-7505</uri>
      </author>
      <author>
        <name>Land, Miriam</name>
      </author>
      <author>
        <name>Hauser, Loren</name>
      </author>
      <author>
        <name>Chang, Yun-Juan</name>
      </author>
      <author>
        <name>Jeffries, Cynthia D</name>
      </author>
      <author>
        <name>Chain, Patrick</name>
      </author>
      <author>
        <name>Rohde, Manfred</name>
      </author>
      <author>
        <name>Göker, Markus</name>
      </author>
      <author>
        <name>Bristow, Jim</name>
      </author>
      <author>
        <name>Eisen, Jonathan A</name>
        <uri>https://orcid.org/0000-0002-0159-2197</uri>
      </author>
      <author>
        <name>Markowitz, Victor</name>
      </author>
      <author>
        <name>Hugenholtz, Philip</name>
      </author>
      <author>
        <name>Kyrpides, Nikos C</name>
        <uri>https://orcid.org/0000-0002-6131-0462</uri>
      </author>
      <author>
        <name>Klenk, Hans-Peter</name>
      </author>
      <author>
        <name>Lapidus, Alla</name>
      </author>
    </item>
  </channel>
</rss>
