Bulk Data Movement for Climate Dataset: Efficient Data Transfer Management with Dynamic Transfer Adjustment
Many scientific applications and experiments, such as high energy and nuclear physics, astrophysics, climate observation and modeling, combustion, nano-scale material sciences, and computational biology, generate extreme volumes of data with a large number of files. These data sources are distributed among national and international data repositories, and are shared by large numbers of geographically distributed scientists. A large portion of data is frequently accessed, and a large volume of data is moved from one place to another for analysis and storage. One challenging issue in such efforts is the limited network capacity for moving large datasets to explore and manage. The Bulk Data Mover (BDM), a data transfer management tool in the Earth System Grid (ESG) community, has been managing the massive dataset transfers efficiently with the pre-configured transfer properties in the environment where the network bandwidth is limited. Dynamic transfer adjustment was studied to enhance the BDM to handle significant end-to-end performance changes in the dynamic network environment as well as to control the data transfers for the desired transfer performance. We describe the results from the BDM transfer management for the climate datasets. We also describe the transfer estimation model and results from the dynamic transfer adjustment.