An Online Scheduling Algorithm with Advance Reservation for Large-Scale Data Transfers
- Author(s): Balman, Mehmet
- et al.
Scientific applications and experimental facilities generate massive data sets that need to be transferred to remote collaborating sites for sharing, processing, and long term storage. In order to support increasingly data-intensive science, next generation research networks have been deployed to provide high-speed on-demand data access between collaborating institutions. In this paper, we present a practical model for online data scheduling in which data movement operations are scheduled in advance for end-to-end high performance transfers. In our model, data scheduler interacts with reservation managers and data transfer nodes in order to reserve available bandwidth to guarantee completion of jobs that are accepted and confirmed to satisfy preferred time constraint given by the user. Our methodology improves current systems by allowing researchers and higher level meta-schedulers to use data placement as a service where they can plan ahead and reserve the scheduler time in advance for their data movement operations. We have implemented our algorithm and examined possible techniques for incorporation into current reservation frameworks. Performance measurements confirm that the proposed algorithm is efficient and scalable.