A Deep Deterministic Policy Gradient Based Network Scheduler for Deadline-Driven Data Transfers
Published Web Location: https://sdm.lbl.gov/oapapers/ifip20.pdf
We consider data sources connected to a software-defined network (SDN) with heterogeneous link access rates. Deadline-driven data transfer requests are made to a centralized network controller that schedules the pacing rates of the sources, and each request has a pre-assigned value that is earned when its deadline is met. The goal of the scheduler is to maximize the aggregate value. We design a scheduler (RL-Agent) based on Deep Deterministic Policy Gradient (DDPG). We compare our approach with three heuristics: (i) PFAIR, which shares the bottleneck capacity in proportion to the access rates, (ii) VDRatio, which prioritizes flows with a high value-to-demand ratio, and (iii) VBEDF, which prioritizes flows with a high value-to-deadline ratio. For equally valued requests and homogeneous access rates, PFAIR is the same as an idealized TCP algorithm, while VBEDF and VDRatio reduce to the Earliest Deadline First (EDF) and the Shortest Job First (SJF) algorithms, respectively. In this scenario, we show that the RL-Agent performs significantly better than PFAIR and VDRatio, matches VBEDF in general, and outperforms VBEDF when the network is overloaded. When access rates are heterogeneous, we show that the RL-Agent performs as well as VBEDF even though the RL-Agent has no knowledge of the heterogeneity to start with. For the value maximization problems, we show that the RL-Agent outperforms the heuristics for both homogeneous and heterogeneous access networks. For the general case of heterogeneous access rates with different values, the RL-Agent performs the best despite having no prior knowledge of the heterogeneity or the values. The heuristics, in contrast, have full knowledge of the heterogeneity, and VDRatio and VBEDF have partial knowledge of the values through the ratios of value to demand and value to deadline, respectively.
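To make the three baseline heuristics concrete, the following is a minimal Python sketch of how each might assign pacing rates on a single bottleneck link. It is an illustration under stated assumptions, not the paper's implementation: the `Request` fields, the function names, and in particular the greedy rule in `priority_fill` (serve flows in priority order, each capped by its access rate) are hypothetical choices. The abstract only states which ratio each heuristic prioritizes, not how priorities are turned into rates.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical request record; field names are assumptions."""
    name: str
    demand: float       # remaining data volume
    deadline: float     # time remaining until the deadline
    value: float        # value earned if the deadline is met
    access_rate: float  # access link rate of the source


def pfair(requests, capacity):
    """Share the bottleneck capacity in proportion to access rates.

    Each flow is capped at its own access rate.
    """
    total = sum(r.access_rate for r in requests)
    return {
        r.name: min(r.access_rate, capacity * r.access_rate / total)
        for r in requests
    }


def priority_fill(requests, capacity, key):
    """Assumed greedy rule: serve flows in descending priority order.

    Each flow receives as much of the remaining capacity as its
    access rate allows.
    """
    rates = {r.name: 0.0 for r in requests}
    remaining = capacity
    for r in sorted(requests, key=key, reverse=True):
        rates[r.name] = min(r.access_rate, remaining)
        remaining -= rates[r.name]
    return rates


def vdratio(requests, capacity):
    """Prioritize flows with a high value-to-demand ratio."""
    return priority_fill(requests, capacity, key=lambda r: r.value / r.demand)


def vbedf(requests, capacity):
    """Prioritize flows with a high value-to-deadline ratio."""
    return priority_fill(requests, capacity, key=lambda r: r.value / r.deadline)
```

With equal values, the key in `vdratio` orders flows by smallest demand first (SJF) and the key in `vbedf` orders them by earliest deadline first (EDF), which matches the reductions described above.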