The dissertation investigates redundant communication between servers for large-scale web and cache requests and redundant data movement between accelerators for compute-intensive applications. Redundancy is an impending and critical issue for data centers designed for hardware accelerators and disaggregated resources. The dissertation makes three contributions to address it. The first contribution is Daronpon. Daronpon dynamically load-balances and reroutes large-scale requests of web and cache applications on a microsecond timescale. Daronpon keeps these requests from being stranded on busy servers, where network congestion and long queuing delays would otherwise hold up their processing. Daronpon shows improvement across various service time characterizations of different applications. The second contribution is Fianchetto. Fianchetto acts as a compute-enabled bypass for inter-accelerator communication. Fianchetto accelerates the data restructuring needed between accelerators and avoids data movement between accelerators and CPUs for compute-intensive applications. Fianchetto shows improvement on a series of benchmarks from different application domains. The third contribution is Aurelia. Aurelia leverages the emerging CXL interconnect to investigate the design of a scalable fabric for accelerators and fabric-attached memory expansion. Aurelia improves routing and transport based on the current CXL specification and shows performance improvement on machine learning and key-value store applications.