In the past decade, Intelligent Systems--advanced computer systems that can make useful predictions or decisions based on observations--have become increasingly ubiquitous: from personal assistants to self-driving cars. These intelligent systems are incarnations of Artificial Intelligence (AI), powered by recent advances in Deep Neural Networks (DNNs), which now exhibit superhuman performance in tasks such as image classification, game playing, and protein folding. Such astounding performance of DNNs depends on two key ingredients: Data and Computing Power. In the current era of big data, the rate of data generation has reached an overwhelming level that is beyond the capabilities of conventional computing systems. Hardware design has undergone significant change and exploded in diversity to cope with the rate of data generation and the computational intensity of DNNs. Nevertheless, developing Compilers that optimize code for this diverse hardware remains an open challenge.
DNNs have made significant strides in context-sensitive natural language translation. These advances present an opportunity to utilize DNNs for the compilation of DNNs themselves, which is, in fact, a series of translation tasks. To this end, the dissertation begins by introducing an effort to integrate deep reinforcement learning to improve compilers' ability to adapt to unseen search spaces in code optimization. This marks an initial step in leveraging AI for Optimized Execution of AI on commodity platforms. Although the exciting results of this work show the potential of machine intelligence for compilation, they do not justify relinquishing the swath of conventional optimization techniques and foundational algorithms that have been curated with human ingenuity over decades. As such, the dissertation also explores the other end of the spectrum--Foundational Algorithms--to tackle the problem of memory footprint in neural execution. This work achieves memory-optimal scheduling by building on dynamic programming, a well-known foundational algorithm in computer science. Having observed that both worlds--AI and Foundational Algorithms--can bring significant benefits to compiler optimization, this dissertation culminates in an ambitious effort to take advantage of the best of both. The dissertation presents Hybridization of AI and Foundational Algorithms for optimized execution of AI, where we utilize mathematical embeddings to extract core information from the hardware specification and use meta-learning to fuse that information into compilers for improved compilation performance.
Intelligent systems comprise components from various domains that are not limited to DNNs, which naturally makes Cross-Domain Multi-Acceleration the next step. To this end, this dissertation devises a set of abstractions for various application domains and their hardware, along with a virtual machine for the execution of end-to-end applications. This work lays the foundation for cross-domain multi-acceleration, expanding the scope of the aforementioned AI-enabled compilation techniques to end-to-end intelligent systems.