Recent economics in computer architecture, specifically the end of power-density-performance scaling trends and the inefficiencies in modern processors, have motivated more companies to develop custom hardware. Custom hardware improves important metrics that impact revenue: latency, performance, and power. This has led to the widespread deployment of Field-Programmable Gate Arrays (FPGAs) in datacenters, automobiles, communications equipment, and more.
However, hardware development is tedious, time-consuming, and costly. There are many challenges: languages are domain-specific, few reusable libraries exist, and hardware compilation is slow. In addition, development tools are expensive, carry a high degree of vendor lock-in, and require domain-specific knowledge. These obstacles have limited hardware development to companies that can afford the associated costs and to a small number of knowledgeable hardware engineers.
Applications for hardware pipelines exist at all scales. Machine learning, computer vision, and text processing are common targets for custom hardware and are pervasive in modern applications. While the details differ between specific applications, the computations map to well-known patterns. These patterns provide a means of abstracting hardware development and increasing its accessibility for non-hardware engineers.
This thesis presents work on making hardware development more accessible to non-hardware engineers through the use of common parallel patterns. By abstracting recurring patterns in hardware development, we can create languages and architectures with performance comparable to custom hardware circuits and reduce the cost of hardware development.