- Weng, Olivia;
- Marcano, Gabriel;
- Loncar, Vladimir;
- Khodamoradi, Alireza;
- G, Abarajithan;
- Sheybani, Nojan;
- Meza, Andres;
- Koushanfar, Farinaz;
- Denolf, Kristof;
- Duarte, Javier Mauricio;
- Kastner, Ryan
Deep neural networks use skip connections to improve training convergence. However, these skip connections are costly in hardware, requiring extra buffers and increasing on- and off-chip memory utilization and bandwidth requirements. In this paper, we show that skip connections can be optimized for hardware when tackled with a hardware-software codesign approach. We argue that while a network's skip connections are needed for the network to learn, they can later be removed or shortened to provide a more hardware-efficient implementation with minimal to no accuracy loss. We introduce Tailor, a codesign tool whose hardware-aware training algorithm gradually removes or shortens a fully trained network's skip connections to lower their hardware cost. Tailor improves resource utilization by up to 34% for BRAMs, 13% for FFs, and 16% for LUTs for on-chip, dataflow-style architectures. Tailor increases performance by 30% and reduces memory bandwidth by 45% for a 2D processing element array architecture.
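To make the idea of gradually removing a skip connection concrete, the sketch below shows one possible way to fade a residual block's identity shortcut toward zero during fine-tuning, after which the block can be implemented without the shortcut buffer. This is a minimal illustration in PyTorch; the class names, the `alpha` buffer, and the linear annealing schedule are assumptions for exposition, not Tailor's actual training algorithm.

```python
# Illustrative sketch only: a residual block whose identity shortcut is scaled
# by `alpha`, which a fine-tuning loop can anneal from 1.0 toward 0.0 so the
# skip connection (and its hardware buffer) can eventually be dropped.
# Names and schedule are hypothetical, not Tailor's published method.
import torch
import torch.nn as nn


class AnnealedResidualBlock(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)
        # Weight on the identity shortcut: 1.0 = standard residual block,
        # 0.0 = skip connection removed (no extra activation buffer needed).
        self.register_buffer("alpha", torch.tensor(1.0))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.bn2(self.conv2(self.relu(self.bn1(self.conv1(x)))))
        return self.relu(out + self.alpha * x)


def anneal_skips(model: nn.Module, step: int, total_steps: int) -> None:
    """Linearly fade all shortcut weights to zero over `total_steps` of fine-tuning."""
    alpha = max(0.0, 1.0 - step / total_steps)
    for module in model.modules():
        if isinstance(module, AnnealedResidualBlock):
            module.alpha.fill_(alpha)
```

In this toy setup, `anneal_skips` would be called each fine-tuning step; once `alpha` reaches zero, the addition (and the buffer holding `x` until the convolutions finish) can be elided entirely in the hardware implementation.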