Soft multipliers have been designed and optimized to provide area, latency, and energy gains for
FPGA-based accelerators. While approximation techniques and FPGA architectural features have been leveraged to produce these gains, certain unorthodox precisions have been overlooked. This work presents the novel use of 3x3 multipliers that exploit the lookup-table architecture of commercial Xilinx and Altera FPGAs to compose higher-order soft multipliers. Generalized recursive methods for composing higher-order multipliers are presented, and a novel area-efficient, fast 8x8 truncated soft multiplier is designed for use with popular quantization methods and accelerators in domains, such as machine learning, that are robust to errors induced by quantization and truncation. This design
improves upon a comparable 8x8 truncated soft multiplier built with state-of-the-art soft multiplier approximation techniques, reducing both area and energy by 37.5% while matching its delay and producing an exact product.
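The recursive composition referred to above can be illustrated with a schoolbook divide-and-conquer sketch. This is a simplified software model, not the paper's exact design: `mul3x3` stands in for the LUT-based 3x3 primitive, and the split/combine logic shows how four smaller products assemble one larger one.

```python
def mul3x3(a, b):
    # Base case: stands in for a 3x3 LUT multiplier on the FPGA
    # (emulated here with the native multiply).
    assert 0 <= a < 8 and 0 <= b < 8
    return a * b

def soft_mul(a, b, n):
    """Compose an n-bit x n-bit product recursively from 3x3 multipliers,
    using the schoolbook split a = aH*2^k + aL, b = bH*2^k + bL."""
    if n <= 3:
        return mul3x3(a, b)
    k = (n + 1) // 2          # split point; low half gets k bits
    mask = (1 << k) - 1
    aH, aL = a >> k, a & mask
    bH, bL = b >> k, b & mask
    hi = soft_mul(aH, bH, n - k)             # high x high partial product
    lo = soft_mul(aL, bL, k)                 # low x low partial product
    mid1 = soft_mul(aH, bL, max(n - k, k))   # mixed partial products
    mid2 = soft_mul(aL, bH, max(n - k, k))
    # a*b = hi*2^(2k) + (mid1 + mid2)*2^k + lo
    return (hi << (2 * k)) + ((mid1 + mid2) << k) + lo
```

For an 8x8 operand pair this recursion bottoms out in 2- and 3-bit base multiplies, mirroring how higher-order soft multipliers are assembled from the 3x3 primitive; the truncated variant described in the paper would additionally drop or approximate low-order partial products.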