Skip to main content
eScholarship
Open Access Publications from the University of California

Polynomial Datapaths Optimization

  • Author(s): Parta, Hojat
  • Advisor(s): Ercegovac, Milos D
  • et al.
Abstract

The research presented focuses on optimization of polynomials using algebraic manipulations at the high level and digital arithmetic techniques at the implementation level. Previous methods lacked any algebraic understanding of the polynomials or only exposed limited potential. We have treated the polynomial optimization problem in abstract algebra allowing us algebraic freedom to transform polynomials. Unlike previous attempts where only a set of limited benchmarks have been used, we have focused on a class of polynomials called “Volterra series” with a wide range of applications in modeling nonlinear systems. We have exposed Volterra series characteristic, in which some inputs are delayed inputs, to store some of the products in these polynomials for the subsequent clock cycles. This results in replacement of multiplications with memory elements. In addition to variable factorization, we have also introduced factorization based on common coefficient divisors allowing us to expose more possible common subexpressions. These optimization have resulted in reduction of 41.0% and 12.9% in number of multiplication and addition respectively in comparison with common subexpression elimination technique used in compilers. At implementation level, instead of relying on conventional arithmetic operations, we have used redundant arithmetic to increase the speed of the polynomial evaluation. To avoid manual and tedious generation of such systems, we have developed a generator. This generator parses the polynomials into expression trees and automatically outputs Verilog HDL code. The generator uses interval arithmetic to find the minimum required bit-width in our evaluation of each variable. We have addressed the issues with truncation of redundant representation which unlike conventional arithmetic is not trivial. We have also used the non-uniform arrival time of operands in multi-operand addition to decrease the delay of adder trees. These improvements in implementation have resulted in 72.7% increase in the speed of synthesized circuit in comparison with the implementation based on conventional operators. Finally we have used our methods on an application from digital communication domain. We have implemented a digital pre-distorter to reverse the effect of non-linearity in a satellite communication channel modeled using Volterra series. We have compared our results with previous attempts which relied on hand-optimization of similar systems. Our optimization techniques have resulted in an increase of 55.9% in speed comparing to direct evaluation of Volterra series. When area optimization is the main objective, our approach has resulted in a decrease of 37.1% and 73.0% in number of DSP's and Adaptive Logic Modules (ALM) respectively.

Main Content
Current View