Nicolau, Alexandru; Wang, Haigeng

Speedup of banded linear recurrences in the presence of resource constraints

1991

Abstract

An m-th order linear recurrence system of N equations computes Xi =Ci+ L:!~f-m aijXj for 1 ::; i ::; N. Linear recurrences have a role of central importance in computer design, numerical analysis, program analysis, image processing and vision. However, programs containing banded linear recurrences are difficult to parallelize due to loop-carried dependences. In this paper, we first present a family of schedules, called the exact schedules, for parallel evaluation of low order ( m ::; 2) banded linear recurrences with an execution time (2m2 + 3m)N/(p + (m(m + 1)(2m + 1))/(2(2 + llog mj))) for 0 < m::; 2 , N > (p + 5)(2p + 3)/6 and number of processors p > m. We show that the exact schedules achieve the strict time lower bound under matrix multiplication model. Next, we derive another family of schedules, called the pipelined schedules, with better program-space efficiency and with an execution time of pipeline startup time+ (2m2 + 3m)N /(p + (2m + 1)/2) for m = 1, and pipeline startup time+ (6m + 2)N /(p + (2m + 1) for m > 1, m < p ::; 4m + 1, and pipeline startup time+(2m2 +3m)N/(p+(m-1)(2m+1)) form> l,p > 4m+l. This is the first parallel algorithm that achieves this time bound, which improved the fastest prevoiusly published algorithm by a factor 2:'. (p+ 2m2 - m-1)/(p+ m + 1/2) form> 1. We illustrate the technique by parallelizing loops containing linear recurrences and demonstrate the available speedup on a VLIW architecture with experimental results obtained using our pipelined schedules.

Main Content

For improved accessibility of PDF content, download the file to your device.

Donald Bren School of Information and Computer Sciences

Speedup of banded linear recurrences in the presence of resource constraints