The strict time lower bound and optimal schedules for parallel prefix with resource constraints
- Author(s): Wang, Haigeng
- Nicolau, Alexandru
- Siu, Kai-Yeung S.
- et al.
Parallel prefix is a fundamental common operation at the core of many important applications, e.g., the Grand Challenge problems, circuit design, digital signal processing, graph optimizations, and computational geometry. Given x1 , ... , XN, parallel prefix computes x_1 o x_2 o ... ox_k, for 1 p(p+ 1)/2, we derive Harmonic Schedules and show that the Harmonic Schedules achieve the strict optimal time (steps), |2(N - 1)/(p + l)|. We also derived Pipelined Schedules, optimal schedules with |2( N - 1 )/(p + 1 )| + |(p - 1)/2| - 1 time, which take a constant overhead of |(p - 1)/2| time steps more than the strict optimal time but have the smallest loop body. Both the Harmonic Schedules and the Pipelined Schedules are simple, concise, with nice patterns of computation organizations, and easy to program. For prefix of N elements on p processors in N