Skip to main content
eScholarship
Open Access Publications from the University of California

Software pipelining of non-vectorizable loosely nested loops

Abstract

This paper presents a new technique to parallelize non-vectorizable loosely nested loops. Loosely nested loops represent the general form of nested loops. Previously, the attempt of parallelizing nested loops, e.g. the wavefront method, has been limited to tightly nested loops, a restricted class of general nested loops. Our method overcomes this limitation. It consists of two steps: computing the exact time step of each statement instance and capturing and expressing the parallel statement instances whose time steps are equal. We provide efficient algorithms for both steps and illustrate their practieality by parallelizing the well-known SOR algorithm. The parallel SOR was run on a Sequent machine with 1 to 7 physical processors resulting significant speed-ups.

Main Content
For improved accessibility of PDF content, download the file to your device.
Current View