Evaluation of cache-based superscalar and cacheless vector architectures for scientific computations
eScholarship
Open Access Publications from the University of California

  • Author(s): Oliker, Leonid
  • Canning, Andrew
  • Carter, Jonathan
  • Shalf, John
  • Skinner, David
  • Ethier, Stephane
  • Biswas, Rupak
  • Djomehri, Jahed
  • Van der Wijngaart, Rob
  • et al.
Abstract

The growing gap between sustained and peak performance for scientific applications is a well-known problem in high-end computing. The recent development of parallel vector systems offers the potential to bridge this gap for many computational science codes and deliver a substantial increase in computing capabilities. This paper examines the intranode performance of the NEC SX-6 vector processor and the cache-based IBM Power3/4 superscalar architectures across a number of scientific computing areas. First, we present the performance of a microbenchmark suite that examines low-level machine characteristics. Next, we study the behavior of the NAS Parallel Benchmarks. Finally, we evaluate the performance of several scientific computing codes. Results demonstrate that the SX-6 achieves high performance on a large fraction of our applications and often significantly outperforms the cache-based architectures. However, certain applications are not easily amenable to vectorization and would require extensive algorithm and implementation reengineering to utilize the SX-6 effectively.
