eScholarship
Open Access Publications from the University of California

A Parallel Error Diffusion Implementation on a GPU

Published Web Location

https://doi.org/10.1117/12.872616
Abstract

In this paper, we investigate the suitability of the GPU for a parallel implementation of pinwheel error diffusion. We demonstrate a high-performance GPU implementation by efficiently parallelizing and unrolling the image processing algorithm. Our GPU implementation achieves a 10-30x speedup over a two-threaded CPU error diffusion implementation with comparable image quality. We have conducted experiments to study the performance and quality tradeoffs across different image block sizes. We also present an assembly-level performance analysis to understand the performance bottlenecks.
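To make the block-size tradeoff concrete, here is a minimal sketch of block-wise error diffusion. It is not the pinwheel algorithm from the paper: it assumes the standard Floyd-Steinberg weights (7/16, 3/16, 5/16, 1/16) and a plain raster scan within each block, and the names `diffuse_block` and `halftone` are hypothetical. Because quantization error never crosses a block boundary in this sketch, each block could in principle be handled by an independent worker, which is the property a parallel implementation relies on. Smaller blocks expose more parallelism but discard more error at their edges.

```python
def diffuse_block(img, out, r0, r1, c0, c1, threshold=128.0):
    """Halftone one block of img into out using Floyd-Steinberg weights.

    Assumption: error is confined to the block, so blocks are independent.
    """
    # Work on a private float copy so accumulated error stays inside the block.
    work = [[float(v) for v in img[r][c0:c1]] for r in range(r0, r1)]
    h, w = r1 - r0, c1 - c0
    for y in range(h):
        for x in range(w):
            old = work[y][x]
            new = 255.0 if old >= threshold else 0.0
            out[r0 + y][c0 + x] = int(new)
            err = old - new
            # Push the quantization error to unprocessed neighbors in the block.
            if x + 1 < w:
                work[y][x + 1] += err * 7 / 16
            if y + 1 < h:
                if x > 0:
                    work[y + 1][x - 1] += err * 3 / 16
                work[y + 1][x] += err * 5 / 16
                if x + 1 < w:
                    work[y + 1][x + 1] += err * 1 / 16


def halftone(img, block=None):
    """Halftone a grayscale image (rows of 0..255 values), block by block.

    block=None treats the whole image as a single block (fully serial).
    """
    h, w = len(img), len(img[0])
    bs = block if block else max(h, w)
    out = [[0] * w for _ in range(h)]
    # Each (r0, c0) iteration is independent of the others in this sketch.
    for r0 in range(0, h, bs):
        for c0 in range(0, w, bs):
            diffuse_block(img, out, r0, min(r0 + bs, h), c0, min(c0 + bs, w))
    return out
```

Calling `halftone(image, block=8)` on a grayscale image stored as a list of rows returns a same-sized image whose pixels are either 0 or 255. Varying `block` is a simple way to observe how block size changes the visual result.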
