CPU/GPU Multiprecision Mandelbrot Set

Eric Bainville - Dec 2009

Simple OpenCL implementation

On the GPU, the computation of the entire image is divided into a large number of threads. For example in our code we used 128x128 threads. Each thread computes a rectangle of nx by ny pixels of the image: nx = width / 128 and ny = height / 128.

The kernel code is similar to the C code, with two extra lines to map the thread coordinates returned by get_global_id(dim) to pixel coordinates in the output image. We have to use float values since double is not yet implemented in all OpenCL drivers.

__kernel void compute_hw(__global uint * a,
			 __global uint * colormap,
			 int nx,int ny,
			 int offset,int lda,
			 real_t leftX,real_t topY,
			 real_t stepX,real_t stepY,
			 int maxIt)
  for (int iy=0;iy<ny;iy++) for (int ix=0;ix<nx;ix++)
    int xpix = get_global_id(0)*nx + ix;
    int ypix = get_global_id(1)*ny + iy;

    real_t xc = leftX + (real_t)xpix * stepX;
    real_t yc = topY  - (real_t)ypix * stepY;
    int it;
    real_t x,y;
    x = y = (real_t)0;
    for (it=0;it<maxIt;it++)
      real_t x2 = x*x;
      real_t y2 = y*y;
      if (x2+y2 > (real_t)4) break; // Out!
      real_t twoxy = (real_t)2*x*y;
      x = x2 - y2 + xc;
      y = twoxy + yc;
    uint color = (it < maxIt)?colormap[it]:0xFF000000;
    a[offset+xpix+lda*ypix] = color;

In the next page, we see how to implement fixed-point arithmetic to work with reals more precise than double.