OpenCL Fast Fourier Transform

Eric Bainville - May 2010, updated March 2011

Writing OpenCL code for single and double precision

Using the double precision floating-point type double in OpenCL kernels requires an extension. As of AMD APP SDK 2.3, AMD does not provide a fully compliant cl_khr_fp64 extension, but provides the cl_amd_fp64 extension instead. Our kernels use the following code to select single or double precision:


#if CONFIG_USE_DOUBLE

#if defined(cl_khr_fp64)  // Khronos extension available?
#pragma OPENCL EXTENSION cl_khr_fp64 : enable
#elif defined(cl_amd_fp64)  // AMD extension available?
#pragma OPENCL EXTENSION cl_amd_fp64 : enable
#endif

// double
typedef double real_t;
typedef double2 real2_t;
#define FFT_PI 3.14159265358979323846
#define FFT_SQRT_1_2 0.70710678118654752440

#else

// float
typedef float real_t;
typedef float2 real2_t;
#define FFT_PI       3.14159265359f
#define FFT_SQRT_1_2 0.707106781187f

#endif


The OpenCL C compiler defines a macro for each available extension, for example cl_khr_fp64. Testing this macro tells us whether the extension can be enabled with #pragma OPENCL EXTENSION cl_khr_fp64 : enable. The definition of CONFIG_USE_DOUBLE is passed as a compilation option to clBuildProgram.
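On the host side, the option can be passed as a -D definition in the options string of clBuildProgram. The sketch below is only an illustration: the helper fft_build_options is a hypothetical name, not part of the article's code, and it assumes the kernel source tests the macro with #if CONFIG_USE_DOUBLE.

```c
#include <stdio.h>

/* Hypothetical helper (not from the article): writes the options string
 * that selects the precision when the program is built.
 * Returns 0 on success, -1 if the buffer is too small. */
int fft_build_options(char *buf, size_t size, int use_double)
{
    int n = snprintf(buf, size, "-D CONFIG_USE_DOUBLE=%d", use_double ? 1 : 0);
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}

/* Possible use, assuming `program` and `device` were created earlier:
 *
 *   char options[64];
 *   if (fft_build_options(options, sizeof options, 1) == 0)
 *       err = clBuildProgram(program, 1, &device, options, NULL, NULL);
 */
```

Passing the value explicitly (0 or 1) rather than only defining the name keeps the same host code path for both precisions.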

In the kernel code, we will use the real_t and real2_t types instead of float or double, together with the FFT_... constants.

We are now ready to start experimenting with OpenCL FFT kernels: Radix-2 kernel.