OpenCL Multiprecision

Eric Bainville - Jan 2010

Introduction

In the GPU Benchmarks page we studied the relative performance of multithread C and OpenGL applied to some primitive operations related to multiprecision computations.

In this page, we will continue working on multiprecision computations in OpenCL only. Our objective is to implement a multiplication algorithm working on million-digit numbers.

Representation of numbers

A number will be represented by an array int x[n]. The associated value is Σ_i=0..n-1 x[i]*Bⁱ. B is the representation base. The coefficients x[i] are in the interval [-M,+M], with M≤2³¹-1 to fit on an int.

As soon as 2M+1>B, the representation is redundant: several representations exist for the same number. By choosing M large enough, we can eliminate sequential carry propagation, and design algorithms working in parallel on all elements of an array.

For example with B=100, the number 880 may be represented on two digits by:
+008 +080 =8.100+80,
+009 -020 =9.100-20,
+007 +180 =7.100+180.

In the next section we see how large number products are usually computed.