Basic Concepts: out of resources with clEnqueueReadBuffer

Oops
“Oops! The best way to learn, when you love trial-on-error”™

In the series “Basic Concepts” various basics of GPGPU and OpenCL are discussed. This time we go into a typical one: when an error does not imply the actual problem. It is therefore good to have an overview of all errors with their descriptions.

When you get an out-of-resources error or when you get a crash when using clEnqueReadBuffer, you are sort of left in the dark. What does it mean? And how can you solve it?

Typical: one driver crashes/segfaults and another one gives this error.

Officially the error is defined as:

CL_OUT_OF_RESOURCES if there is a failure to allocate resources required by the OpenCL implementation on the device.

Which means that there can more reasons than the device being out of resources. A better name would have been CL_RESOURCE_ALLOCATION_ERROR. It can be thrown by various functions, but we focus on this one function. It cannot by thrown by clEnqueWriteBuffer, as that depends on the limits of the host.

Finding out the cause

The oldest trick of ‘m all: try to use the CPU and check what the error is then. CPUs are great to detect data-races (correct on CPU, not on GPU) and CPUs are a bit more stable when you have buggy code plus have more RAM. Be sure to install both Intel’s and AMD’s drivers.

Calling clFinish at each line, helps you pinpoint the actual line it happens or to get an error instead of a crash.

Then you have the following options:

  1. 9 out of 10 times you have a pointer problem at the host or are writing out of bounds. So you try to write to an illegal memory location, or try to cram in an 35×35 float* into 10x10x10 float* space (buffer-overflow). Double check the host memory-sizes, and if the host-pointers are correct.
  2. You read out of bounds on the device. Double-check the used memory-sizes.
  3. You might have hit a limit of the driver, such as the 5s timeout if the NVidia card is also being used as a display. Rule out you have used up all memory by using both smaller and larger(!) objects. Also note down memory object sizes over time. Be sure you clean up non-used objects. Fragmentation of device-memory can also be the problem it eventually goes wrong.

The last one I have not encountered myself, but found on the Nvidia forums. I recently had this error (type 1), because I had introduced clear naming in the code I was working on. When I introduced the standard ‘h_‘ and ‘d_‘ prefixes for all variables, I immediately found the cause.

Hope it has helped you understand the resource allocation error. If you found other reasons, please share via the comments and I’ll add it. If you have requests what to discuss in this series, let me know via Twitter or the comments.

Related Posts

FIA_F1_Austria_2018_Nr._33_Verstappen

The Art of Benchmarking

How fast is your software? The simpler the software setup, the easier to answer this question. The more complex the software, the more the answer will ...

5yearsSC

Birthday present! Free 1-day Online GPGPU crash course: CUDA / HIP / OpenCL

...  to the most normal projects. The training is very basic, but there will be a lot of aha!s that will help in making better ...

network-of-boxes

Problem solving tactic: making black boxes smaller

We are a problem solving company first, specialised in HPC - building software close to the processor. The more projects we finish, the more it's clea ...

stocks

Improving FinanceBench

If you're into computational finance, you might have heard of FinanceBench. It's a benchmark developed at the University of Deleware and is aimed a ...