In the release notes for 378.66 graphics drivers for Windows (February 2017), NVIDIA officially spoke about supporting OpenCL 2.0 for the first time. Unfortunately, this is partial support only and, as NVIDIA said, these new [OpenCL 2.0] features are available for evaluation purposes only.
We did our own tests on a GTX 1080 on Windows and could confirm that for Windows the green team is halfway there. NVIDIA still has to implement pipes, enable non-uniform work-group sizes (this happens when in ND-range global_work_size is not divisible by the local_work_size), and fix a few bugs in device side enqueue.
Today we decided to test out NVIDIA latest driver (378.13) for 64-bit Linux and check its support for OpenCL 2.0.
NVIDIA, OpenCL 2.0 and Linux
Just like on Windows, our GTX 1080 reports that it is an OpenCL 1.2 devices. It is understandable since support for OpenCL 2.0 is only in beta stage. In the following table you’ll find an overview of the 2.0 functions supported by this Linux driver.
|OpenCL 2.0 feature||Supported||Notes|
|SVM||Yes||Only coarse-grained SVM is supported. Fine-grained SVM is not.|
|Device side enqueue||Partially. Surprisingly, it|
works better than
|Almost OpenCL programs with device side queue we have tested work.|
Some advanced examples with multi-level device side kernel enqueuing
and/or CLK_ENQUEUE_FLAGS_WAIT_WORK_GROUP fail. When using device
side queue, it's only possible to use 1D nd-range with uniform work
groups (or without specifying local size). 2D and 3D nd-ranges
|Pipes||No||Pipe functions are defined in libOpenCL.so in 378.13 drivers,|
but using them cause run-time errors.
|Generic address space||Yes|
|C11 Atomics||Partially||Using atomic_flag_* functions cause an CL_BUILD_ERROR error.|
The host-side functions clSetKernelExecInfo(), clCreateSamplerWithProperties() and clCreateCommandQueueWithProperties() are also present and working.
As you can see, the support for OpenCL 2.0 on Linux is almost exactly the same as on Windows. But in contrast with the Windows-drivers, we were able to successfully compile and run several more kernels that use device side queue. It may indicate that this feature is being actively developed and maybe in future drivers it will work much better – for both Linux and Windows.
What you can do to make it better
As NVIDIA only adds new functionality to OpenCL driver when requested, it is very important that they receive these requests. So when you or your employer is a paying customer, do keep requesting the features you need. Know that NVIDIA knows that lacking required functionality will be bad for their sales.