Updated: OpenCL and CUDA programming training – now online

Update: due to Corona, the Amsterdam training has been cancelled. We’ll offer the training online on dates that better suit the participants.

As it has been very busy here, we have not done public trainings for a long time. This year we’re going to train future GPU-developers again – online. For now it’s one date, but we’ll add more dates in this blog-post later on.

If you need to learn solid GPU programming, this is the training you should attend. The concepts can be applied to other GPU-languages too, which makes it a good investment for any probable future where GPUs exist.

This is a public training, which means there are attendees from various companies. If you prefer not to be in a public class, get in contact to learn more about our in-company trainings.

It includes:

  • Four days of online training;
  • Free code-review after the training, to get feedback on what you created with the new knowledge;
  • 1 month of limited support, so you can avoid StackOverflow;
  • Certificate.

Trainings are given by employees of Stream HPC, all of whom have extensive experience applying the techniques you are going to learn.


Most trainings have around 40% lectures, 50% lab-sessions and 10% discussions.


  • The training is guaranteed to take place. If you are the only one, you’ll simply get personal training.
  • The schedule below is indicative when it comes to lab-sessions: some lab-sessions can be (partly) skipped or replaced if time runs short.

Day 1: OpenCL/CUDA Foundations

This is close to our standard OpenCL crash course. We start with the basic concepts, write our first OpenCL program, discuss the architectures, discuss the difficulties of GPU-programming, compare to CUDA and C++, and end with writing simple code that runs on a laptop (CPU or GPU).

  • Generic GPGPU model
  • OpenCL language
  • CUDA/HIP language
  • Memory objects
  • General hardware overview
  • Task-parallelism and data-parallelism
  • Mapping code to CPUs and GPUs
  • Comparison to other languages like SYCL, OpenMP and OpenACC
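
To give a flavour of the data-parallel model from the list above, here is a minimal sketch in plain C++, not taken from the course material. It emulates on the CPU how a kernel is written for a single index and then run once per element. The names `vector_add_kernel` and `launch_vector_add` are hypothetical and chosen for illustration only; in real OpenCL or CUDA code the index would come from `get_global_id(0)` or from the block and thread indices.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical "kernel": the work done by ONE work-item (OpenCL) or thread (CUDA).
// gid stands in for get_global_id(0) or blockIdx.x * blockDim.x + threadIdx.x.
void vector_add_kernel(std::size_t gid, const float* a, const float* b, float* c) {
    c[gid] = a[gid] + b[gid];
}

// CPU stand-in for a kernel launch: run the kernel once per index.
// On a GPU these iterations would run concurrently, in no guaranteed order.
std::vector<float> launch_vector_add(const std::vector<float>& a,
                                     const std::vector<float>& b) {
    assert(a.size() == b.size());  // both inputs must cover the same index space
    std::vector<float> c(a.size());
    for (std::size_t gid = 0; gid < a.size(); ++gid) {
        vector_add_kernel(gid, a.data(), b.data(), c.data());
    }
    return c;
}
```

Calling `launch_vector_add` with two vectors of equal length returns their element-wise sum. The point of the sketch is that the kernel body never loops over the data itself: the loop is what the GPU runtime replaces with parallel hardware.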

Day 2 + 3: Optimise a program from scratch

During these two days, we will gradually raise the requirements and touch on all important aspects of OpenCL programming.

As we have several GPU-servers available for development, we can provide you with login-credentials, a git-account and a short how-to for using the extra GPUs from NVIDIA and AMD, and optionally Intel. This way you can use different graphics cards to find out which optimisations work and which don’t.

Optimisations we discuss during these days:

  • Host-code
  • Data-flow
  • Memory handling
  • Data transfer speed increase
  • Memory alignments
  • Scheduling
  • Parallelism increase
  • Latency reduction
  • The most important kernel-optimisations from the different vendors

Lab sessions

During the days we use various lab-sessions to support the explained theory. We’ll discuss most of the following:

  • Clinfo
  • ColorBalance
  • Matrix-multiplication
  • Convolution
  • Histogram
  • Contrast Stretching
  • Frame correlation
  • Fixing non-optimal and broken code

Day 4: Tools, special subjects and final project


There are various tools you need to understand to get code that runs well on GPUs. On this day you will learn to use various vendor-provided and open-source tools to help you analyse your code.

The tools are discussed and used along the following four subjects:

  • Software correctness: data-races.
  • Profiling: timing and finding hot-spots
  • Reporting: let tools create useful reports.
  • Debugging: finding bugs and learning what actually happens.

We discuss tools last, as you need to understand the concepts before having a tool solve problems for you.

Special subjects

Often these are different per training, as these are defined by the attendees of the training. Subjects that have been discussed more frequently:

  • Splitting work over CPU&GPU: Running different kernels on CPU and GPU, to make maximum use of the whole computer.
  • GL-CL interop: Understanding how interoperability with OpenGL works. With this the results can be shown on the screen with minimal latency.
  • Optimising data-throughput when using multiple kernels.

Final Project

In the final project you will need to use everything you’ve learnt and try to write the fastest code in the class.


  • Fixing non-optimal and/or broken code


We will send a questionnaire to understand the needs of each trainee. For larger groups, we also have a separate phone call with the representative.


Attendees need to bring their own laptops for the lab sessions. The only hardware requirement is that the laptop has an OpenCL-capable CPU or GPU with correctly installed OpenCL drivers. A complete list of OpenCL-compliant devices can be found here. Laptops also need to have the following software installed:

  • CMake 3.1 or higher: lab-sessions are provided as CMake projects, so almost all IDEs are supported.
  • An IDE or text-editor with coding-support.
  • One or more OpenCL SDKs, for each OpenCL-device in the computer.
  • A C/C++ compiler suite. Examples of supported suites are Microsoft Visual Studio, Apple Xcode and GNU GCC/G++. We will send a small project in advance, which can be used to test compilers.
  • ssh/putty: optional. Needed when working on Stream HPC’s GPU servers.
  • git: optional. When easy transfer of lab-work between several computers/servers is needed, this is required. Lab-sessions can also be downloaded as zip-files.


We prefer to focus on the core of the training, and therefore we require a set of skills up front. This avoids other attendees having to wait.

Attendees are required to have intermediate programming experience and good C/C++ knowledge. This means that you should at least be able to write an application in C/C++ from scratch, debug it at a high level with GDB and be very (!) comfortable working with pointers. If this is not the case, do contact us so we can provide pre-training material.

Info Online training


The costs are €2500. Additional days for personal training and consultancy are excluded – please ask us for a quote.

Reserve your spot today!

If you’re interested, do fill in the pre-training questionnaire already.

You can start your reservation by email or phone.
Email: info@streamhpc.com
Phone: +31 854865760