
You’ve seen the speedups possible on GPUs. We secretly know that many of these techniques would also work on modern multi-core CPUs. If after the first optimisations the GPU still gets an 8x speedup, the GPU is the obvious choice. When it’s 2x, would the better choice be a bigger CPU or a bigger GPU? Currently the GPU is chosen more often.
Now that AMD and Intel have 28+ core CPUs, the answer to that question might lean towards the CPU. With a CPU that has 32 cores and 256-bit vector computations via AVX2, 32 double4s can be computed each clock cycle. A 16-core AVX1 CPU could work on 16 double2s, which is only a fourth of that performance. Actual performance compared to peak performance is comparable to GPUs here. Continue reading “CPU Code modernisation – our hidden expertise”
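To make the arithmetic in that comparison concrete, here is a minimal sketch. It assumes one vector instruction per core per clock cycle and ignores clock speed, FMA, and multiple execution ports. The function name `doubles_per_cycle` is made up for illustration, and the 128-bit width for the 16-core case simply follows the double2 figure in the text.

```python
def doubles_per_cycle(cores: int, vector_bits: int) -> int:
    """Peak number of 64-bit doubles processed per clock cycle.

    Assumption: every core issues exactly one vector operation per cycle.
    """
    lanes = vector_bits // 64  # doubles that fit in one vector register
    return cores * lanes


# 32 cores with 256-bit vectors: 32 x double4
wide = doubles_per_cycle(cores=32, vector_bits=256)

# 16 cores working on double2, i.e. 128 bits per vector
narrow = doubles_per_cycle(cores=16, vector_bits=128)

ratio = narrow / wide
print(wide, narrow, ratio)
```

Under these assumptions the smaller CPU reaches a fourth of the peak throughput of the larger one, which matches the claim above. Real-world numbers depend on memory bandwidth and on how well the code vectorises.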

A month ago IWOCL (OpenCL workshop) and DHPCC++ (C++ for GPUs) took place. Meanwhile many slides and posters have been
Most of our projects are around performance optimisation, but we’re cleaning up bugs too. This is because you can only speed up software when certain types of bugs are cleared out. A few months ago, we got a different type of request: whether we could solve bugs in Mesa 3D that appear in games.
In a perfect world all software is fast, giving us time to do actual work. Unfortunately we live in an imperfect world, and we have to spend extra time controlling our anger as the software keeps us waiting.







ROCm is AMD’s open-source Linux driver that brings compute to HSA hardware. It does not provide graphics and therefore focuses on monitor-less applications such as machine learning, math, media processing, machine vision, large-scale simulations and more.

As of 1 April we are 7 years old. Because of all the jokes on that day, this post is a bit late.
With 


The main strength of Artificial Intelligence is that it is easy for anybody to understand. This results in new applications in all industries at a rapid pace. Are new possibilities being created, or have they always been there? The answer is both.


Would you like to run CUDA kernels on the OpenCL framework? Or Python or Rust? SPIR-V is the answer! Where source-to-source translations had several limitations, SPIR-V 1.1 even supports higher-level languages like C++.
How interesting is SPIR-V really?