If an algorithm was not designed for parallel execution, it cannot simply be ported to OpenCL as-is. We have the expertise in both algorithm design and GPU programming to take the shortest path to optimal performance.

Designing a parallel version of an algorithm is an intensive but important part of the speed-up process. By focusing on aspects such as caching, parallelisability and data locality while redesigning the algorithm from the ground up, we can maximise performance. For example, a recursive algorithm is often easy to understand, but it typically uses much more memory than an equivalent version that manages its own explicit stack. We have experience in converting algorithms for use in various parallel programming languages.
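To illustrate the recursion example, here is a minimal sketch in Python of one traversal written both ways. The flat array layout (`values`, `left`, `right`, with `-1` meaning "no child") and all function names are assumptions chosen for illustration, not code from an actual project. The recursive version creates one call frame per level of the tree, while the explicit-stack version stores only node indices.

```python
# Hypothetical flat tree layout: node i has children left[i] and right[i],
# and -1 means "no child". Flat arrays are an assumption made here because
# they resemble the buffers typically handed to a GPU kernel.

def sum_recursive(values, left, right, node=0):
    """Sum all node values; uses one call frame per tree level."""
    if node == -1:
        return 0
    return (values[node]
            + sum_recursive(values, left, right, left[node])
            + sum_recursive(values, left, right, right[node]))


def sum_explicit_stack(values, left, right, root=0):
    """Same sum, but the pending work is kept in an explicit stack of indices."""
    total = 0
    stack = [root]
    while stack:
        node = stack.pop()
        if node == -1:
            continue
        total += values[node]
        # Push both children; each entry is a single integer index.
        stack.append(left[node])
        stack.append(right[node])
    return total


def build_chain(n):
    """Build a degenerate tree of depth n: every node has only a right child."""
    values = list(range(1, n + 1))
    left = [-1] * n
    right = [i + 1 for i in range(n - 1)] + [-1]
    return values, left, right


if __name__ == "__main__":
    small = ([1, 2, 3, 4], [1, 3, -1, -1], [2, -1, -1, -1])
    print(sum_recursive(*small), sum_explicit_stack(*small))
    print(sum_explicit_stack(*build_chain(100_000)))
```

Both functions return the same result on small inputs. On the deep chain built by `build_chain(100_000)`, the recursive version would exceed CPython's default recursion limit, while the explicit-stack version completes. The same restructuring matters on GPUs, where OpenCL C does not support recursion at all.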

The algorithm document we deliver will help you fully understand what has changed, how the new design works, and how you can continue from where we left off.

In short:

  • Redesign of algorithms for optimal performance on parallel architectures
  • Implementation in OpenCL, CUDA, OpenMP, MPI and more
  • Full documentation of the redesign process

Check out our blog series on Programming Theories to learn more about what we do to make your software more scalable!

Want to know more? Book a meeting!

We are the acknowledged experts in CUDA, HIP, OpenCL, Vulkan and performance optimization for CPUs and GPUs. We proudly boast a portfolio of satisfied customers worldwide, and can also help you build high-performance software. Book a meeting to learn more.