We help our customers get faster, more responsive and/or more precise software. But what does that mean? What is it that we do here in Amsterdam?
Our work is for the most part under NDA, so unfortunately we cannot share some of the work we’re very proud of.
Below is a selection of blog posts discussing our demos, and GitHub links showing our work and open source software.
- Building rocRAND. The world’s fastest random number generator is built for AMD GPUs, and it’s open source. With random numbers generated at several hundred gigabytes per second, the library makes it possible to speed up existing code many times over. The code is faster than Nvidia’s cuRAND and is therefore the preferred library to use on any high-end GPU.
- Building AMD’s optimized version of CUB. Highly optimized for Vega20 GPUs. Porting CUB-based software to AMD is now a lot simpler.
- Building AMD’s optimized version of Thrust. Highly optimized for Vega20 GPUs. Lots of CUDA software is Thrust-based, and it is now no longer locked in.
- Porting Gromacs from CUDA to OpenCL. Until we ported the simulation software at the end of 2014, it had been CUDA-only. Manually porting all the code took several man-months. You can now download the source, build it and run it on AMD/Intel hardware – see here for more info. Everything is open source, so you can see our code.
- Porting Manchester’s UNIFAC to OpenCL@XeonPhi. Even though XeonPhi Knights Corner is not a very performant accelerator, we managed to get a 160x speedup over single-threaded code. Most of the speedup is due to clever code optimisations.
- Android video filter demo [to be published]. A real-time Android app in which several real-time OpenGL filters are applied to the webcam stream to make it look like an old movie. This was a proof of concept to show we could apply our knowledge to Android and OpenGL.
- Speeding up Excel. A heavy financial algorithm is offloaded to a GPU, resulting in a big speedup.
- Flooding simulation. Software that simulates flooding of land, which we ported to multi-GPU on OpenCL and got a 35x speedup over MPI.
- Cartoonizer. The webcam or video stream is “cartoonized” using several image filters on an FPGA using OpenCL.
Do you need a secret weapon too? We like to work with you to build fast software together. Get in touch to discuss your needs and goals.
Want to know more? Get in contact!
We are the acknowledged experts in OpenCL, CUDA and performance optimization for CPUs and GPUs. We proudly boast a portfolio of satisfied customers worldwide, and can also help you build high-performance software. E-mail us today.