Many research papers claim enormous speed-ups from using an accelerator. In our experience, a large part of such a speed-up comes from code modernisation (parallelisation and optimisation) rather than from the accelerator itself, which makes the claim look false. That’s why we offer peer reviews of CUDA and OpenCL software at half our normal rate. The final cost depends on the size and complexity of the code.
We will profile your CPU and accelerator code on our machines and review the code. The results separate the effect of the code modernisation from the effect of using the accelerator (GPU, Xeon Phi, FPGA). With this we hope to encourage research to pay more attention to the effect of code modernisation than to “miracle hardware”.
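To illustrate how such a split can be computed, here is a minimal sketch. It assumes three measured run times for the same workload: the original CPU code, the modernised CPU code, and the accelerator code. The function name `split_speedup` and the timings are made up for illustration and are not results from an actual review.

```python
def split_speedup(t_original_cpu, t_modernised_cpu, t_accelerator):
    """Split a total speed-up into a modernisation part and an accelerator part.

    All arguments are run times in seconds for the same workload.
    """
    # Gain from parallelising and optimising the code, still on the CPU.
    modernisation = t_original_cpu / t_modernised_cpu
    # Gain from the accelerator, measured against the modernised CPU code.
    accelerator = t_modernised_cpu / t_accelerator
    # The figure a paper would report when comparing against the original code.
    total = t_original_cpu / t_accelerator
    return modernisation, accelerator, total


# Hypothetical timings in seconds, chosen only to show the arithmetic.
modernisation, accelerator, total = split_speedup(120.0, 10.0, 1.25)
print(f"code modernisation: {modernisation:.0f}x")
print(f"accelerator:        {accelerator:.0f}x")
print(f"total:              {total:.0f}x")
```

With these made-up numbers the total is 96x, but only 8x of it is due to the accelerator. The two factors multiply to give the total, so comparing against unoptimised CPU code credits the hardware with both.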
Don’t misunderstand us: GPUs can still deliver an average speed-up of 8x (a 700% speed improvement) over optimised code, which is still huge! But it’s simply not the 30-100x speed-up claimed in the slide on the right.
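The relation between a speed-up factor and a percentage improvement can be sketched as follows. The helper name `improvement_percent` is hypothetical; it only encodes the convention that a factor of 1x means no improvement.

```python
def improvement_percent(speedup_factor):
    """Convert a speed-up factor (e.g. 8 for 8x) into a percentage improvement."""
    # A factor of 1x is the baseline, so subtract it before scaling to percent.
    return (speedup_factor - 1.0) * 100.0


for factor in (8, 30, 100):
    print(f"{factor}x is a {improvement_percent(factor):.0f}% speed improvement")
```

By this convention 8x corresponds to 700%, while the claimed 30-100x would correspond to 2900% to 9900%.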