For years we have been complaining on this blog about what AMD was lacking and what needed to improve. And as you might have concluded from the title of this blog post, there has been a lot of progress.
AMD is back! It will all come together in early 2017, but you'll see a lot of progress already in the coming weeks and months.
AMD quietly recognised and solved various new problems in HPC, becoming the hidden innovator everybody needed.
This blog post gives an overview of how AMD managed to come back and what it took to get there. Their market cap supports the story, as you can see.

Are you around at ISC and do you have an opinion on portable open standards? Then you should join the discussion with other professionals there. Some suggestions for discussion topics:

We have been talking about GPUs, FPGAs and CPUs a lot, but there are more processors that can solve specific problems. This time I'd like to give a quick introduction to grid-processors.
OpenCL header files
Want to get an overview of what the Heterogeneous System Architecture (HSA) does, or want to know what terminology has changed since version 1.0? Read on.
Ten years ago we had CPUs from Intel and AMD, and GPUs from ATI and NVidia. There was even another CPU-maker, VIA, and other GPU-makers, S3 and Matrox. Things are different now. Below I briefly discuss the most noticeable processors from each of the big three.
Tesla K80 (Kepler)
Radeon Nano and FirePro S9300X2 (Fiji)
Xeon Phi Knights Landing
During the panel discussion some very interesting questions were asked that I'd like to share with you.

The coming month we're travelling every week. This creates a lot of opportunities to meet the StreamHPC team! For appointments, send an email to 





Random numbers are important elements in stochastic simulations, but they also show up in machine learning and in applications of Monte Carlo methods, such as in computational finance, fluid dynamics and molecular dynamics. These are classical fields in high-performance computing, in which StreamHPC has experience.
If there were one rule for getting the best performance, it would be: avoid data transfers. It is therefore important to have lots of bandwidth and GFLOPS per processor, not to simply add up those numbers across processors. Everybody who has worked with MPI knows why: transferring data between processors can kill performance. So the more that is packed into one chip, the better the results.




At the University of Newcastle they use OpenCL for researching the performance balance between software and hardware. This resource management isn't limited to shared-memory systems, but extends to mixed architectures, where batches of co-processors and other resources make it a much more complex problem to solve. They chose OpenCL because it provides both inter-node and intra-node resource management.