When I gave the Thalesian talk about OpenCL in London, it was quite hard to find a way to present OpenCL to a very diverse audience without falling back on listing code samples for 50 minutes; some knew just about everything about HPC, while others had only heard of CUDA and/or OpenCL. One of the subjects I chose to talk about was how to integrate OpenCL (or GPGPU in general) into existing software. The reason is that we have all built nice, cute little programs that were super-fast, but it’s another story when they must be integrated into enterprise-level software.
The most important step is making your software ready. Software engineering can be very hectic; managing it in a nice manner (e.g. with PRINCE2) just doesn’t fit into a deadline-mined schedule. We all know that doing so costs less time and money when looking at the total picture, but time is just against us.
Let’s exaggerate. New ideas, new updates of algorithms, new tactics and methods arrive at the wrong moment, Murphy-wise. It has to be done yesterday, so testing is only allowed once the code is already in production. Programmers just have to understand the cost of delay, but luckily someone comes to the rescue and says: “It is my responsibility”. And after a year of stress your software is the best in the company and gets labelled as “platform”, meaning that your software is chosen to include all the small ideas and scripts your colleagues have come up with, “which do almost the same as your software does, only a little different”. This will turn the platform into something unmanageable. That is a different kind of software acceptance!
Needs, Problems & Solutions
We need speed and easy conversion from one domain to the chosen IT domain, and we need it at a low price. The problem is that speed is limited by the hardware (and the software architecture) and there is no money for change management. The logical thing would seem to be to add GPGPU to speed up the calculations, and that’s it.
As you have read above, this should be the last step. First the software needs to be made ready, by separating the flow from the algorithms and using good version management. Instead of building more and more into one program, all small projects should stay small and provide a service. These small services can be linked together and monitored, while the report generation gathers its information in a flexible way.
Good design patterns are strategy and chain of responsibility, depending on taste. With the chain of responsibility pattern you need not worry about who handles the data, as it will be picked up by some algorithm implementation in the line. The strategy pattern helps to support change management: the beta report can use the algorithm implementation that had to be finished yesterday, while the live production report uses the latest stable one.
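To make the two patterns a bit more concrete, here is a minimal Python sketch. All names in it (`sum_squares_stable`, `sum_squares_beta`, `STRATEGIES`, `build_report`, `run_chain` and the handlers) are hypothetical and only serve as illustration; a real system would plug in its own algorithms and reports.

```python
from typing import Callable, Dict, List, Optional, Sequence

Algorithm = Callable[[Sequence[float]], float]


def sum_squares_stable(values: Sequence[float]) -> float:
    """Stands in for the latest stable implementation (hypothetical)."""
    return float(sum(v * v for v in values))


def sum_squares_beta(values: Sequence[float]) -> float:
    """Stands in for the implementation that had to be finished yesterday."""
    total = 0.0
    for v in values:
        total += v ** 2
    return total


# Strategy: each report type is configured with the implementation it runs.
STRATEGIES: Dict[str, Algorithm] = {
    "production": sum_squares_stable,
    "beta": sum_squares_beta,
}


def build_report(kind: str, values: Sequence[float]) -> str:
    algorithm = STRATEGIES[kind]
    return f"{kind}: {algorithm(values):.1f}"


# Chain of responsibility: a handler returns a result or None to pass on.
Handler = Callable[[Sequence[float]], Optional[float]]


def small_input_handler(values: Sequence[float]) -> Optional[float]:
    if len(values) <= 2:
        return sum_squares_stable(values)
    return None  # not my kind of data, let the next handler try


def general_handler(values: Sequence[float]) -> Optional[float]:
    return sum_squares_beta(values)


def run_chain(handlers: List[Handler], values: Sequence[float]) -> float:
    for handler in handlers:
        result = handler(values)
        if result is not None:
            return result
    raise ValueError("no handler accepted the data")
```

Swapping the implementation behind the beta report is then a one-line change in `STRATEGIES`, and a new accelerated handler can be put at the front of the chain without touching the others.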
The reason why you and I should think about engineering the software around the new, accelerated piece of library code is that GPGPU is new: it just might not work as expected, or it might not be able to deal with that specific kind of data nobody thought about. The kernels should have pre-conditions that check the validity of the input data, but by then it is actually too late.
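One way to read this is that the check belongs in the surrounding software, before any data reaches the accelerated code, together with a plain CPU path to fall back on. The sketch below assumes exactly that; `is_valid_input`, `cpu_scale`, `scale_with_fallback` and `unavailable_gpu_scale` are made-up names, and the "accelerated" function is a stand-in that simply fails, not a real OpenCL call.

```python
import math
from typing import Callable, List, Sequence

ScaleFn = Callable[[Sequence[float], float], List[float]]


def is_valid_input(values: Sequence[float]) -> bool:
    """Reject empty input and anything containing NaN or infinity."""
    return len(values) > 0 and all(math.isfinite(v) for v in values)


def cpu_scale(values: Sequence[float], factor: float) -> List[float]:
    """Plain CPU implementation that is always available."""
    return [v * factor for v in values]


def unavailable_gpu_scale(values: Sequence[float], factor: float) -> List[float]:
    """Hypothetical stand-in for an accelerated call that fails at run time."""
    raise RuntimeError("no usable device")


def scale_with_fallback(
    values: Sequence[float], factor: float, accelerated: ScaleFn
) -> List[float]:
    # Validate before the data gets anywhere near the accelerated code.
    if not is_valid_input(values):
        raise ValueError("input rejected before reaching the accelerated code")
    try:
        return accelerated(values, factor)
    except RuntimeError:
        # The accelerated path did not work as expected; use the CPU path.
        return cpu_scale(values, factor)
```

Called as `scale_with_fallback([1.0, 2.0], 3.0, unavailable_gpu_scale)`, the failing stand-in is caught and the CPU result is returned, so the report still gets generated.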
Hey, we just want speed!
And you’re right! All these extra layers could make the software less flexible. So that makes it even nicer to find ways of data transport that are fast enough. But look at your architecture first; loading data from a database server at another location could be a bigger bottleneck than the calculation throughput. Once you have measured every step in the process, you are ready to integrate OpenCL and buy that wannahave hardware from NVIDIA, AMD, Intel* or IBM*. As a certified Java programmer I insist, and I promise you will get more of the speed you want.
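Measuring every step can start very simply. The following sketch times each stage of a pipeline with `time.perf_counter`; the stages themselves (`load_from_remote_database`, `compute`, `make_report`) are invented for illustration, and the remote database is only simulated with a short sleep.

```python
import time
from typing import Any, Callable, Dict, List, Tuple

Stage = Tuple[str, Callable[[Any], Any]]


def run_timed(stages: List[Stage], data: Any) -> Tuple[Any, Dict[str, float]]:
    """Run the stages in order and record the wall-clock time of each."""
    timings: Dict[str, float] = {}
    for name, stage in stages:
        start = time.perf_counter()
        data = stage(data)
        timings[name] = time.perf_counter() - start
    return data, timings


def slowest_stage(timings: Dict[str, float]) -> str:
    return max(timings, key=lambda name: timings[name])


# Hypothetical stages; the sleep simulates a slow, remote database.
def load_from_remote_database(n: int) -> List[float]:
    time.sleep(0.05)
    return [float(i) for i in range(n)]


def compute(values: List[float]) -> float:
    return sum(v * v for v in values)


def make_report(total: float) -> str:
    return f"total={total:.1f}"


PIPELINE: List[Stage] = [
    ("load", load_from_remote_database),
    ("compute", compute),
    ("report", make_report),
]
```

Running `run_timed(PIPELINE, 1000)` returns the report together with the time spent per stage. In this made-up setup the load stage dominates, in which case faster calculations alone would hardly change the total.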
StreamHPC will release a cross-platform plugin for Eclipse that makes it easy to write OpenCL programs. For those who don’t know Eclipse: it is a cross-platform, multi-language IDE. For any requests, please comment below or send me an e-mail. For pre-orders and questions about e.g. volume licenses, please mail us at sales[at]StreamHPC[dot]eu