Part of a reaction to my earlier post was “VB Programmers span the whole range from hobbyists to scientists and professions. There is no reason that VB programmers should be left out of the loop in GPU programming. (…) CUDA and OpenCL are fundamentally lower level languages that do not have the same level of user-friendlyness that VB gives, and so they require a bit more care to program. (…)”. By quoting only parts, I have probably put the emphasis in the wrong place, but it gave me an idea: what would a “Visual Basic Wizard for OpenCL” look like?
It is actually very interesting to think about how to bring GPGPU power to a programmer who’s used to “and then and then and then”: how do you get parallel processing into the mind of an iterative thinker? Simplification is not easy!
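To make the difference concrete, here is a minimal sketch in plain Python (no GPU involved). The names `add_kernel` and `run_kernel` are made up for this illustration; `run_kernel` only stands in for what an OpenCL runtime would do, and it deliberately walks the work-items backwards to show that a kernel must not depend on the order of execution.

```python
# Iterative view: "and then, and then, and then".
def add_iterative(a, b):
    out = []
    for i in range(len(a)):
        out.append(a[i] + b[i])
    return out


# Parallel view: describe the work for ONE element only.
# 'gid' plays the role of OpenCL's get_global_id(0).
def add_kernel(gid, a, b, out):
    out[gid] = a[gid] + b[gid]


# Hypothetical stand-in for the runtime: it calls the kernel once per
# work-item. Here that happens sequentially and in reverse order; a real
# device would run many work-items at the same time, in no fixed order.
def run_kernel(kernel, global_size, *args):
    for gid in reversed(range(global_size)):
        kernel(gid, *args)


a = [1, 2, 3, 4]
b = [10, 20, 30, 40]
out = [0] * len(a)
run_kernel(add_kernel, len(a), a, b, out)
```

Both versions give the same result; the difference is that the kernel version never says in which order the elements are handled.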
By thinking about an OpenCL wizard, you start to understand the bottlenecks of, and the need for, OpenCL initialisation.
Actually, this would be better built into an existing framework like OpenMP (or something similar for Python, Java, etc.), or the IDE could hint that a standard function could be replaced by a GPGPU version. But here we just want a wizard that generates “My First OpenCL Program” and puts a smile on the face of programmers who use their mouse a lot.
The screenshots were taken from a tutorial on Oracle integration and then edited. I use British spelling, so I hope nobody will believe these screens come from Microsoft.
We want a new project and luckily there is a “New OpenCL VB Project Wizard”. I was always amazed by the great names those wizards had.
The order of the functions would be based on common practice (if there is any connection between them), or would otherwise be random. Incompatible kernels will be greyed out to avoid problems. For each ‘function’ there are various kernel templates in the library. I put a few here, but a category-like browser would be better. I thought a lot about this step, as it is the most crucial one, so I leave it open for your comments for now.
A wizard always wants to think for the user, so a sample of a stream must be provided to determine its type. Output can also be a stream, or XLS of course.
The first question asks, in a difficult way, how initialisation should be done: well prepared once at the start, or reinitialised each time? The second question asks how aggressively the program may use resources. Selecting the precision has many consequences: extensions will be needed, different kernels will be selected, etc.
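The initialisation question can be illustrated with a small Python sketch. `FakeRuntime` is a hypothetical stand-in for the expensive OpenCL setup (finding a platform, creating a context, building the program); it only counts how often that setup happens and does no GPU work at all.

```python
class FakeRuntime:
    """Hypothetical stand-in for OpenCL setup; not a real API."""

    setups = 0  # counts how often the (pretend) expensive setup ran

    def __init__(self):
        FakeRuntime.setups += 1

    def run(self, data):
        # Pretend kernel: double every element.
        return [x * 2 for x in data]


def process_reinit(batches):
    # "Reinitialisation each time": a fresh setup for every batch.
    return [FakeRuntime().run(batch) for batch in batches]


def process_prepared(batches):
    # "Well prepared from the start": one setup, reused for all batches.
    runtime = FakeRuntime()
    return [runtime.run(batch) for batch in batches]


batches = [[1, 2], [3, 4], [5, 6]]

FakeRuntime.setups = 0
result_reinit = process_reinit(batches)
setups_reinit = FakeRuntime.setups

FakeRuntime.setups = 0
result_prepared = process_prepared(batches)
setups_prepared = FakeRuntime.setups
```

The results are identical, but the first variant pays the setup cost once per batch and the second only once in total. Whether that matters depends on how often the generated program is called.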
Which questions were not asked? And for what reason?
- How many available devices should be used? Default is all.
- Fall back to CPU? No, that would complicate the program enormously.
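The two defaults above could look roughly like this in generated code. This is only a sketch in Python: the device records are plain dictionaries invented for the example, not the output of a real OpenCL query.

```python
def select_devices(devices):
    """Use every GPU that was found; do not fall back to the CPU."""
    gpus = [d for d in devices if d["type"] == "GPU"]
    if not gpus:
        # No CPU fallback: stop with a clear message instead.
        raise RuntimeError("No GPU device found; no CPU fallback was generated")
    return gpus


# Hypothetical device list, as a wizard-generated program might see it.
found = [
    {"name": "Example CPU", "type": "CPU"},
    {"name": "Example GPU 0", "type": "GPU"},
    {"name": "Example GPU 1", "type": "GPU"},
]

selected = select_devices(found)
selected_names = [d["name"] for d in selected]
```

Failing early keeps the generated program small: there is only one code path to explain to the user.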
This list should be huge, covering every product line of every brand. But since it must be kept simple, only high-end devices are asked for. I know AMD Fusion is not on the market yet, and the first version will probably not have a high-end GPU, but it would be great.
As you can see, the questions about extensions are based on the precision demanded in the behaviour screen.
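A rough sketch of how a wizard might turn the chosen precision into kernel source, written in Python as a simple string generator. The function name `kernel_source` and the `square` kernel are invented for this example; `cl_khr_fp64` and `cl_khr_fp16` are the Khronos extension names for double and half precision, and whether a device supports them still has to be checked separately.

```python
# Which extension (if any) a precision needs. Single precision needs none.
EXTENSION_FOR = {
    "half": "cl_khr_fp16",
    "float": None,
    "double": "cl_khr_fp64",
}


def kernel_source(precision):
    """Return OpenCL C source for a toy 'square' kernel in the given precision."""
    ext = EXTENSION_FOR[precision]  # raises KeyError for unknown precisions
    pragma = f"#pragma OPENCL EXTENSION {ext} : enable\n" if ext else ""
    return (
        pragma
        + f"__kernel void square(__global {precision}* data) {{\n"
        + "    size_t gid = get_global_id(0);\n"
        + "    data[gid] = data[gid] * data[gid];\n"
        + "}\n"
    )


double_src = kernel_source("double")
float_src = kernel_source("float")
```

Printing `double_src` shows the pragma line on top of the kernel; `float_src` contains the kernel only.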
So what now?
Now that you’ve seen the pictures, you probably have better ideas. I even edited some screens again while I was writing the final image captions. Please let me know what you think, since it will help others to simplify their wrappers around OpenCL.
Edit 7-oct-2010: something like this actually exists for CUDA in Visual Studio: Cuda VS Wizard.
Disclaimer: StreamHPC gives training in GPGPU/OpenCL. We like to show you how to learn GPGPU and OpenCL and to share our enthusiasm via this blog, but of course our training is more thorough.