How to install OpenCL on Windows

Posted by Anca Hamuraru on 16 March 2015 with 19 Comments

Getting your Windows machine ready for OpenCL is rather straightforward. In short, you only need the latest drivers for your OpenCL device(s) and you’re ready to go. Of course, you will need to add an OpenCL SDK in case you want to develop OpenCL applications but that’s equally easy.

Before we start, a few notes:

The steps described herein have been tested on Windows 8.1 only, but should also apply to Windows 7 and Windows 8.
We will not discuss how to write an actual OpenCL program or kernel, but focus on how to get everything installed and ready for OpenCL on a Windows machine. This is because writing efficient OpenCL kernels is almost entirely OS independent.

If you want to know more about OpenCL and you are looking for simple examples to get started, check the Tutorials section on this webpage.

Running an OpenCL application

If you only need to run an OpenCL application without getting into development stuff then most probably everything already works.

If OpenCL applications fail to launch, then you need to take a closer look at the drivers and hardware installed on your machine:

Check that you have a device that supports OpenCL. All graphics cards and CPUs from 2011 and later support OpenCL. If your computer is from 2010 or before, check this page. You can also find a list of OpenCL-conformant products on the Khronos website.
Make sure your OpenCL device driver is up to date, especially if you’re not using the latest and greatest hardware. With certain older devices OpenCL support wasn’t initially included in the drivers.

Here is where you can download drivers manually:

Intel has hidden them a bit, but you can find them here with support for OpenCL 2.0.
AMD’s GPU-drivers include the OpenCL-drivers for CPUs, APUs and GPUs, version 2.0.
NVIDIA’s GPU-drivers mention mostly CUDA, but the drivers for OpenCL 1.2 are there too.

In addition, it is always a good idea to check for any other special requirements that the OpenCL application may have. Look for device type and OpenCL version in particular. For example, the application may run only on OpenCL CPUs, or conversely, on OpenCL GPUs. Or it may require a certain OpenCL version that your device does not support.

A great tool that will allow you to retrieve the details for the OpenCL devices in your system is Caps Viewer.
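If you would rather check the version requirement in code, here is a minimal sketch in plain C. It assumes you already have the device's CL_DEVICE_VERSION string (as returned by clGetDeviceInfo), which the OpenCL specification defines as "OpenCL <major>.<minor> <vendor-specific information>". The function names parse_opencl_version and device_meets_version are made up for this illustration and are not part of any SDK.

```c
#include <stdio.h>

/* Parses a CL_DEVICE_VERSION string of the form
 * "OpenCL <major>.<minor> <vendor-specific information>".
 * Returns 1 on success and 0 if the string has another shape.
 * (Hypothetical helper, not part of the OpenCL API.) */
int parse_opencl_version(const char *version, int *major, int *minor)
{
    return sscanf(version, "OpenCL %d.%d", major, minor) == 2;
}

/* Returns 1 if the device version is at least the required version,
 * else 0. An unparsable string is treated as "not supported". */
int device_meets_version(const char *device_version,
                         int required_major, int required_minor)
{
    int major, minor;

    if (!parse_opencl_version(device_version, &major, &minor))
        return 0;

    if (major != required_major)
        return major > required_major;
    return minor >= required_minor;
}
```

For example, device_meets_version("OpenCL 1.2 CUDA", 2, 0) returns 0, which would explain why an application that requires OpenCL 2.0 refuses to start on that device.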

Developing OpenCL applications

Now it’s time to put the pedal to the metal and start developing some proper OpenCL applications.

The basic steps would be the following:

Make sure you have a machine which supports OpenCL, as described above.
Get the OpenCL headers and libraries included in the OpenCL SDK from your favourite vendor.
Start writing OpenCL code. That’s the difficult part.
Tell the compiler where the OpenCL headers are located.
Tell the linker where to find the OpenCL .lib files.
Build the fabulous application.
Run and prepare to be awed in amazement.

Ok, so let’s have a look at each of these.

OpenCL SDKs

For OpenCL headers and libraries the main options you can choose from are:

NVIDIA – CUDA Toolkit. You can grab the OpenCL samples here.
AMD – previously the AMD APP SDK, which also works with Intel’s CPUs. Its parts are now found separately:
- Headers and OpenCL.lib are here: https://github.com/GPUOpen-LibrariesAndSDKs/OCL-SDK/releases
- Samples are here: https://github.com/OpenCL/AMD_APP_samples
- Math libraries are here: https://github.com/clMathLibraries
Intel – the previous Intel SDK for OpenCL is now integrated into Intel’s new tools, such as Intel INDE (which has a free starters edition) or Intel Media Server Studio. Grab any of these in order to have everything ready for building OpenCL code.

As long as you pay attention to the OpenCL version and the OpenCL features supported by your device, you can use the OpenCL headers and libraries from any of these three vendors.

OpenCL headers

Let’s assume that we are developing a 64-bit C/C++ application using Visual Studio 2013. To begin with, we need to check how many OpenCL platforms are available in the system:

[raw]

#include <stdio.h>
#include <CL/cl.h>

int main(void)
{
    cl_int err;
    cl_uint numPlatforms;

    /* Ask only for the number of available platforms; no IDs are returned. */
    err = clGetPlatformIDs(0, NULL, &numPlatforms);
    if (CL_SUCCESS == err)
        printf("Detected OpenCL platforms: %u\n", numPlatforms); /* cl_uint is unsigned */
    else
        printf("Error calling clGetPlatformIDs. Error code: %d\n", err);

    return 0;
}

[/raw]

We need to specify where the OpenCL headers are located by adding the path to the OpenCL “CL” folder to Additional Include Directories. With NVIDIA’s CUDA Toolkit, “CL” is in the same location as the other CUDA include files, that is, CUDA_INC_PATH. On an x64 Windows 8.1 machine with CUDA 6.5 the environment variable CUDA_INC_PATH is defined as “C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\include”.

If you’re using the AMD SDK, you need to replace “$(CUDA_INC_PATH)” with “$(AMDAPPSDKROOT)/include” or, for the Intel SDK, with “$(INTELOCLSDKROOT)/include”.

OpenCL libraries

Similarly, we need to let the linker know about the OpenCL libraries. Firstly, add OpenCL.lib to the list of Additional Dependencies:

Secondly, specify the OpenCL.lib location in Additional Library Directories:

As in the case of the includes, if you’re using the AMD SDK, replace “$(CUDA_LIB_PATH)” with “$(AMDAPPSDKROOT)/lib/x86_64”, or in the case of Intel with “$(INTELOCLSDKROOT)/lib/x64”.

And you’re good to go! The application should now build and run. Now, just how difficult was it? Happy OpenCL-coding on Windows!

If you have any questions or suggestions, just leave a comment.

The magic of clGetKernelWorkGroupInfo

Posted by Vincent Hindriksen on 22 October 2015

It’s not easy to get the available private memory size – actually it’s impossible to get this information directly from the device/drivers, using the OpenCL API. This can only be explained after you dive deep into clGetKernelWorkGroupInfo – the function that tells you how well your kernel fits on the device. It is strange this function is not often discussed.

Memory sizes

CL_KERNEL_LOCAL_MEM_SIZE

Returns the amount of local memory, in bytes, being used by a kernel (per work-group). Use CL_DEVICE_LOCAL_MEM_SIZE to find out the maximum.

CL_KERNEL_PRIVATE_MEM_SIZE

Returns the minimum amount of private memory, in bytes, used by each work-item in the kernel.

Work sizes

CL_KERNEL_GLOBAL_WORK_SIZE

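This answers the question “What is the maximum value for the global_work_size argument that can be given to clEnqueueNDRangeKernel?”. The result is of type size_t[3]. Note that this query is only valid for a kernel on a custom device or for a built-in kernel; in other cases clGetKernelWorkGroupInfo returns CL_INVALID_VALUE.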

CL_KERNEL_WORK_GROUP_SIZE

This is the same for local_work_size: the maximum work-group size that can be used to execute the kernel on the device. The kernel’s resource requirements (register usage etc.) are used to determine what this work-group size should be.

CL_KERNEL_COMPILE_WORK_GROUP_SIZE

If __attribute__((reqd_work_group_size(X, Y, Z))) is used, then (X, Y, Z) is returned, else (0, 0, 0).

CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE

It returns a performance hint: if the work-group size is a multiple of this number, then you’ll get good results. So no more remembering 32 or 64 for specific GPUs, but simply kick in a call to this function.

Combined with clGetDeviceInfo’s CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS, you can fine-tune your work-group size in case you need the group size to be as large as possible.
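As a rough sketch of how the two work-group queries could be combined: the plain C helpers below take the values you would get from CL_KERNEL_WORK_GROUP_SIZE and CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE and pick a one-dimensional local size plus a matching global size. The names pick_local_size and round_up_global_size are hypothetical, and the rounding assumes OpenCL 1.x behaviour, where the global size must be evenly divisible by the local size.

```c
#include <stddef.h>

/* Largest work-group size that is a multiple of preferred_multiple and
 * does not exceed max_work_group_size. Falls back to max_work_group_size
 * when no such multiple fits. (Hypothetical helper.) */
size_t pick_local_size(size_t max_work_group_size, size_t preferred_multiple)
{
    if (preferred_multiple == 0 || preferred_multiple > max_work_group_size)
        return max_work_group_size;
    return (max_work_group_size / preferred_multiple) * preferred_multiple;
}

/* Rounds the number of work-items up to the next multiple of local_size,
 * so that the global size is evenly divisible by the work-group size. */
size_t round_up_global_size(size_t work_items, size_t local_size)
{
    size_t remainder;

    if (local_size == 0)
        return work_items;

    remainder = work_items % local_size;
    return remainder == 0 ? work_items : work_items + (local_size - remainder);
}
```

With a maximum of 200 and a preferred multiple of 64 this gives a local size of 192, and 1000 work-items are padded to 1152. The kernel then has to skip the padded work-items itself, for instance by comparing get_global_id(0) with the real element count.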

OpenCL Developer support by NVIDIA, AMD and Intel

Posted by Vincent Hindriksen on 14 April 2011 with 2 Comments

There was some guy at Microsoft who understood IT very well while being a businessman: “Developers, developers, developers, developers!”. You saw it again in the mobile market and now with OpenCL. Normally I watch his yearly speech to see which product they have brought to their own ecosphere, but the developers-speech is one to watch over and over because he is so right about this! (I don’t recommend the house-remixes, because those stick in your head for weeks.)

Since OpenCL needs to be optimised for each platform, it is important for the companies that developers start developing for their platform first. StreamComputer is developing a few different Eclipse-plugins for OpenCL-development, so we were curious what was already there. Why not share all findings with you? I will keep this article updated. Note that this article does not cover which features are supported by each SDK.

Continue reading “OpenCL Developer support by NVIDIA, AMD and Intel” →

Happy New Year!

Posted by Vincent Hindriksen on 1 January 2011

About a year ago this site was launched, and half a year ago StreamHPC was officially registered as a company with the Chamber of Commerce. It has been a year of hard work, and it all started after seeing the cover of a book about bore-outs. The result is a growing number of visitors from all over the world (from 62 countries since 23-Dec-2010) and new Twitter followers every week. Now some mixed news for 2011:

We are soon going to release a few plugins for Eclipse, both free and paid, to simplify your development.
2011 will be the year of hybrid processors (Intel SandyBridge and AMD Fusion), which will make OpenCL much more popular.
2011 is also going to be the year of the smart-phone (prognosis: in 2011 more smart-phones will be sold than PCs). So even more OpenCL-potential.
On 31-Dec-2010 we migrated the site to a faster server, to reduce waiting time online as well.
The book will be released in parts, to avoid more delays.
There will be around ten (short) articles published in January. Both developers and managers will be served.
Our goal is to expand. We have shown you our vision, but we want to show you more.

In a few words: 2011 is going to be exciting! We wish all our readers, business-partners, friends, family and (new) customers a super-accelerated 2011!

StreamHPC – we accelerate your computations

Support matrix of Compute SDKs

Posted by Vincent Hindriksen on 29 March 2011 with 1 Comment

Multi-Core Processors and the SDKs

The empty boxes show that IBM and ARM have a lot of influence. With NVIDIA’s current pace in introducing new products (hardware and CUDA), they could also take on ARM.

The matrix is restricted to the currently better-known compute technologies: OpenCL, CUDA, Intel ArBB, Pathscale ENZO, MS DirectCompute and AccelerEyes JacketLib.

X = All OSes, including Mac
D = Developer (private alpha or private beta)
P = Planned (e.g. as stated in Intel’s Q&A)
U = Unofficial (IBM’s OpenCL-SDK is promoted for their POWER-line)
L = Linux-only
W = Windows-only
? = Unknown if planned

Continue reading “Support matrix of Compute SDKs” →

Terms and Conditions

General Terms and Conditions for Trainings, StreamHPC v1.0

General Terms and Conditions for Consultancy, StreamHPC 2.0

OpenCL Potentials: Investment-industry

Posted by Vincent Hindriksen on 11 October 2011 with 4 Comments

This is the second in the series “OpenCL potentials”. I chose this industry because it is the finest example of a field where you are always late, even if you were first. So it always has to be faster if you want to make the better analyses. Before I started StreamHPC I worked for an investment company, and one of the things I did was reverse engineering a few megabytes of code with the primary purpose of updating the documentation. I then made a proof-of-concept to show that the data-processing could be accelerated by a factor of 250-300 using Java tricks only and no GPGPU. That was the moment I started to understand that real-time data-computation was certainly possible, and also that IO is the next bottleneck after computational power. Though I am more interested in other types of research, I do have this background and therefore try to give an overview of this sector and why it matters.

Continue reading →

Processors that can do 20+ GFLOPS per Watt (2012)

Posted by Vincent Hindriksen on 27 August 2012 with 44 Comments

Energy labels: a system for communicating the power-efficiency of new equipment, “A” being best and “F” being worst. 2011-A is incomparable with 2012-A.

For yearly power usage there is a rule of thumb which states that a device that is continuously on costs its power draw in Watt times 1.5 in Euro per year. So the computer in front of me, which takes around 107 Watt, would cost me about €160 a year if I left it on. A moderate cluster with several GPUs of a few hundred Watts each would cost a few thousand Euros a year. I would say: very doable for most companies.
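The rule of thumb is easy to put into a few lines of C. This is only a sketch of the arithmetic: the function names are made up for this example, and the electricity price is not a measured figure but simply what the factor 1.5 implies when a year is taken as 8760 hours.

```c
/* Rule of thumb from the text: an always-on device costs about
 * 1.5 Euro per year for every Watt it draws. */
#define EUR_PER_WATT_YEAR 1.5
#define HOURS_PER_YEAR (24.0 * 365.0) /* 8760 hours */

/* Yearly cost in Euro of a device that is on continuously. */
double yearly_cost_eur(double watts)
{
    return watts * EUR_PER_WATT_YEAR;
}

/* Electricity price (Euro per kWh) that the rule of thumb implies:
 * one Watt for a year is 8.76 kWh. */
double implied_price_eur_per_kwh(void)
{
    double kwh_per_watt_year = HOURS_PER_YEAR / 1000.0;
    return EUR_PER_WATT_YEAR / kwh_per_watt_year;
}
```

For the 107 Watt computer this gives €160.50 a year, and the implied price is roughly €0.17 per kWh. A hypothetical cluster of eight GPUs at 250 Watt each would come to €3000 a year, in line with the "few thousand Euros" mentioned above.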

So why does performance per Watt matter? There is more to a Watt than just the costs. The energy needed to cool a cluster is quite high, as most of the energy escapes as heat. And then there is the increase in demand for portable power. In case you are thinking of swiping your credit card for a top-10 supercomputer: then these energy costs are extremely high.

In this article I try to get an overview of who is entering the 20+ GFLOPS/Watt area. All processors that do less than 20 GFLOPS/Watt, need to have other advantages to survive. And you’ll see that all the green processors are programmed with OpenCL, the technology StreamHPC is all about.

IMPORTANT: The total power used is sometimes including and sometimes excluding memory-transfers. So the comparison below IS NOT FAIR. The graphics cards are including memory-transfers, while the CPUs and SoCs are not.

Continue reading “Processors that can do 20+ GFLOPS per Watt (2012)” →

Training

Public trainings in Amsterdam

As Amsterdam is easy to reach from anywhere in Europe, we’re giving most of our public trainings in our offices. See the list below for what is upcoming:

In-company trainings globally

We at StreamHPC train IT-experts in OpenCL, CUDA and GPU directives world-wide. All trainings can be given in English or Dutch; on request printed materials can be translated into your local language.

We offer:

crash courses in GPUs, FPGAs and directives,
in-depth trainings, and
in-house trainings.

What customers said

“Normally you just get told to type in specific commands in some order, which you can find in the text books too. StreamHPC focused on teaching the backgrounds to get a mindset for GPU-programming and to better understand the hardware. After that I understood the SDK-examples much better.”

All OpenCL SDKs now in our Knowledge Base

Posted by Vincent Hindriksen on 31 October 2012 with 3 Comments

For those who haven’t seen the latest addition to our knowledge base: we have added a list of (almost) all available OpenCL-SDKs. You can find it in the menu under “Knowledge Base” -> “SDKs…”.

This list shows how important OpenCL is getting, as developers now can write compute-intensive parallel software on CPUs, GPUs, ARM-based accelerators and even FPGAs. This growth of OpenCL-devices is very exciting and important news, and that’s why it has got its own section on the site.

The current list is (in random order):

AMD GPUs & CPUs
ZiiLabs ARM Tablet
Altera FPGA board – available in Q2/Q3 2013
Adapteva Parallella board – available in Q2/Q3 2013
Intel CPUs
Samsung Exynos 5 board – available in December 2012
IBM POWER-processor

Currently looking into:

Intel Xeon Phi
Nintendo Wii U dev
Sony Playstation 4 Orbis
Vivante
Xilinx
NVidia GPUs
Qualcomm

The SDK of NVIDIA is on the second list, which you maybe did not expect. We have to wait until they have published their official statement on what they are going to do with CUDA and OpenCL.

While you are there, also check the other parts of the Knowledge Base:

What is… -> Explanations of terminology. Put your requests in a comment.
Event&Talks -> A list of events which StreamHPC attends, gives talks at and helps organise. Interesting for both managers and engineers.
Self Study -> The part of the site most visited after the blog. This is for the engineers who want to start learning to program GPUs.

This section will be updated and extended continuously with information not available anywhere else.

StreamHPC has been in the OpenCL business since 2010 as one of the few. We have been the most visible and known OpenCL-specialist ever since.

Image Processing

At StreamHPC, there is broad experience in the parallel, high-performance implementation of image filters. We have significantly improved the performance of various image processing software. For example, we have supported Pixelmator in achieving outstanding processing speeds on large image data, and users frequently praise the software’s speed in comparisons with competing software products.

StreamHPC is currently hosting an educational initiative that supports interested individuals in their efforts of porting algorithms from the open-source GEGL image processing framework to fast parallel versions based on OpenCL. GEGL is used by the popular image manipulation software Gimp as well as other free software. For more information on this project, look at our website OpenCL.org, which we dedicate to spreading knowledge on OpenCL.

Online Tutorials are here

Posted by Vincent Hindriksen on 16 September 2016

We’re going online with our presentations and tutorials. This makes it easy to reach more people and make our trainings more flexible.

We’re starting with short introductory trainings, but we have bigger plans. Keep an eye on our events (shared on Twitter, LinkedIn, this blog and the newsletter) to see what the offerings are. And you’re very welcome to join!

On 4 October (new date) there will be a free OpenCL 101 of two hours. The target time zones are eastern America and Europe.

Agenda Online OpenCL 101

Introductions (20 minutes)
- StreamHPC
- GPUs and parallelism
- OpenCL
By example: Getting started with OpenCL (30 minutes)
By example: Porting a simple program to OpenCL (30 minutes)
Q&A in parallel (30 minutes). Ask us any question, for instance:
- General OpenCL.
- OpenCL on GPUs.
- OpenCL on FPGAs.
- What algorithms work well with GPUs, CPUs and FPGAs.
- StreamHPC services.
The next steps (5 minutes).
Closing words (5 minutes).

Tutorial server

You can already test if the tutorial server works for you by looking around in our demo room. The tutorial itself will be in another room. Use your own name and password “ap“.

See you soon!

Gedit OpenCL Syntax Highlighting

Posted by Vincent Hindriksen on 1 February 2011 with 1 Comment

Update 17-06-2011: updated version of opencl.lang and added opencl_host.lang.

When learning a language it is nice to do it the hard way, so you take the default txt-file editor provided with your OS. No colours, no help, no nothing, pure hard-core learning. But in the Linux desktop Gnome the default editor Gedit is quite powerful without doing too much, has an official Windows-port and has an OSX Darwin-port. It took just a few hours to understand how highlighting in Gedit works and to get it implemented. I got some nice help from the work done on the cuda-highlighter by Hüseyin Temucin (for showing how to extend the c-highlighter the best way) and the VIM OpenCL-highlighter by Terence Ou (for all the reserved words). This is work in progress; I will tell about updates via Twitter.

Get it

Windows-users first need to download Gedit for Windows. OSX-folks can check Darwin-ports. Then the files opencl.lang (for .cl-files) and opencl_host.lang (an extension of C to highlight OpenCL-keywords) need to be put in /usr/share/gtksourceview-2.0/language-specs/ (or in ~/.local/share/gtksourceview-2.0/language-specs/ for local usage only), for Windows in C:\Program Files\gedit\share\gtksourceview-2.0\language-specs, or for OSX in /Applications/gedit.app/Contents/Resources/share/gtksourceview-2.0/language-specs/. Make sure all Gedit-windows are closed so the configuration will be re-read, and then open a .cl-file with Gedit. If you have opened cl-files as C or Cuda before, you have to set the highlighting to OpenCL manually (under view -> highlighting). For host-code you always need to set the highlighting manually to “OpenCL host”. You might want to associate cl-files with Gedit.

Alternatives

VIM: http://www.vim.org/scripts/script.php?script_id=3157

Notepad++: http://sourceforge.net/tracker/?func=detail&aid=2957794&group_id=95717&atid=612384

SciTE: http://forums.nvidia.com/index.php?showtopic=106156

StreamHPC is working on Eclipse-support, and I understand work is also being done on Netbeans-support. Let me know if there are more alternatives.

Ask your question

Do you have a question? We are happy to answer all your questions on any subject discussed at this website.

Due to spam floods, we removed the form.

info@streamhpc.com

We try to answer your question within 24 hours.

Bits&Chips promotion for OpenCL training

You arrived at this page via the Bits&Chips newsletter or on the recommendation of a friend or colleague.

Your goal is to speed up a heavy computation or image-processing task. We teach you how by handing you a few concepts, so that you design and program your software in a different way, with much faster software as the result. We do this using OpenCL, a programming language for parallel processors.

The training

In 3½ days you will learn:

Designing a parallel software architecture,
Using the graphics card as a co-processor,
The OpenCL programming language,
Implementing parallel algorithms.

Half of the time is explanation by the trainer, the other half is hands-on practice. On the 4th day we discuss a problem together, to bring everything together.

After the training you can (without special tools) recognise slow code, redesign it and port it to the graphics card. You take home the textbook and some basic software, so you can easily keep practising.

The dates

The following 3 trainings already have several registrations. Each has a special theme and target audience.

17-20 March, Amsterdam. Theme: image processing.
14-17 April, Amsterdam. Theme: mathematics/algebra.
14-17 July, Amsterdam. Theme: signal processing.

You can attend any of the trainings if you want to learn OpenCL. The books used differ per training.

The small print

Some important aspects of the training:

The working language is English, because of trainees from across Europe.
A good basic knowledge of C is required. You can have a special home-study course sent to you free of charge.
You should schedule time to practise after the training, so that you do not quickly forget everything again.
You need a laptop.
Programming is mostly done on a Linux server. The basic knowledge needed for this is explained on the spot.

Registration

The normal cost of the training is €1850 per person. If you mention Bits&Chips, you get a €100 discount.

All you need to do now is send an email to trainings@streamhpc.com with your preferred date and/or topics. We will then send you a questionnaire as preparation for the training.

For questions you can call +31854865760 (office) or +31645400456 (mobile).

What more does Khronos have to offer than OpenCL and OpenGL?

Posted by Vincent Hindriksen on 24 November 2014

The OpenCL standard is from the not-for-profit industry consortium Khronos Group. But they do a lot more, like the famous standard OpenGL for graphics. Focus of the group has always been on multimedia and getting the fastest results out of the hardware.

Now that open source and open standards are getting more important, collaborations like the Khronos Group get more attention. At StreamHPC we are very happy with this trend, as the business models are more focused on collaborations and getting things done than on making sure the customer cannot ever leave.

Below is an overview of the most important APIs that Khronos has to offer.

OpenCL related

OpenCL: compute
WebCL: web compute
SPIR/SPIR-V: intermediate language for compute-kernels, like those of OpenCL and OpenGL’s GLSL
SYCL: high-level language for OpenCL

OpenGL related

Vulkan: state-less graphics
OpenGL: graphics
OpenGL ES: embedded graphics
WebGL: web graphics
glTF: runtime asset format for WebGL, OpenGL ES, and OpenGL
OpenGL SC: Graphics for Safety Critical operations
EGL: interface between rendering APIs such as OpenGL ES and the underlying native platform window system, such as X.

Streaming input and output

OpenMAX: interface for multimedia codecs, platforms and hardware
StreamInput: interface for sensors
OpenVX: OpenCV-alternative, built for performance.
OpenKCam: interface for cameras and sensors

Others

COLLADA: 3D model file format
OpenSL ES: embedded audio
OpenVG: vector graphics

One video called “OpenRoad” to show them all:

http://www.youtube.com/watch?v=ckD0op6OgMQ

Want to learn more? Feel free to ask in the comments, or check out https://www.khronos.org/

Apple’s dragging OpenCL compiler problem

Posted by Vincent Hindriksen on 9 May 2015 with 1 Comment

Remember the times when the OpenCL compilers were not as good as they are now? Correct source-code being rejected, typos being accepted, long compile times, crashes during compiling and other irritating bugs. These made the work of an OpenCL developer in “the old days” quite tiresome: you needed a lot of persistence and had to report many bugs. Luckily, on desktops the drivers have improved a lot.

Apple’s buggy OpenCL compiler

Now to Apple. There have always been complaints about the irritating bugs that were in Apple’s compiler. Recently the Luxrender community started to make more complaints, as the guy responsible for the OSX port decided to quit. This was due to utter frustration: code that worked on every other OS, simply did not work on OSX. Luxrender’s Paolo Ciccone stood up and made this extremely public, by writing an open letter to Apple’s CEO Tim Cook (posted below).

The letter is not specific about the kind of bugs, and I therefore asked him via Twitter which bugs he was talking about. He explained to me that it’s very simple:

https://twitter.com/RealityPaolo/status/595972568961519616

Here at StreamHPC we could write around those bugs in most cases, but Luxrender has bigger and more complex kernels than we used in our projects. Then it’s simply impossible to write around them, as the compiler simply crashes. It seems that OSX still has the kind of old compilers that Linux and Windows had years ago.

Metal

Metal is the OpenCL-alternative on iOS 8 and up.

If you’re thinking that Metal could be a reason: that language looks very much like OpenCL, as it’s simply OpenCL as Apple would like it to be. Porting between the two languages is therefore quite simple. This also means that with some small fixes a Metal-kernel could be compiled by the existing OpenCL-compiler. Ok, there is much more to Metal than the compute part, but the message is that more complex Metal wouldn’t be possible using this driver-stack.

If we end up in a situation where Metal comes to OSX and is more stable than OpenCL, only then can we say that Apple tries to block OpenCL in favour of their own APIs.

The letter

I’m really happy that Paolo Ciccone had the guts to publicly complain. This is the letter he wrote:

Dear Mr. Cook.

I’m sorry to bother you but we have tried all other channels and nothing worked.

I’m part of a group of developers of a physically-based renderer called LuxRender. LuxRender has been written to use OpenCL to accelerate its enormous amount of computation necessary to generate photo-realistic scenes. You can see some of the images generated by Lux at http://luxrender.net. Lux is an Open Source program.

Apple has defined OpenCL and we have adopted this API instead of the proprietary CUDA in order to be able to work with all kind of hardware on all major platforms. It made sense for an OSS to use an open standard.

The reason why I’m writing to you is that, after waiting for years, we still have broken GPU drivers on OS X. Scenes that render perfectly well on Windows and even on Linux simply abort on OS X. This is happening with both AMD and nVidia GPUs.

The problem is unsolvable from our side. We need updated, fixed drivers for OS X. The problem is so bad that our main OS X developer has announced, today, that he is giving up OS X. He simply can’t do his job.

I kindly request that you look into this and give us working AMD and nVidia drivers in an upcoming, possibly soon, update of OS X. We are more than willing to work with your engineers, if you need any kind of specific help in identifying the problem.

Thank you for your attention.

Paolo Ciccone

If you want to help, also post this letter on your blog or in a forum. The more this is shared, the better. Apple’s forum is an especially good place, asking there for the official statement.

OpenCL.org internship/externship

Posted by Vincent Hindriksen on 15 July 2016

Want to help build an important website? OpenCL.org’s components have been designed and partly built, but still a lot of work needs to be done. We’re seeking an intern (or “extern” when not in Amsterdam) who can help us build the site. This internship is not about GPUs!

To complete the tasks, the following is required:

Technical expertise:
- HTML5, CSS
- PHP
- Javascript
- jQuery
- Node.js
- Mediawiki
- XSLT
Can-do mentality
Able to plan own work
Good communication-skills
Available for 3 to 6 months

We don’t expect you to know all tools, so we will guide you in learning new tools and techniques. Write us an “email of interest” at info@streamhpc.com, and tell us what you can do and what your objectives for an internship would be.

We’re looking forward to seeing your letter!

AMD is back!

Posted by Vincent Hindriksen on 16 June 2016 with 4 Comments

For years we have been complaining on this blog about what AMD was lacking and what needed to be improved. And as you might have concluded from the title of this blogpost, there has been a lot of progress.

AMD is back! It will all come together in the beginning of 2017, but you’ll see a lot of progress already in the coming weeks and months.

AMD quietly recognised and solved various totally new problems in HPC, becoming the hidden innovator everybody needed.

This blog post gives an overview of how AMD managed to come back and what it took to get there. Their market cap supports it, as you can see.

AMD’s market cap is back at 2012 levels.

Continue reading “AMD is back!” →

Slow Software Hotline

Posted by Vincent Hindriksen on 15 June 2017

In a perfect world all software is fast, giving us time to do actual work. Unfortunately we live in an imperfect world, and we have to spend extra time controlling our anger as the software keeps us waiting.

Therefore we have opened the Slow Software Hotline – reachable via both phone and email. It has one goal: make you feel happy again.

Reporting is easy. Just name the commercial software that needs to be sped up and why. We’ll do the rest. If you need help with initial anger management due to the slow (and/or buggy) software, we’re happy to help with breathing exercises.

Phone: +31 854865760

Email: hotline@streamhpc.com

We will not sit still until all software is fast. Speeding up all software out there. One at a time.

Start your GPU-career here

Posted by Vincent Hindriksen on 25 September 2018

GPUs have been our mysterious friends and known enemies for years, as they let us run code in expected and unexpected ways. GPUs have solved problems for many of our customers. GPUs evolve at such a high rate that they’ll remain important for the years to come.

Problem is that programming GPUs is not an easy task. Where do you learn to program GPUs? We found these to be the main groups:

Universities
Research centers
GPU vendors (AMD, Nvidia, Intel, Qualcomm, ARM)
Self-study

This is far from enough. Add to that, that only a very select group learns the craft at a company. We’d like to change that, and we think now is the time for us to be able to deliver on this.

In January our internal training program will start with 4 to 8 developers. The focus is on fully understanding recent GPU-architectures, CUDA and OpenCL. It will consist of lectures, workshops, discussions, paper reading and of course coding for one month. The months after that will have guidance, paper presentations, code reviews and time for self-study. The exact form will differ per person.

The hard side

The current measurable requirements are:

EU citizen or already having a work permit
Great at C/C++
High interest in algorithmic optimisations
Any performance improvement focus (e.g. Assembly, clean code) is a plus
Any GPU experience (e.g. OpenGL, DirectX, self-study) is a plus
High interest in performance
Willing to move to Amsterdam
Willing to work for Stream HPC for at least 2 years

The soft side

We’re looking for people who fit our culture and who we think we can train. This means that the selection is based for a large part on “the spark”. Therefore the application starts with a speed date, and we’re sorry for not finding a better wording for this. This is a 20 minute discussion about what we like and what we don’t. This can be done via phone, Skype or in person, during the evening, in the weekends or during your lunch break.

How to apply

Read about our company culture. Look at the jobs we have open. These describe the requirements after the training. Then write us a motivational letter: explain to us why this is exactly what you want, why you’re capable and why you’re a cultural fit. If you find it hard to write such a letter, then just start with answering the list of requirements. It’s a big bonus to share code (Github, Gitlab, zip-file). Send your email to jobs@streamhpc.com

Other jobs

Feeling more senior? We have other jobs: