Interest in OpenCL

I have had this blog for more than a year, and I want to show where its visitors come from. Why? So you know where OpenCL is popular and where it is not. I chose an undisclosed period, so you cannot really reverse-engineer how many visitors I have – but the nice thing is that not much changes between a few days and a month. Unfortunately Google Analytics is not really great for maps (Greenland appears as big as Africa, it is hard to compare US states to EU countries, cities disappear at world view, etc.), so I needed to do some quick image editing to make it somewhat clearer.

At world view you see that most interest comes from three regions: Europe, North America and South-East Asia. Africa is the real absent continent here: except for some Arab countries and South Africa, there are only sporadic visits from the other countries. What surprises me is that the Arab countries are among my frequent visitors – this could be a language issue, but I expected about the same number of visitors as from e.g. China. In Latin America the interest comes mostly from Brazil.


For Developers

Self-study material

We could keep everything to ourselves, but we like to share resources. It will take some time to learn it all, but you can always take our course for more experienced programmers.


Please let us know if something is missing to complete the lists of books and tutorials.


OpenCL feedback and bugs

The developers who started in 2009/2010 certainly know how buggy the first drivers were. As OpenCL is a large project and is not in the hands of one hardware manufacturer, it can be difficult to get driver errors reported to the right party. Luckily Khronos provides two ways to give feedback: the OpenCL forums and the “Khronos Public Bugzilla”.

I have a request for the next version

The OpenCL forums are the right place for you. You can also discuss possible bugs here, if you are not sure and want others to test your code.

I found a bug!

Go to the Khronos Public Bugzilla and log in (using the email from your Khronos account, or make a new account). If you found a bug in a driver, file it as shown below, under “conformance tests”.

[Image: opencl-bugreport]

It is best to mention your bug report on the forums and on Twitter, so others can take a look at it. If nobody seems to react to it, send us a message and we’ll put some pressure where needed.

AMD OpenCL coding competition

The AMD OpenCL coding competition seems to be for 64-bit Windows 7 only. So if you are on another version of Windows, on OS X or (like me) on Linux, you are left behind. Of course StreamHPC supports software that just works anywhere (seriously, how hard is that nowadays?), so here are the instructions on how to enter the competition when you work with Eclipse CDT. I don’t really get why it only works with 64-bit Windows (but I understood it was a hint).

I focused on Linux, so it might not work with Windows XP or OS X right away. With a little hacking, I’m sure you can change the instructions to work with e.g. Xcode or any other IDE that can import C++ projects with makefiles. Let me know if it works for you and what you changed.


USB-stick sized ARM-computers

Now that smartphones are getting more powerful and the internet makes it possible to have all functionality and documents with you anywhere, the computer needs to be reinvented. You see all the big IT companies searching for what that could look like, from Windows Metro to complete docking stations that replace the desktop with your phone. A turbulent market.

One of the new product types is the USB-stick-sized computer. Stick it into a TV or monitor, enter your code and you have your personal working environment. You never need to carry a laptop to your hotel room or conference, as long as a screen is available – any screen.

There are several USB-computers entering the market, but I want to introduce you to two. Both companies see a future in a strong processor in a portable device, and neither has a real product with such a strong processor yet. But you can expect that in 2013 you can have a device that can do very fast parallel processing to have a smooth Photoshop experience… on your key-ring.


Location/Address

StreamHPC

Verwulft 9B
2011GJ Haarlem
Netherlands, Europe

phone: +31 6 45400456

 

Taking on OpenCL

Quote by Dr. Kelso (from the series “Scrubs”)

OpenCL is getting more and more important, and for more and more developers it is a skill worth having. At StreamHPC we saw this coming in 2010 and have been training people in OpenCL since. A few weeks ago I got a question that could be interesting for more people: how to take on OpenCL. In other words: which steps to take to learn OpenCL the quickest. Since it is almost two years ago that I last wrote about learning OpenCL, it is a good time to share more recent insights on this matter.

Taking on OpenCL takes four main steps in this order:
  1. Understanding the hardware and architectures.
  2. Thinking both in parallel and in vectors.
  3. Learning the OpenCL language itself.
  4. Profiling and debugging.

As you can see, that is quite different from learning, for instance, Java with a Pascal background. Learning VHDL for programming FPGAs comes closer, though you don’t need to tinker with timings when doing OpenCL. Let’s go through the steps.


The CPU is dead. Long live the CPU!

Scene from Gladiator in which the end of somebody’s life is decided.

Look at the computers and laptops sold at your local computer shop. There are just a few systems with a separate GPU, either as a PCI device or integrated on the motherboard. The graphics are handled by the CPU now. The Central Processing Unit as we knew it is dying.

To be clear, I will refer to the old CPU as “GPU-less CPU”, and to the new CPU (with GPU included) as plain “CPU” or “hybrid processor”. There are many names for the new CPU, each with its own history, which I will discuss in this article.

The focus is on X86. The follow-up article is on whether the king X86 will be replaced by king ARM.

Note that all of this is based on my own observations; please comment if you have additional information.


Careers

We have jobs for people who get bored easily.

In return, we offer a solution to boredom, since performance engineering is hard.

As the market demand for affordable high performance software grows, StreamHPC continuously looks for great and humble people to join our team(s). With upcoming products, new markets like OpenCL on low-power ARM processors and compute-intensive applications like AI and VR, we expect continuous growth.

We currently have two types of jobs:

  • Software Performance Engineers. The heroes who make software go vroom vroom
  • Growth and Support. The heroes who make the people and the company go vroom vroom

For the second group, you will find the most changes in the job-ads, as we seek people who want to join us for the years to come.

Find our jobs and Apply

To apply, go to our recruitment website. You can read our hiring process here.

Want to know more about getting a job at StreamHPC or in high-tech software in general? We have collected lots of information, as we want you to succeed and not be blocked by the process. You can find our writings in the menu under “Join us! Jobs” and on our blog under “Careers & Advice”.

What we do

StreamHPC stands for faster software. We do this by designing and building parallel software, and by training.

Building your performing and extremely fast software

We are all about accelerating (existing) software. The main focus is on:

  • Extending functional requirements to cover implementations of parallel software,
  • Doing code reviews,
  • Implementing algorithms in a parallel form,
  • Porting software, and
  • Reverse engineering and documenting existing software.

Implementing software in parallel makes your product more scalable.

Whenever there is a software performance problem, you can call us.

Training you in performance programming

Parallel software is quite different from serial software, but we can teach you all about it.

Basically, there are three levels:

  • Parallel programming in languages like C, C++, Java and .NET,
  • Cache and memory optimisation, and
  • GPU-programming.

Our trainings have a balance between theoretical background, practical use-cases and lab-sessions.

Performance programming is all about caches, memory buses and parallelism: subjects hardly mentioned in most programming books.

What is Performance Engineering?

Software Performance Engineering is increasing the throughput and speed of software by making better use of the hardware possibilities. It uses faster algorithms and applies less data-intensive programming concepts.

With new software, performance-requirements can be specified beforehand. This can be supported by specifying benchmarks in the test-cases.

Performance engineering after the software has gone into production happens more often, as requirements outgrow the originally defined ones. It consists of the following phases:

[list2]

  1. Reverse engineering the code and comparing it with the original requirement documents.
  2. Measuring the code performance to find bottle-necks.
  3. Redesigning the code such that it supports current requirements.
  4. Implementing optimizations.

[/list2]
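Phase 2 can start very simply. Below is a minimal sketch in Python, assuming the program can be split into a few named stages. The stage functions and the helpers `time_stage` and `find_bottleneck` are hypothetical and only there for illustration; on a real project a proper profiler is usually the better tool.

```python
import time


def time_stage(func, *args):
    """Run func(*args) once and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func(*args)
    elapsed = time.perf_counter() - start
    return result, elapsed


def find_bottleneck(timings):
    """Return the name of the stage that took the most time."""
    return max(timings, key=timings.get)


# Hypothetical pipeline stages, standing in for real application code.
def load(n):
    return list(range(n))


def process(data):
    return [x * x for x in data]


def summarise(data):
    return sum(data)


timings = {}
data, timings["load"] = time_stage(load, 100_000)
squares, timings["process"] = time_stage(process, data)
total, timings["summarise"] = time_stage(summarise, squares)

for name, seconds in timings.items():
    print(f"{name:10s} {seconds * 1000:8.3f} ms")
print("slowest stage:", find_bottleneck(timings))
```

Measuring each stage separately before changing anything keeps phases 3 and 4 focused on the part of the code that actually costs the time.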

Most of the time, performance engineering is needed for software less than 3 years old, or when the code was not designed or managed well. Reverse-engineering the code could reduce the time needed for rebuilding the software.

StreamHPC is famous for porting software to accelerators like GPUs. But performance engineering is much more than that, as accelerators can only be used to increase performance when the original code meets minimum quality standards.

LEAP-conference call for papers

Building bridges in a new industry

Embedded processors have always focused on low energy use. Now a combination of Moore’s law, the frequency wall and multi-processor developments has made it possible for these processors to compete in completely new market segments, most notably due to impressive advancements in graphics IP.
We are now looking at four groups who are interested in learning from each other:

  • The embedded processor market
  • The FPGA market
  • The HPC and server market
  • The GPGPU market

And we want to answer the question: how can we get more out of low-energy processors by looking at other industries?

The goal of the LEAP conference is to bring these four groups together, creating windows to each other and paving roads over the newly constructed bridges. This makes it one of a kind. Half of the conference is focused on quality information sharing and the other half on networking. For more information, check the website of the LEAP-conference. StreamHPC is a co-organiser.

Update: the call for papers has closed, as the programme is filled!


12-14 June: OpenCL Training Amsterdam

From 12 to 14 June StreamHPC will give a 3-day course in OpenCL (was 3 to 5 June). Here you will learn how to develop OpenCL-programs.

A separate ticket for only the first day can be bought, as that day is a crash course in OpenCL (module: basics).

The second and third day will be all about parallel-algorithm design, optimisation and error-handling (module: optimisation, with several new subjects added).

The last part of the third day is reserved for special subjects, as requested by the attendees.

ERSA-NVIDIA award for “Best Young Entrepreneur”

StreamHPC supports the ERSA conference, 22-25 July in Las Vegas. At that conference an award will be given to the “Best Young Entrepreneur” and I’d like you to send in a proposal. The winner gets an NVIDIA Tesla K20!

Young entrepreneurs and academics with a great product/project are invited to present their solution. As the event draws around 2000 people, you get the attention needed to show-case your new company or research-group. Your solution does not need to be based on FPGAs or GPUs, as long as Von Neumann’s architecture is not in it.

Read the information below or directly go to the ERSA-NVIDIA awards-homepage.

“Von Neumann’s architecture lasted for 75 years. That genius can no longer lead us into the new age of computing that is upon us. This competition seeks to acknowledge those pioneers that are helping to build the new computing landscape.”

Submission of Proposals for ERSA-NVIDIA award Candidates
Deadline: 31 May 2013 (extended from 6 May 2013)
Send proposals to org@ersaconf.org

The award is intended for entrepreneurs developing tools, advanced technologies and opportunities for supporting applications, both academic and commercial, across the broad area of high-performance embedded systems implemented as multicore systems and reconfigurable heterogeneous parallel processing systems.

The Award Committee includes:

Leading Universities

  • Stanford University, USA, Prof. Michael Flynn
  • Imperial College London, UK, Prof. Wayne Luk
  • Karlsruhe Institute of Technology, Germany, Prof. Joerg Henkel
  • Keio University, Japan, Prof. Hideharu Amano
  • Shanghai Jiao Tong University, China, Prof. Simon See

Leading Companies (tentative list):

  • NVIDIA, Can Ozdoruk, Product Manager
  • Altera, Steve Casselman, Principal Engineer
  • National Instruments, Hugo Andrade, Principal Architect

For more info go to: http://ersaconf.org/awards/

If you have any questions, just ask them in the comments or send us an email.

Applied GPGPU-days Amsterdam 2013

December 2013: Videos are not ready yet, but a link will be put here.

Amsterdam, 20 June – Applied GPGPU-days in Amsterdam. Keep your agenda free for this event.

What can you do with GPUs to speed up computations? This year we can see various examples where OpenCL and CUDA have been used. We hope to give you an answer to the question of whether you can use GPUs for your software, research or algorithm.

After the success of last year (fully booked with 66 attendees), we have now reserved a larger location with room for 100 people. The difference from last year is that we focus more on applications and less on technical aspects.

The program has been made public recently:

Title of talk | Company/Institute | Presenter
Introduction to GPGPU and GPU-architectures | StreamHPC | Vincent Hindriksen
Blender Cycles & Tiles: Enhancing user experience | AtMind bv | Monique Dewanchand & Jeroen Bakker
XeonPhi vs K20: The fight of the titans | SURFsara | Evghenii Gaburov
A real-time simulation technique for ship-ship and ship-port interaction | PMH bv | Jo Pinkster
CUDA Accelerated Neural Networks | LIACS | Ana Balevic
Efficient Reconstruction of Biological Networks via Transitive Reduction on GPUs | TU Eindhoven | Anton Wijs
Running Petsc on GPUs with an example from fluid dynamics | SURFsara | Thomas Geenen
Connected Component Labelling, an embarrassingly sequential algorithm | Leeuwarden University | Jaap van de Loosdrecht
Visualizing sound and vibrations using a GPU and a 1024-channel microphone array | TU Eindhoven | Wouter Ouwens
Gravitational N-body simulations on 1 to many GPUs | Leiden observatory | Jeroen Bédorf

A few demos will be shown.

For more information, and to find other events by the platform, see the Platform Parallel webpage.

Tickets are €75,-. If you are from a Dutch university or research institute affiliated with SURF, your ticket has been fully sponsored by SURFsara.

Associated events in the Netherlands

For the technical aspects (GPU-programming techniques, optimisation, etc) we have a special day: the GPU Dev Day 2013. More information on the Platform Parallel webpage. Date and place will be made public in June.

The first Khronos Meetup Benelux will take place just before the Applied GPGPU day, on 19 June in Amsterdam. More information on the meetup-page.

A list of Desktop GPU architectures

UPDATED in February 2017

Some optimisation tricks work really well on one architecture, and are useless on others. And even with better drivers, the older architectures need some help. In other words, it helps to know what architecture the GPU has. Therefore you get some help from your friends at StreamHPC.

Below you’ll find a list of the architecture names of all OpenCL-capable GPU models from Intel, NVIDIA and AMD. It does not contain the professional lines for now – first we are focusing on getting the general models right.

Please understand that it took a lot of time to gather the information below, and that normally we share such information only with our clients.


ARM forums to find useful information for OpenCL development

OpenCL on ARM is hot, but it is just getting started. Currently it takes some time to find the needed information about the processors.

For OpenCL discussions the best place is the Khronos OpenCL board. So where can you go when you want to ask questions specifically on ARM-based GPUs like MALI, PowerVR, Adreno and Vivante?

ARM’s new community site for all

ARM just launched the Connected Community (ARM CC). It is the place to connect to, when you have general information-needs of ARM-IP, such as ARM MALI, Cortex A9 and Cortex A15.


And here is how ARM itself explains this initiative on one slide:

[Slide: ARMConnectedCommunityIntro]

Be sure to connect to StreamHPC. We hope this will indeed be the central place for the whole ecosystem, including Imagination, Qualcomm and Vivante.

ARM MALI


The MALI Developer Center has its forums on ARM Connected Community.

Imagination PowerVR

The graphics-section of their developer forums seems to be the best place.


(Not @ ARM CC)

Qualcomm Adreno

Qualcomm has dev-forums too and has a section called Mobile Gaming & Graphics Optimization (Adreno™).


(Not @ ARM CC)

Vivante

Vivante does not have a forum, but Freescale does. The i.MX forums seem to be the best place to ask your questions.


@ARM CC

Others

Where do you find a good source of interesting information on mobile GPUs? Share it with the others via the comments – the chances that your questions get answered increase when more people visit the forums.

The Exascale rat-race vs getting-things-done with HPC

IDC Forecasts 7 Percent Annual Growth for Global HPC Market – HPCwire

When the new supercomputer “Cartesius” of the Netherlands was presented to the public a few months ago, the buzz was not around FLOPS, but around users. SARA CEO Dr. Ir. Anwar Osseyran kept focusing on this aspect. The design of the machine was not pushed by getting into the TOP500, but by improving the performance of the current users’ software. This was applauded by various HPC experts, including StreamHPC. We need to get things done, not to win a virtual race of some number.

In the description about the supercomputer, the top500-position was only mentioned at the bottom of the page:

Cartesius entered the Top500 in November 2013 at position 184. This Top500 entry only involves the thin nodes resulting in a Linpack performance (Rmax) of 222.7 Tflop/s. Please note that Cartesius is designed to be a well balanced system instead of being a Top500 killer. This is to ensure maximum usability for the Dutch scientific community.

What would happen if you go for a TOP500 supercomputer? You might get a high energy bill and an overpriced, inefficient supercomputer. In the first months you will not have full usage of the machine, and you won’t be able to easily turn off some parts, hence the waste of electricity. In the end, this means that it is better to run unoptimized code on the cluster than to take time for optimizing it.

The inefficiency is due to the fact that some software is data-transfer limited and other software is compute-limited. No need to explain that if you go for the Top 500 and not for a design optimized for the software, you end up buying extra hardware to get all kinds of algorithms performing. Cartesius therefore has “fat nodes” and “thin nodes” to get the best bang per buck.
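To make the difference between the two kinds of software concrete, here is a rough sketch in Python. It compares the ideal compute time with the ideal data-transfer time of a kernel. The node figures (1 Tflop/s and 100 GB/s) and the function `limiting_factor` are made-up assumptions for illustration, not Cartesius specifications, and the estimate ignores caches and overlap of computation and transfer.

```python
def limiting_factor(flops, bytes_moved, peak_flops, peak_bandwidth):
    """Classify a kernel by comparing ideal compute and ideal transfer time."""
    compute_time = flops / peak_flops
    transfer_time = bytes_moved / peak_bandwidth
    if compute_time > transfer_time:
        return "compute-limited"
    return "data-transfer-limited"


# Hypothetical node: 1 Tflop/s peak compute, 100 GB/s memory bandwidth.
PEAK_FLOPS = 1e12
PEAK_BANDWIDTH = 1e11

n = 4096
# Vector addition of n floats: n additions, 3 arrays of 4-byte values moved.
vector_add = limiting_factor(n, 3 * 4 * n, PEAK_FLOPS, PEAK_BANDWIDTH)
# Dense n x n matrix multiplication: about 2*n^3 operations,
# assuming each of the 3 matrices is moved only once.
matmul = limiting_factor(2 * n**3, 3 * 4 * n * n, PEAK_FLOPS, PEAK_BANDWIDTH)

print("vector add:", vector_add)
print("matrix multiply:", matmul)
```

Under these assumptions the vector addition is held back by data transfer and the matrix multiplication by compute, so the two would benefit from differently balanced nodes.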

There is also a plan for expanding the machine over the years (on-demand growth), such that the users will remain happy instead of having an adrenaline-shot at once.

The rat-race

The HPC Top 500 is run by the company behind the ISC events. They care about their list being used, not about whether Exascale arrives now or later. There are two companies with a particular interest in Exascale: Intel and IBM. It hardly matters anymore how it began. What is interesting is that Intel has bought InfiniBand technology and is collecting companies that could make them the one-stop shop for an HPC cluster. IBM has always been strong in supercomputers with their BlueGene HPC-line. Intel has a very nice infographic on Intel+Exascale, which shows how serious they are.

But then the big question comes: did all this pushing speed up the road to Exascale? Well, no… just the normal peaks and lows around the theoretical line on the logarithmic scale:

[Graph: Top500 exponential growth. Source: CNET]

What I find interesting in this graph is that the #500 line is diverging from the #1 line. With GPGPU it was quite easy to enter the Top 500 three years ago.

Did the profits rise? Yes. While PC-sales went down, HPC-revenues grew:

Revenues in the high-performance computing (HPC) server space jumped 7.7 percent last year to $11.1 billion surpassing the $10.3 billion in revenues generated in 2011, according to numbers released by IDC March 21. This came despite a 6.8 percent drop in shipments, an indication of the growing average selling prices in the space, the analysts said. (eWeek.)

So mainly the willingness to buy HPC has increased. And you cannot stay behind when the rest of the world is focusing on Exascale, can you?
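The numbers in the eWeek quote above can be checked with a few lines of Python. The sketch assumes that revenue equals shipments times average selling price; the variable names are my own. Because the quoted revenues are rounded, the computed growth differs slightly from the 7.7 percent in the quote.

```python
# Figures taken from the eWeek quote above.
revenue_2011 = 10.3       # billion USD
revenue_2012 = 11.1       # billion USD
shipment_change = -0.068  # 6.8 percent fewer systems shipped

revenue_growth = revenue_2012 / revenue_2011 - 1

# revenue = shipments * average selling price, so the price change follows
# from dividing the revenue ratio by the shipment ratio.
price_change = (1 + revenue_growth) / (1 + shipment_change) - 1

print(f"revenue growth: {revenue_growth:.1%}")
print(f"implied average selling price change: {price_change:.1%}")
```

With these inputs the average selling price comes out roughly 15 to 16 percent higher, which fits the analysts’ remark about growing selling prices.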

Read more

Keep your feet on the ground and focus on what matters: papers and answers to hard questions.

Did you solve a compute problem and get published using a sub-top-250 supercomputer? Share it in the comments!

The OpenCL event of the year: IWOCL 2014 – Bristol, UK, 12 & 13 May

Khronos has supported and organised, for the second time, the International Workshop on OpenCL (IWOCL, pronounced as “eye-wok-ul”). Last year the event took place at Georgia Tech, Atlanta, Georgia, in the United States. This year the event will be held in Europe: Bristol University, Bristol, England, UK.

IWOCL 2013 Presentations

Last year there was a varied programme:

  • Porting a Commercial Application to OpenCL: A Case Study
  • Demonstrating Performance Portability of a Custom OpenCL Data Mining Application to the Intel Xeon Phi Coprocessor
  • Parallelization of the Shortest Path Graph Kernel on the GPU
  • OpenCL-based Approach to Heterogeneous Parallel TSP Optimization
  • clMAGMA: High Performance Dense Linear Algebra with OpenCL
  • Multi-Architecture ISA-Level Simulation of OpenCL
  • Optimizing OpenCL Applications on the Intel Xeon Phi

You can see and download these presentations here. This year the organisation tries to offer an equally exciting programme.

Workshop means it’s an active event

It’s all about sharing, but not just by letting you sit and listen. Below you’ll find some of the options.

Present your work

Did you use OpenCL in your software or research? You are very welcome to present your experience and results. IWOCL is the premier forum for the presentation and discussion of new designs, trends, algorithms, programming models, software, tools and ideas for OpenCL.

Abstract Submission Deadline: Friday 31 January, 2014

It can be in the form of:

  • Research paper
  • Technical presentation
  • Workshops and Tutorial
  • Poster

(StreamHPC’s Vincent Hindriksen is on the Conference Sessions Committee)

Communicate with the workgroup

Khronos booth at SC13 – some of these people you will see again at IWOCL

The OpenCL workgroup likes to communicate with OpenCL’s users. IWOCL provides a formal channel for community feedback to the Khronos Group’s OpenCL workgroup. This is one of the best moments to be heard, discuss a hack/bug or share a great idea that should be in the next version of OpenCL.

Meet OpenCL developers and enthusiasts

During the breaks, social events and during presentations, you can discuss all your ideas and thoughts on on-topic and off-topic subjects, or you can also join existing talks.

If you are new to compute acceleration, you’ll find many people who are willing to explain what it does and add their personal view.

Test-drive software

We will bring some hardware on which you can test your kernels. (We’ll post more info about this later!)

Sponsor and present your product

There will be booths available for the sponsors, where you can show your product to the public.

Stay up to date on the event

We will try to keep you up to date as much as possible, but IWOCL also has some channels of its own to keep you informed.

We’ll post a link when tickets are ready to be sold.

Let others know you plan to be at the event by saying hi in the comments.

Hope to see you there!

FortranCL working example

The ’96 book is still available here, and has some good explanations of numerical mathematics. Oh, the good old times..

Last week I needed to get Fortran working with OpenCL. As the example-page is not up-to-date and not much documentation is on the interwebs outside the official page, this was not as straight-forward as I hoped. The test-suite and this article provided code I could actually use. First I wanted to have things in a module, second I needed to control which device I wanted to use, third I needed function-names that could be used in a larger project. The result is below, and hopefully usable for the Fortran folks around who want to add some OpenCL-kernels to their existing code.

It uses the two-step initialisation we know from C, for safe memory allocation. It is based on the utils.f90 from the test-suite.
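For readers who do not know this pattern from C: the first call only asks how many items exist, the caller then allocates exactly that much space, and the second call fills it. Below is a language-neutral sketch of that idea in Python. The function `get_ids` and the platform names are hypothetical stand-ins and not part of OpenCL or FortranCL; the indices are 1-based, as in Fortran.

```python
# Hypothetical stand-in for a C-style query function; NOT the real OpenCL API.
_FAKE_PLATFORMS = ["platform-a", "platform-b"]


def get_ids(buffer=None):
    """Return the number of available IDs; fill the buffer first if one is given."""
    if buffer is not None:
        for i in range(min(len(buffer), len(_FAKE_PLATFORMS))):
            buffer[i] = _FAKE_PLATFORMS[i]
    return len(_FAKE_PLATFORMS)


def select(index):
    """Pick an ID by 1-based index, falling back to the first one if out of range."""
    count = get_ids()        # step 1: ask only for the count
    ids = [None] * count     # allocate exactly that much space
    get_ids(ids)             # step 2: let the callee fill the buffer
    if index < 1 or index > count:
        index = 1
    return ids[index - 1]


print(select(2))
print(select(7))
```

The benefit of the two steps is that the caller never has to guess a buffer size, so nothing is truncated and nothing is over-allocated.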

The only good way to translate is the Rose-compiler – which is a pain to install. I tried various f2c-scripts (from the 90’s), but they all failed. I must say that continuously switching between Fortran-mode and C-mode was the hardest part of the porting.

If you have tips & tricks for using OpenCL from Fortran, let everybody know in the comments. Also let me know if the code doesn’t work for you, or if you have improvements (like better error-handling).

The rest of utils.f90 (which I renamed to clutils.f90 for better integration) is mostly the same – only this subroutine needed changes:

(...)

subroutine cl_initialize(platform_id, device_id, device, context, command_queue)
  !use ISO_C_BINDING
  ! platform_id and device_id are 1-based indices; out-of-range values are reset to 1
  integer,                intent(inout) :: platform_id
  integer,                intent(inout) :: device_id
  type(cl_device_id),     intent(out)   :: device
  type(cl_context),       intent(out)   :: context
  type(cl_command_queue), intent(out)   :: command_queue

  integer :: platform_count, device_count, ierr
  character(len = 100) :: info
  type(cl_platform_id) :: platform
  type(cl_platform_id), allocatable, target :: platform_ids(:)
  type(cl_device_id), allocatable, target :: device_ids(:)

  ! get the platform IDs: first query the count, then allocate and fill the list
  call clGetPlatformIDs(platform_count, ierr)
  if(ierr /= CL_SUCCESS) call error_exit('Cannot get CL platform.')
  allocate(platform_ids(platform_count))
  call clGetPlatformIDs(platform_ids, platform_count, ierr)
  if(ierr /= CL_SUCCESS) call error_exit('Cannot get CL platform.')

  ! fall back to the first platform (Fortran arrays start at 1, so 0 would be out of bounds)
  if (platform_id .gt. platform_count .or. platform_id .lt. 1) platform_id = 1
  platform = platform_ids(platform_id)

  ! get the device IDs, using the same two-step approach
  call clGetDeviceIDs(platform, CL_DEVICE_TYPE_ALL, device_count, ierr)
  if(ierr /= CL_SUCCESS) call error_exit('Cannot get CL device.')
  allocate(device_ids(device_count))
  call clGetDeviceIDs(platform, CL_DEVICE_TYPE_ALL, device_ids, device_count, ierr)
  if(ierr /= CL_SUCCESS) call error_exit('Cannot get CL device.')

  ! fall back to the first device
  if (device_id .gt. device_count .or. device_id .lt. 1) device_id = 1
  device = device_ids(device_id)

  ! get the device name and print it
  call clGetDeviceInfo(device, CL_DEVICE_NAME, info, ierr)
  print*, "CL device: ", info

  ! create the context and the command queue
  context = clCreateContext(platform, device, ierr)
  if(ierr /= CL_SUCCESS) call error_exit('Cannot create CL context.')
  command_queue = clCreateCommandQueue(context, device, CL_QUEUE_PROFILING_ENABLE, ierr)
  if(ierr /= CL_SUCCESS) call error_exit('Cannot create CL command queue.')

end subroutine cl_initialize

(...)


PRACE Spring School 2014

On 15 – 17 April 2014 a 3-day workshop around HPC is organised. It is free, and focuses on bringing industry and academia together.

Research Institute for Symbolic Computation (RISC) / Johannes Kepler University Linz Kirchenplatz 5b (Castle of Hagenberg) 4232 Hagenberg Austria

The PRACE Spring School 2014 will take place on 15 – 17 April 2014 at the Castle of Hagenberg in Austria. The PRACE Seasonal School event is hosted and organised jointly by the Research Institute for Symbolic Computation / Johannes Kepler University Linz (Austria), IT4Innovations / VSB-Technical University of Ostrava (Czech Republic) and PRACE.

The 3-day program includes:

  • A 1-day HPC usage for Industry track bringing together researchers and attendees from industry and academia to discuss the variety of applications of HPC in Europe.
  • Two 2-day tracks on software engineering practices for parallel & emerging computing architectures and deep insight into solving multiphysical problems with Elmer on large-scale HPC resources with lecturers from industry and PRACE members.

The PRACE Spring School 2014 programme offers a unique opportunity to bring users, developers and industry together to learn more about efficient software development for HPC research infrastructures. The program is free of charge (not including travel and accommodations).

Applications are open to researchers, academics and industrial researchers residing in PRACE member countries, and European Union Member States and Associated Countries. All lectures and training sessions will be in English. Please visit http://prace-ri.eu/PRACE-Spring-School-2014/ for more details and registration.

At StreamHPC we support such initiatives.

Intel promotes OpenCL as THE heterogeneous compute solution

At Intel they have CPUs (Xeon, Ivy Bridge), GPUs (Iris) and accelerators (Xeon Phi). OpenCL enables each processor to be used to the fullest and they now promote it as such. Watch the video below and see their view on why OpenCL makes a difference for Intel’s customers.

This is important, because until recently Intel was pushing OpenMP and their proprietary solutions more. I think it has something to do with the specialised processors that can be programmed with OpenCL, such as DSPs and FPGAs. Intel has always made generic processors that solve problems best for most. Customers of OpenCL happen to be the ones that could not be served with generic processors and preferred FPGAs and DSPs, before they tried GPUs. By showing that Intel can do OpenCL, they show they are a trustworthy partner to handle these problems in a few years, when the current problems can be handled by Intel processors.

Of course the Xeon Phi is also a good reason. The latest drivers have shown a huge improvement in performance, and that has increased Intel’s confidence in OpenCL for sure.

At StreamHPC we are very happy that Intel now openly promotes OpenCL and invests in it – this will increase trust in the programming language.

A small side-note. The differences between the Windows-drivers and Linux-drivers are somewhat vague: under Linux, the CPU is visible, but not supported officially. This makes development of multi-processor software not as straightforward as discussed in the video. Probably this will be more extensive in the future, as Intel only officially supports OpenCL on a processor when it’s very stable.