What does Khronos have to offer beyond OpenCL and OpenGL?

The OpenCL standard comes from the not-for-profit industry consortium Khronos Group. But they do a lot more, such as the famous OpenGL standard for graphics. The focus of the group has always been on multimedia and on getting the fastest results out of the hardware.

Now that open source and open standards are getting more important, collaborations like the Khronos Group get more attention. At StreamHPC we are very happy with this trend, as the business models are more focused on collaboration and getting things done than on making sure the customer can never leave.

Below is an overview of the most important APIs that Khronos has to offer.

OpenCL related

  • OpenCL: compute
  • WebCL: web compute
  • SPIR/SPIR-V: intermediate language for compute kernels, like those of OpenCL and OpenGL’s GLSL
  • SYCL: high-level language for OpenCL

OpenGL related

  • Vulkan: state-less graphics
  • OpenGL: graphics
  • OpenGL ES: embedded graphics
  • WebGL: web graphics
  • glTF: runtime asset format for WebGL, OpenGL ES, and OpenGL
  • OpenGL SC: Graphics for Safety Critical operations
  • EGL: interface between rendering APIs such as OpenGL ES and the underlying native platform window system, such as X.

Streaming input and output

  • OpenMAX: interface for multimedia codecs, platforms and hardware
  • StreamInput: interface for sensors
  • OpenVX: OpenCV-alternative, built for performance.
  • OpenKCam: interface for cameras and sensors

Others

One video called “OpenRoad” to show them all:

http://www.youtube.com/watch?v=ckD0op6OgMQ

Want to learn more? Feel free to ask in the comments, or check out https://www.khronos.org/

Privacy Policy

Who we are

We are a group of companies, based in the Netherlands, Hungary and Spain. We help our customers get their code to run fast by optimizing the computations and using accelerators. We have been doing this since 2010.

Comments

When visitors leave comments on the site we collect the data shown in the comments form, and also the visitor’s IP address and browser user agent string to help spam detection.

An anonymised string created from your email address (also called a hash) may be provided to the Gravatar service to see if you are using it. The Gravatar service Privacy Policy is available here: https://automattic.com/privacy/. After approval of your comment, your profile picture is visible to the public in the context of your comment.

Forms

Form data is sent to self-hosted software and is not read by any third party.

Tracking

We use anonymized tracking to find out:

  • Which pages are visited how often
  • Which subjects are popular
  • Which pages are clicked through
  • From which countries or states the visitors are

During a visit/session, you get a random ID.

Cookies

If you leave a comment on our site you may opt in to saving your name, email address and website in cookies. These are for your convenience so that you do not have to fill in your details again when you leave another comment. These cookies will last for one year.

Tracking cookies last for 24 hours.

Embedded content from other websites

Articles on this site may include embedded content (e.g. videos, images, articles, etc.). Embedded content from other websites behaves in the exact same way as if the visitor has visited the other website.

These websites may collect data about you, use cookies, embed additional third-party tracking, and monitor your interaction with that embedded content, including tracking your interaction with the embedded content if you have an account and are logged in to that website.

Who we share your data with

None of the data is shared with any third party. Marketing reports don’t contain any personal data.

How long we retain your data

If you leave a comment, the comment and its metadata are retained indefinitely. This is so we can recognize and approve any follow-up comments automatically instead of holding them in a moderation queue.

Anonymous tracking data is kept indefinitely, so we can find trends over the years.

What rights you have over your data

If you have left comments, you can request to receive an exported file of the personal data we hold about you, including any data you have provided to us. You can also request that we erase any personal data we hold about you. This does not include any data we are obliged to keep for administrative, legal, or security purposes.

Where your data is sent

Visitor comments and forms are checked through an automated spam detection service, ReCAPTCHA and Akismet.

Reporting problems

We are not in the business of monetizing user data, and believe in finding new customers through content.

As software and plugins change after updates, we are sometimes surprised that more is collected than we configured.

If anything is incorrect or not legal, please email privacy@streamhpc.com. If you have general questions, go to the contact page or email info@streamhpc.com.

Porting Manchester’s UNIFAC to OpenCL@XeonPhi: 160x speedup

Example of modelled versus measured water activity (‘effective’ concentration) for highly detailed organic chemical representation based on continental studies using UNIFAC

As we cannot use the performance results for most of our commercial projects because they contain sensitive data, we were happy that Dr. David Topping from the University of Manchester was kind enough to allow us to share the data for the UNIFAC project. The goal for this project was simple: port the UNIFAC algorithm to the Intel XeonPhi using OpenCL. We got a total of 485x speedup: 3.0x for going from single-core to multi-core CPU, 53.9x for implementing algorithmic, low-level improvements and a new memory layout design, and 3.0x for using the XeonPhi via OpenCL. To remain fair, we used the 160x speedup over the multi-core CPU in the title, not the speedup over serial code.

Continue reading “Porting Manchester’s UNIFAC to OpenCL@XeonPhi: 160x speedup”

Social Media: Facebook, LinkedIn and Twitter


Facebook

We have a presence on Facebook via a company page: StreamHPC

Also check out Khronos’ OpenCL fanpage to hear more news on OpenCL.

LinkedIn

Via http://www.linkedin.com/company/StreamHPC you can hear more company-specific news. The news there is comparable to the newsletter.

Twitter

You can also follow us on Twitter. We have several accounts:

General:

  • StreamHPC, our main account: everything GPGPU, OpenCL and extreme software performance.
  • OpenCL:Pro, with focus on jobs and internships.
  • OpenCLHPC, on OpenCL usage in HPC.
  • WebCLNews, on the current state of WebCL.
  • OpenCLGuru, to answer your questions on OpenCL – at your service.

Hardware specific:

  • OpenCLonAMD, on the current state of OpenCL on AMD-processors.
  • OpenCLonARM, on the current state of OpenCL on ARM-processors.
  • OpenCLonFPGAs, on the current state of OpenCL on FPGAs.
  • OpenCLonDSPs, on the current state of OpenCL on DSPs.
  • OpenCLonRISC, on the current state of OpenCL on RISC.

We hope you enjoy our Twitter channels! If you have suggestions, just tweet us!

OpenCL on Altera FPGAs

On 15 November 2011 Altera announced support for OpenCL. The time between announcing OpenCL support and actually showing a working SDK always takes longer than expected, so I did not expect anything working on FPGAs before 2013. Good news: the drivers are actually working (if you can trust the demos at presentations).

There have been three presentations lately:

In this article I share with you what you should not have missed from these slides, and add some personal notes.

Is OpenCL the key that finally makes FPGAs not tomorrow’s but today’s technology?

Continue reading “OpenCL on Altera FPGAs”

FortranCL working example

The ’96 book is still available here, and has some good explanations of numerical mathematics. Oh, the good old times..

Last week I needed to get Fortran working with OpenCL. As the example-page is not up-to-date and not much documentation is on the interwebs outside the official page, this was not as straight-forward as I hoped. The test-suite and this article provided code I could actually use. First I wanted to have things in a module, second I needed to control which device I wanted to use, third I needed function-names that could be used in a larger project. The result is below, and hopefully usable for the Fortran folks around who want to add some OpenCL-kernels to their existing code.

It uses the two-step initialisation we know from C, for safe memory allocation. It is based on the utils.f90 from the test-suite.

The only good way to translate Fortran to C automatically is the Rose-compiler – which is a pain to install. I tried various f2c-scripts (from the 90’s), but they all failed. I must say that continuously switching between Fortran-mode and C-mode was the hardest part of the porting.

If you have tips&tricks to use OpenCL from Fortran, let everybody know in the comments. Also let me know if the code doesn’t work for you, or you have improvements (like better error-handling).

The rest of utils.f90 (which I renamed to clutils.f90 for better integration) is mostly the same – only this subroutine needed changes:

(...)

subroutine cl_initialize(platform_id, device_id, device, context, command_queue)
  !use ISO_C_BINDING
  type(cl_device_id),     intent(out)     :: device
  type(cl_context),       intent(out)     :: context
  type(cl_command_queue), intent(out)     :: command_queue
  integer                                 :: platform_id  ! 1-based; falls back to 1 if out of range
  integer                                 :: device_id    ! 1-based; falls back to 1 if out of range

  integer :: platform_count, device_count, ierr
  character(len = 100) :: info
  type(cl_platform_id) :: platform
  type(cl_platform_id), allocatable, target :: platform_ids(:)
  type(cl_device_id), allocatable, target :: device_ids(:)

  ! get the platform ID: first ask how many platforms there are, then fetch them
  call clGetPlatformIDs(platform_count, ierr)
  if (ierr /= CL_SUCCESS) call error_exit('Cannot get CL platform.')
  allocate(platform_ids(platform_count))
  call clGetPlatformIDs(platform_ids, platform_count, ierr)
  if (ierr /= CL_SUCCESS) call error_exit('Cannot get CL platform.')

  ! Fortran arrays start at 1, so fall back to the first platform
  if (platform_id .gt. platform_count .or. platform_id .lt. 1) platform_id = 1
  platform = platform_ids(platform_id)

  ! get the device ID with the same two-step pattern
  call clGetDeviceIDs(platform, CL_DEVICE_TYPE_ALL, device_count, ierr)
  if (ierr /= CL_SUCCESS) call error_exit('Cannot get CL device.')
  allocate(device_ids(device_count))
  call clGetDeviceIDs(platform, CL_DEVICE_TYPE_ALL, device_ids, device_count, ierr)
  if (ierr /= CL_SUCCESS) call error_exit('Cannot get CL device.')

  if (device_id .gt. device_count .or. device_id .lt. 1) device_id = 1
  device = device_ids(device_id)

  ! get the device name and print it
  call clGetDeviceInfo(device, CL_DEVICE_NAME, info, ierr)
  if (ierr /= CL_SUCCESS) call error_exit('Cannot get CL device name.')
  print*, "CL device: ", info

  ! create the context and the command queue
  context = clCreateContext(platform, device, ierr)
  if (ierr /= CL_SUCCESS) call error_exit('Cannot create CL context.')
  command_queue = clCreateCommandQueue(context, device, CL_QUEUE_PROFILING_ENABLE, ierr)
  if (ierr /= CL_SUCCESS) call error_exit('Cannot create CL command queue.')

end subroutine cl_initialize

(...)

Continue reading “FortranCL working example”

Qt Creator OpenCL Syntax Highlighting

With highlighting for Gedit, I was happy to give you the convenience of a nice editor to work on OpenCL-files. But it seems that one of the most popular IDEs for C++-programming is Qt Creator. So you receive another free syntax highlighter. You need at least Qt Creator 2.1.0.

The people of Qt have written everything you need to know about their Syntax highlighting, which was enough help to create this file. You see that they use the system of Kate, so logically this file works with this editor too.

In this article there is all you need to know to use Qt Creator with OpenCL.

Installing

First download the file to your computer.

Under Windows and OSX you need to copy this file to the directory share\qtcreator\generic-highlighter in the Qt installation dir (e.g. c:\Qt\qtcreator-2.2.1\share\qtcreator\generic-highlighter). Under Linux copy this file to ~/.kde/share/apps/katepart/syntax or to /usr/share/kde4/apps/katepart/syntax (all users). That’s all, have fun!

AMD GPUs & CPUs

Need a programmer for OpenCL on AMD FirePro, Radeon or APU? Hire us!

AMD supports all their recent GPUs and CPUs, and has good performance on products starting from 2010/2011:

AMD does not provide a standard SDK kit which contains both hardware and software, as their hardware is available at many computer-shops.

SDK

The OpenCL SDK (software) needs to be downloaded in several steps:

CodeXL replaces the following software in the AMD APP software family:

These are still available for download.

Training

There is (free) training material available:

Other AMD software for OpenCL

The APP Math Libraries contain FFT and BLAS functions optimised for AMD GPUs.

OpenCL-in-Java can be done using Aparapi.

Support matrix of Compute SDKs

Multi-Core Processors and the SDKs

The empty boxes show that IBM and ARM have a lot of influence. With NVIDIA’s current pace of introducing new products (hardware and CUDA), they could also take on ARM.

The matrix is restricted to current better-known compute technologies OpenCL, CUDA, Intel ArrBB, Pathscale ENZO, MS DirectCompute and AccelerEyes JacketLib.

X = All OSes, including Mac
D = Developer (private alpha or private beta)
P = Planned (as e.g. stated in Intel’s Q&A)
U = Unofficial (IBM’s OpenCL-SDK is promoted for their POWER-line)
L = Linux-only
W = Windows-only
? = Unknown if planned

Continue reading “Support matrix of Compute SDKs”

Google blocked OpenCL on Nexus with Android 4.3

Important: this is only for Google-branded Nexus phones – other brands are free to do what they want, and they most probably will.

Also important: this doesn’t mean that OpenCL on Android devices will be over, but that there is a bump in the road now Google tries to lock-in customers to their own APIs.

The big idea behind OpenCL is that higher level languages and libraries can be built on top of it. This is exactly what was done under Android: RenderScript Compute (a higher-level language)  was implemented using OpenCL for ARM Mali GPUs.

Having OpenCL drivers on Android has several advantages: OpenCL can be used directly on Android, and there is room for other high-level languages that have OpenCL as a back-end. Especially the latter is probably what made Google decide to cripple the OpenCL drivers.

Google seems to be afraid of competition, and that’s a shame, as competition is the key factor that drives innovation. The OpenCL community is not the only one complaining about Google’s intentions concerning Android. Read page 3 of that article to understand how Google is controlling handset-vendors and chip-makers.

Google’s statement

In February OpenCL drivers were discovered on two Nexus tablets using a Mali T604 GPU. Around the same time there was one public answer from Google employee Tim Murray (twitter) on why Google did not want to choose OpenCL:

Continue reading “Google blocked OpenCL on Nexus with Android 4.3”

An OpenCL-on-FPGAs presentation in a bar

What do you do when you want to explain OpenCL and FPGAs and OpenCL-on-FPGAs to a beer-drinking crowd in just 15 minutes? Well, you simply can’t go deep into the matter. On a Thursday evening, 5 November 2015, I was standing on a chair in front of a beer-loving group of Hackers and Founders with my laser-powered presenter, trying not to lose everybody. It was not the first time I stood on that particular chair – some years ago I presented about OpenCL-on-GPUs.

Below you find the full presentation – feel free to use and change the slides for yourself (PDF here).

Do you want us to present OpenCL, accelerators or performance engineering in a talk tailored for your audience? Just give us a call.

OpenCL integer rounding in C

Square pant rounding can simply be implemented with “return (NAN);“.

Sharing roughly the same code between C and OpenCL has lots of advantages when maximum optimisation and vectors are not needed. One thing I bumped into was that rounding in C/C++ works differently, so I decided to implement the OpenCL rounding functions in C.

The OpenCL-page for rounding describes many, many functions with this line:

destType convert_destType<_sat><_roundingMode>(sourceType)

So for each sourceType-destType combination there is a set of functions: 4 rounding modes and an optional saturation. Defining each of these functions is easy in Ruby, but takes a lot more time in C.

The 4 rounding modes are:

Modifier   Rounding Mode Description
_rte       Round to nearest even
_rtz       Round towards zero
_rtp       Round toward positive infinity
_rtn       Round toward negative infinity

The below pieces of code should also explain what the functions actually do.

Round to nearest even

This means that numbers get rounded to the closest integer, and exact ties go to the even neighbour: 3.5 and 4.5 both round to 4. Thanks to Dithermaster for pointing out my wrong assumption and clarifying how it should work.

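inline int convert_int_rte (float number) {
   int lower = (int)floorf(number);      // nearest integer at or below the input
   float diff = number - (float)lower;   // distance to that integer, in [0, 1)
   if (diff < 0.5f) return lower;
   if (diff > 0.5f) return lower + 1;
   return (lower % 2 == 0) ? lower : lower + 1; // exact tie: pick the even neighbour
}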

I’m sure there is a more optimal implementation. You can fix that in Github (see below).

Round to zero

This means that positive numbers are rounded down and negative numbers are rounded up. 1.6 becomes 1, -1.6 becomes -1.

inline int convert_int_rtz (float number) {
   return ((int)(number));
}

Effectively, this just removes everything after the decimal point.

Round to positive infinity

1.4 becomes 2, -1.6 becomes -1.

inline int convert_int_rtp (float number) {
   return ((int)ceil(number));
}

Round to negative infinity

1.6 becomes 1, -1.4 becomes -2.

inline int convert_int_rtn (float number) {
   return ((int)floor(number));
}

Saturation

Saturation means clamping to the range of the destination type, which also takes care of NaN. It makes sure that results stay between INT_MIN and INT_MAX, and that NaN returns 0. Without it, the outcome for out-of-range input can be anything (-2147483648 in case of convert_int_rtz(NAN) on my computer). Saturation is more expensive, so it is optional.

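inline float saturate_int(float number) {
  if (isnan(number)) return 0.0f; // NaN saturates to 0
  // caveat: (float)INT_MAX rounds up to 2^31, so casting the upper bound back to int can still overflow
  return (number > INT_MAX ? (float)INT_MAX : (number < INT_MIN ? (float)INT_MIN : number));
}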

Effectively the other functions become like:

inline int convert_int_sat_rtz (float number) {
   return ((int)(saturate_int(number)));
}
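
A float cannot hold INT_MAX exactly, so a saturating conversion is easier to get right when the clamp returns an int directly. The following is a sketch under that idea, not the project's code: it assumes a 32-bit int, and the name convert_int_sat_rtz_checked is made up to avoid clashing with the function above.

```c
#include <limits.h>
#include <math.h>

/* Sketch of a saturating float -> int conversion that rounds toward zero.
   Assumes int is 32 bits wide, so 2147483648.0f (2^31) is the first float
   above INT_MAX and -2147483648.0f is exactly INT_MIN. */
static inline int convert_int_sat_rtz_checked(float number) {
    if (isnan(number)) return 0;                   /* NaN saturates to 0 */
    if (number >= 2147483648.0f) return INT_MAX;   /* too large, including +infinity */
    if (number <= -2147483648.0f) return INT_MIN;  /* too small, including -infinity */
    return (int)number;                            /* in range: the cast truncates toward zero */
}
```

Because the range checks happen before the cast, the cast itself never sees a value that does not fit in an int.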

Doubles, longs and getting started.

Yes, you need to make functions for all of these. But you could of course also check out the project on Github (BSD licence, rudimentary first implementation).

You’re free to make a double-version of it.

Copyright

All content, media, theme and blogs are copyright 2010-2012 StreamHPC and Vincent Hindriksen, unless otherwise stated. For questions about using material for your own business, blog or personal usage, please contact us to ask for permission. We protect our copyrights by any means necessary.

OpenCL is a trademark of Apple Inc.

We work a lot with open source software, such as WordPress and Eclipse. The brochures are created with Inkscape and svgslides. We believe that the base of innovation should be for everybody, so everybody can build on top of that. That’s why you get all the information from the blog for free, as sharing information could give us necessary information back in return.

Used photos

Many photos link to the origin and sometimes tell a story or show the webpage of an artist. The images below are not linked, as they are used in the slider.

All other photos and images are bought from paid services: 123RF and Big Stock Photo. Please contact us if you would like to know the link of a certain photo or image.

Aparapi: OpenCL in Java

Edit: Aparapi has been open sourced and many issues have already been fixed and improved.

If you have an AMD GPU/APU, you should try Aparapi. This software lets you write OpenCL code in Java at a pretty high level. The idea is roughly that it processes the Java intermediate code to search for loops and then creates optimised OpenCL kernels. Just download Aparapi and try the two examples. As the current version is still in alpha, it is not flawless yet. What I think is important when working with Aparapi is that you learn how to keep it simple – just as you gain most speed on straight roads, while turns slow you down.

The Aparapi team tries to avoid explicit definition of local memory, but it is still possible by using the @Local annotation. Such decisions show the team wants Aparapi to be high-level. It also integrates well with JavaCL and JOCL, so you can mix in the kernels you have already created. You can also check out a video introducing Aparapi (it is video 15, if #-linking doesn’t work).

Time to create your own project. As not all errors are documented or are solved in the upcoming version, below you will find a list of common errors and how to easily solve them.

Continue reading “Aparapi: OpenCL in Java”

NVIDIA: mobile phones, tablets and HPC (cloud)

If you want to see what is coming up in the market of consumer technology (PC, mobile and tablet), then NVIDIA can tell you the most. The company is very flexible, and shows time after time that it really knows which markets it currently operates in and which it can enter. I sometimes strongly disagree with their marketing, but watch them closely as they are in the most important markets to define the near future in: PCs, Mobile/Tablet and HPC.
You might think I completely miss interconnects (buses between processors, devices and memory) and memory-technologies as clouds have a large need for high-speed data-transport, but the last 20 years have shown that this is a quite stable developing market based on IP-selling to the hardware-vendors. With the acquisition of Cray’s interconnect technology, we have seen this is serious business for Intel, so things might change indeed. For this article I want to focus on NVIDIA’s choices.

Improving FinanceBench for GPUs Part II – low hanging fruit

We found a finance benchmark for GPUs and wanted to show we could speed its algorithms up. Like a lot!

Following the initial work done in porting the CUDA code to HIP (see the previous article), significant progress was made in picking the low-hanging fruit in the kernels and tackling potential structural problems outside of the kernels.

Additionally, since the last article, we’ve been in touch with the authors of the original repository. They’ve even invited us to update their repository too. For now it will be on our repository only. We also learnt that the group’s lead, professor John Cavazos, passed away 2 years ago. We hope he would have liked that his work has been revived.

Link to the paper is here: https://dl.acm.org/doi/10.1145/2458523.2458536

Scott Grauer-Gray, William Killian, Robert Searles, and John Cavazos. 2013. Accelerating financial applications on the GPU. In Proceedings of the 6th Workshop on General Purpose Processor Using Graphics Processing Units (GPGPU-6). Association for Computing Machinery, New York, NY, USA, 127–136. DOI:https://doi.org/10.1145/2458523.2458536

Improving the basics

We could have chosen to rewrite the algorithms from scratch, but first we need to understand the algorithms better. Also, with the existing GPU code we can quickly assess what the problems of the algorithm are, and see if we can get to high performance without too much effort. In this blog we show these steps.

Continue reading “Improving FinanceBench for GPUs Part II – low hanging fruit”

Imagination Technologies PowerVR


Need a PowerVR programmer? Hire us!

Currently there are two PowerVR GPU architectures with OpenCL support: the 5 series (scroll down) and the 6 series (introduced in 2014).

PowerVR 6

In 2013 companies will launch processors using IP from Imagination Technologies, the PowerVR G6230 and G6430. Named licensees are:

  • ST-Ericsson: NovaThor A9600 – not available yet or even mentioned on their own webpage.
  • Texas Instruments: no products announced, latest OMAP5-series are PowerVR 5 based.
  • Renesas Electronics: no products announced, latest are based on Series5 (SGX54x and 53x).
  • MediaTek: no products announced, latest MT6577 is based on Series5.
  • HiSilicon: licensed, but no products announced.

While a lot of news was around this platform, it has been delayed several times. Their latest designs in the series are running on an FPGA and more details will be given at CES 2013 (source).


Performance

Below you see where the PowerVR6 stands. It is clocked much higher (from 250MHz to 600MHz), which suggests it will be baked at well below 45 nm. The PowerVR 5 is used in, for example, the iPad2 and delivers around 70GFlops. The PowerVR6 G62x0 is promised to deliver 200GFlops and up. The TFLOPS barrier is promised to be broken with the series.

Comparison between PowerVR 5 and 6 series.

http://withimagination.imgtec.com/powervr/powervr-series6xe-gpus-bring-opengl-es-3-0-graphics-everyone

http://blog.imgtec.com/news/accelerate-design-closure-for-ip-cores-from-imagination-dok-design-flows

The below image shows the 6-series are baked on 32nm and below. It shows different series-identifiers though. The “3” in G6x30 is the addition of “frame buffer compression logic” (source).


PowerVR 5

The chipset that currently dominates mobile devices, from the PSP to tablets to phones to Apple iPad.

Drivers

Imagination only sells IP and refers to their licensees for driver support. If you have ideas what to do with OpenCL-on-PowerVR, you can request an NDA here.

Texas Instruments has the drivers available, but only under an SLA. Contact your TI-representative for the most recent information.

Samsung delivers drivers with their Exynos 5 Octa development board (Odroid XU).

Boards

Let me know if you know a TI-board and can give me a description of the business-requirements to get hands on drivers.

Exynos 5410: ODROID-XU

  • ARM Cortex-A15 Quad 1.6GHz + Cortex-A7 Quad 1.2GHz
  • PowerVR 5 SGX544 MP3 GPU
  • 2GB LPDDR3 (12.8GB/s memory bandwidth)
  • Lots of IO-ports (see image below). No wifi without dongle.
  • CCI-400 bug seems to be fixed (source). Not clear how.


OpenCL info from the FAQ:

Which OpenGL and OpenCL are included in Android?
OpenGL ES 1.1 and OpenGL ES 2.0
OpenCL 1.1 Embedded Profile

Will It run Ubuntu or other Linux distros?
Currently we supports only Ubuntu 13.04 server version with only serial console.
We need to develop HDMI/LCD driver for Xorg display.
We are trying to release Linux BSP with OpenGL/OpenCL in Q4 of 2013.

Buy here. Forum here.

Drivers

See this page for explanation how to write the images. The links don’t get updated, so check the above links for the latest versions.

Minimum price for board+eMMC+shipping is $273,-. Price including some needed (eMMC, shipping), possibly needed (HDMI, USB-UART) and convenience add-on’s (SD, wifi) is:

odroid-cart

Board includes adaptor and case. An eMMC is needed for a fast OS, but a micro-SD also works – although slower.

Without the integrated power analysis tool, it’s $30,- less. Ordering only the board is $10 less shipping.

Devices

Once I get more (public) info on OpenCL-drivers, this section will be extended.

Kindle Fire HD

As seen on Engadget, Imagination Technologies is working with Amazon and Texas Instruments to deliver OpenCL-enabled Kindle Fire HDs.

http://www.youtube.com/watch?v=-twOwM4LP9o

The chipset is an OMAP 4470 by Texas Instruments, which contains a PowerVR SGX544 GPU running at 384MHz. It only delivers 24.5GFLOPS.

ST-Ericsson NovaThor LP9600 (Nova A9600)

This chipset has a dual-core ARM Cortex-A15 2,3 GHz, 28 nm, PowerVR series6 and “d-channel LP-DDR2”. It should become available in Q1 2013.

Since Ericsson will leave ST-Ericsson after a transition period, it is unclear if the chipset is delayed.

Imagination

Imagination is best known for their GPUs in Apple’s iDevices. They have support for:

  • OpenCL
  • Apple Metal
  • Vulkan
  • OpenGL
  • Google RenderScript

Imagination is a strong supporter of Khronos APIs OpenCL and Vulkan.

OpenCL

Currently there are two PowerVR GPU architectures with OpenCL support: the 5 series (scroll down) and the 6 series (introduced in 2014).

PowerVR 6

In 2013 companies will launch processors using IP from Imagination Technologies, the PowerVR G6230 and G6430. Named licensees are:

  • ST-Ericcson: NovaThor A9600 – not available yet or even mentioned on their own webpage.
  • Texas Instruments: no products anounced, latest OMAP5-series are PowerVR 5 based.
  • Renesas Electronics: no products announced, latest are based on Series5 (SGX54x and 53x).
  • MediaTek: no products anounced, latest MT6577 is based on Series5.
  • HiSilicon: licensed, but no products announced.

While a lot of news was around this platform, it has been delayed several times. Their latest designs in the series are running on an FPGA and more details will be given at CES 2013 (source).

powervr_series6_architecture_original

Performance

Below you see where the PowerVR6 stands. It is clocked much higher (from 250MHz to 600MHz), which suggests it will be baked sub-sub 45 nm. The PowerVR 5 is used in for example the iPad2 and delivers around 70GFlops. The PowerVR6 G62x0 is promised to deliver 200GFlops and up. The TFLOPS barrier is promised to be broken with the series.

3_cpu_vs_gpu_GFLOPS_bars
Comparison between PowerVR 5 and 6 series.

 http://withimagination.imgtec.com/powervr/powervr-series6xe-gpus-bring-opengl-es-3-0-graphics-everyone

http://blog.imgtec.com/news/accelerate-design-closure-for-ip-cores-from-imagination-dok-design-flows

The below image shows the 6-series are baked on 32nm and below. It shows different series-identifiers though. The “3” in G6x30 is the addition of “frame buffer compression logic” (source).

A-FHVSlCYAAdd1n

PowerVR 5

The chipset that currently dominates mobile devices, from the PSP to tablets to phones to Apple iPad.

Drivers

Imagination only sells IP and refers to their licensees for driver-support. If you have ideas what to do with OpenCL -on-PowerVR, you can request for an NDA here.

Texas Instruments has the drivers available, but only under an SLA. Contact your TI-representative for the most recent information.

Samsung delivers drivers with their Exynos 5 Octa development board (Odroid XU).

Boards

Let me know if you know a TI-board and can give me a description of the business-requirements to get hands on drivers.

Exynos 5410: ODROID-XU

  • ARM Cortex-A15 Quad 1.6GHz + Cortex-A7 Quad 1.2GHz
  • PowerVR 5 SGX544 MP3 GPU
  • 2GB LPDDR3 (12.8GB/s memory bandwidth)
  • Lots of IO-ports (see image below). No wifi without dongle.
  • CCI-400 bug seems to be fixed (source). Not clear how.

201307292206337254

OpenCL info from the FAQ:

Which OpenGL and OpenCL are included in Android?
OpenGL ES 1.1 and OpenGL ES 2.0
OpenCL 1.1 Embedded Profile

Will It run Ubuntu or other Linux distros?
Currently we supports only Ubuntu 13.04 server version with only serial console.
We need to develop HDMI/LCD driver for Xorg display.
We are trying to release Linux BSP with OpenGL/OpenCL in Q4 of 2013.

Buy here. Forum here.

Drivers

See this page for explanation how to write the images. The links don’t get updated, so check the above links for the latest versions.

Minimum price for board+eMMC+shipping is $273,-. Price including some needed (eMMC, shipping), possibly needed (HDMI, USB-UART) and convenience add-on’s (SD, wifi) is:

odroid-cart

Board includes adaptor and case. An eMMC is needed for a fast OS, but a micro-SD also works – although slower.

Without the integrated power analysis tool, it’s $30,- less. When ordering only the board, shipping is $10 less.

Devices

Once I get more (public) info on OpenCL-drivers, this section will be extended.

Kindle Fire HD

As seen on Engadget, Imagination Technologies is working with Amazon and Texas Instruments to deliver OpenCL-enabled Kindle Fire HDs.

http://www.youtube.com/watch?v=-twOwM4LP9o

The chipset is an OMAP 4470 by Texas Instruments, which contains a PowerVR SGX544 GPU running at 384MHz. It only delivers 24.5 GFLOPS.
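
To put that number in perspective, the quoted throughput can be divided by the clock speed to get the implied number of floating-point operations per clock cycle. This is a back-of-the-envelope sketch that uses only the two figures quoted above. The function name is illustrative.

```python
def flops_per_cycle(gflops, clock_mhz):
    """Floating-point operations per clock cycle implied by a peak GFLOPS figure."""
    return gflops * 1e9 / (clock_mhz * 1e6)


# Figures quoted above: 24.5 GFLOPS at 384 MHz.
implied = flops_per_cycle(24.5, 384)
print(f"about {implied:.0f} floating-point operations per clock cycle")
```

The result is roughly 64 operations per cycle. This is a peak number, which real kernels rarely reach.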

4 October talk in Amsterdam on mobile compute

On Thursday 4 October I will talk at Hackers&Founders Amsterdam about what mobile compute can do. The goal is to spark new ideas for start-ups, as not many people know that their mobile phones and tablets are very powerful and can be used for compute-intensive tasks next year.

The other talk is from Mozilla on Firefox OS (Edit: it was cancelled), which is actually reason enough to visit this Hackers&Founders Meetup. Entrance is free, drinks are not. Alternatively you could go to the Hadoop User Group Meetup at Science Park, Amsterdam.

Memberships

We are active in several foundations, communities and collaborations. Below is an overview.

Khronos

We are an associate member of Khronos, the non-profit technology consortium that maintains important standards like OpenCL, OpenGL, SPIR and Vulkan.

The Khronos Group was founded in 2000 to provide a structure for key industry players to cooperate in the creation of open standards that deliver on the promise of cross-platform technology. Today, Khronos is a not for profit, member-funded consortium dedicated to the creation of royalty-free open standards for graphics, parallel computing, vision processing, and dynamic media on a wide variety of platforms from the desktop to embedded and safety critical devices.

High Tech NL

High Tech NL is the sector organization by and for innovative Dutch high-tech companies and knowledge institutes. High Tech NL is committed to the collective interests of the sector, with a focus on long-term innovation and international collaboration.

We’re a member because High Tech NL is one of the few organizations that understand that IT is far more than digitisation. Our main focus there is robotics.

HSA Foundation

The Heterogeneous System Architecture (HSA) Foundation is a not-for-profit industry standards body focused on making it dramatically easier to program heterogeneous computing devices. The consortium comprises various semiconductor companies, tools providers, software vendors, IP providers, and academic institutions that develop royalty-free standards and open-source software.

HiPEAC

HiPEAC’s mission is to steer and increase the European research in the area of high-performance and embedded computing systems, and stimulate (international) collaborations.

We’ve sponsored multiple conferences over the years.

ETP4HPC

ETP4HPC is the European Technology Platform (ETP) in the area of High-Performance Computing (HPC). It is an industry-led think-tank comprising European HPC technology stakeholders: technology vendors, research centres and end-users. The main objective of ETP4HPC is to define research priorities and action plans in the area of HPC technology provision (i.e. the provision of supercomputing systems).

OpenPOWER

The OpenPOWER Foundation is an open, not-for-profit technical membership group incorporated in December 2013. It was founded to enable today’s data centers to rethink their approach to technology. OpenPOWER was created to develop a broad ecosystem of members that will create innovative and winning solutions based on POWER architecture.

OpenCL.org

Last year we bought OpenCL.org with the purpose of supporting the OpenCL community and OpenCL-focused companies. In January we launched the first community project on the website, porting GEGL to OpenCL. See below for more info.

The knowledge section of our homepage will be moved to the OpenCL.org website, but will still be maintained by us.

GEGL project

GEGL is a free/libre graph based image processing framework used by GIMP, GNOME Photos, and other free software projects.

In January 2016 we launched an educational initiative that aims to get more developers to study and use OpenCL in their projects. Within this project, up to 20 collaborators will port as many GEGL operations to OpenCL as possible.

The goal of this project is to seek a way for a group to educate themselves in OpenCL, while supporting an open source project. One of the ways is to gamify the porting by benchmarking the kernels and defining winners, and another way is to optimize kernels within StreamHPC to push the limits. Victor Oliveira, who wrote most of the OpenCL code in GEGL, joined the GEGL-OpenCL project to advise.

All work is being done on GitHub. The communication between participants is taking place in a dedicated Slack channel (invite-only).

Want a say in what the next porting project after GEGL will be? Vote here.