Scaling mobile GPUs to 1000 GFLOPS

arm_mali_cover_151112297646_640x360On the 20th of April 2013 there was an interesting discussion between Jan Gray and David Kanter. Jan is a specialist in C++ and FPGAs (twitter, homepage). David is a specialist in CPU and GPU architectures (twitter, homepage). Both know their ways well in the field of semiconductors. It is always a joy to follow their short discussions when they happen, but there was something about this one that made me want to share it with special attention.

OpenCL on ARM: Growth-expectation of GFLOPS/Watt of mobile GPUs exceeds Moore’s law. That’s incredible!

Jan Gray: .@OpenCLonARM GFLOPS/W more a factor of almost-over Dennard Scaling. But plenty of waste still to quash. http://www.fpgacpu.org/papers/Gray_AutumnOfMooresLaw_SingularityUniversity_11-06-23.pdf

Jan Gray‏: .@openclonarm Scratch Dennard tweet: reduced capacitance of yet smaller devices shd improve GFLOPS/W even as we approach end of Vdd scaling.

David Kanter: @jangray @OpenCLonARM I think some companies would argue Vdd scaling isn’t dead…

Jan Gray: @TheKanter @openclonarm it’s not dead, but slowing, we’ve gone from 5V to 1V (25x power savings) and have maybe several hundred mVs to go.

David Kanter: @jangray I reckon we have at least 400mV, so ~2X; slower than ideal, but still significant

Jan Gray: @TheKanter We agree, I think.

David Kanter: @jangray I suspect that if GPU scaling > Moore’s Law then they are just spending more area or power; like discrete GPUs in the last decade

David Kanter: @jangray also, most positive comment I’ve heard from industry folks on mobile GPU software and drivers is “catastrophically terrible”

Jan Gray: @TheKanter Many ways to reduce power, soup to nuts. For ex HMC DRAM on interposer for lower energy signaling. I’m sure many tricks to come.

In a nutshell, all the reasons they think mobile GPUs can outpace Moore’s law while staying under a certain power-usage.

It needs some background-info, so let’s start the background of the first tweet, and then explain what has been said. Continue reading “Scaling mobile GPUs to 1000 GFLOPS”

NVIDIA: mobile phones, tablets and HPC (cloud)

If you want to see what is coming up in the market of consumer-technology (PC, mobile and tablet), then NVIDIA can tell you the most. The company is very flexible, and shows time after time it really knows in which markets is currently operates and can enter. I sometimes strongly disagree with their marketing, but watch them closely as they are in the most important markets to define the near future in: PCs, Mobile/Tablet and HPC.
You might think I completely miss interconnects (buses between processors, devices and memory) and memory-technologies as clouds have a large need for high-speed data-transport, but the last 20 years have shown that this is a quite stable developing market based on IP-selling to the hardware-vendors. With the acquisition of Cray’s interconnect technology, we have seen this is serious business for Intel, so things might change indeed. For this article I want to focus on NVIDIA’s choices.

Exposing OpenCL on Android: Q&A with Tim Lewis of ZiiLabs

ZiiLabs has been offering an early access program for OpenCL SDK since last year. This program was very selective in choosing developers and little news has been put on their webpage. Now they are planning to make their Android NDK a standard component, it’s a good time to ask them some questions. GPGPU-consultant Liad Weinberger of Appilo also added a few questions.

The Q&A has been with Tim Lewis, director Marketing and Partner Relations of ZiiLabs, who has taken the time to give some insights in what we can expect around accelerated computations on Android. ZiiLabs has been better known as 3DLabs and has reinvented itself in 2009 (you can read the full history here). Like other companies in the ARM-industry they mostly design chips and let other parties manufacture devices using their schematics, drivers and software. Now to the questions.

Continue reading “Exposing OpenCL on Android: Q&A with Tim Lewis of ZiiLabs”

Waiting for Mobile OpenCL – Q1 2011

About 5 months ago we started waiting for Mobile OpenCL. Meanwhile we had all the news around ARM on CES in January, and of course all those beta-programs made progress meanwhile. And after a year of having “support“, we actually want to see the words “SDK” and/or “driver“. So who’s leading? Ziilabs, ImTech, Vivante, Qualcomm, FreeScale or newcomer nVIDIA?

Mobile phone manufacturers could have a big problem with the low-level access to the GPU. While most software can be sandboxed in some form, OpenCL can crash the phone. But at the other side, if the program hasn’t taken down the developer’s test-phone, the chances are low it will take any other phone. And also there are more low-level access-points to the phone. So let’s check what has happened until now.

Note: this article will be updated if more news comes from MWC ’11.

OpenCL EP

For mobile devices Khronos has specified a profile, which is optimised for (ARM) phones: OpenCL Embedded Profile. Read on for the main differences (taken from a presentation by Nokia).

Main differences

  • Adapting code for embedded profile
  • Added macro __EMBEDDED_PROFILE__
  • CL_PLATFORM_PROFILE capabilityreturns the string EMBEDDED_PROFILE if only the embedded profile is supported
  • Online compiler is optional
  • No 64-bit integers
  • Reduced requirements for constant buffers, object allocation, constant argument count and local memory
  • Image & floating point support matches OpenGL ES 2.0 texturing
  • The extensions of full profile can be applied to embedded profile

Continue reading “Waiting for Mobile OpenCL – Q1 2011”

Waiting for Mobile OpenCL

People who follow me, know my interest in ARM Cortex CPU & Mali GPU and Imagination Technology’s PowerVR, regarding OpenCL-potential. Here is an overview of what I found out until now and has more open ends than answers. I was very hopeful about access to mobile OpenCL for developers, but now I think it will have to wait until 2011. It seems that the mobile phone makers keep the power to themselves in the form of system-libraries instead of giving direct access. Actually a wise choice, given the fact that a bad kernel could easily crash the phone.

Imagination Technology’s PowerVR SGX product provides the GPU-power for the iPhone 4 and various new Android phones. The company does not provide an OpenCL SDK directly to end-users, but leaves the responsibility to the implementers of their IP. Strangely enough they offer SDKs for OpenGL, and say “no comment” to a pretty direct forum-question. In other words: we don’t know what to expect.

Samsung has the only official implementation for ARM (ARM Cortex-A9 MPCore CPU) for OpenCL. Samsung just released their home-made Linux-based mobile phone OS “Bada”, but there is not a sign of OpenCL in their API. Samsung being the only one who could open an API to developers on their phones officially, does not make me smile yet.

There are many, many voices that Apple iOS 4 has support for OpenCL, but then only in the system-libraries. Given the fact that Apple is a big fan of OpenCL, we can assume this is true. See the web for a lot of articles about this.

ZiiLabs is open about their support for OpenCL in their ZMS-05 processor, and has an early access program. Early access still means waiting.

Qualcomm’s Snapdragon does not mention OpenCL at all, while it powers the more powerful smartphones. It mentions a recent job-post, so you know the drill: wait.

Android has good support for OpenGL ES, but not officially for OpenCL.You might have heard of the particle experiment for Android, which is actually OpenGL-based. It also mentions the Android-version of the bullet engine (broken link) for physics-computations, but also no OpenCL. It doesn’t look like Android “Gingerbread” will have support.

So what did you learn after reading this? That developers still won’t have access to OpenCL-API on a mobile platform, and that you have to wait until next year or enter ZiiLabs’ early access program. More about the good and bad of hiding OpenCL on this blog.