Install (Intel) Altera Quartus 16.0.2 OpenCL on Ubuntu 14.04 Linux

quartusTo temporarily increase capacity we put Quartus 16.0.2 on an Ubuntu server, which did not go smooth – but at least smoother than upgrading packages to required versions on RedHat/CentOS. While the download says “Linux” and you’re expecting support for multiple Linux breeds, there is only official support for Redhat 6.5 (and CentOS).

Luckily it was very possible to have a stable installation of Quartus on Ubuntu. As information on this subject was squattered around the net and even incomplete, we decided to share our howto in this blogpost. These tips probably also work for other modern Linux-based operating systems like Fedora, Suse, Arch, etc, as most problems are due to new features and more up-to-date libraries than are provided in RedHat/CentOS.

Note1 : we did not install the FPGA on the Ubuntu-machine and neither fully researched potential problems for doing so – installing the FPGA on an Ubuntu machine is at your own risk. Have your board maker follow this tutorial to test their libraries on Ubuntu.

Note 2: we tested on Ubuntu 14.04. No guarantees if it all works on other version. Let us know in the comments if it works on other versions too. Continue reading “Install (Intel) Altera Quartus 16.0.2 OpenCL on Ubuntu 14.04 Linux”

Starting with Altera OpenCL-on-FPGAs

SVpressgraphicAltera has been very busy adding resources and has kicked off the beginning of June with opening up their OpenCL-program for the general public.
Only Stratix V devices are supported, but that could change later.

Below are all pages and PDFs concerning OpenCL I’ve found while searching Altera’s website.

Evaluation of CPUs, GPUs, and FPGAs as Acceleration Platforms

Altera wanted to know where they could compete with GPUs and CPUs. For a big company their comparisons are quite honest (for instance about their limited access-speed to memory), but they don’t tell everything – like the hours(!) of compilation-time. The idea is that you develop on a GPU and when it’s correct, you port the correctly working software to the FPGA.

If you don’t have any experience working with their FPGAs, best is to ask around.

Medical_OpenCL
Image taken from Altera website.

Continue reading “Starting with Altera OpenCL-on-FPGAs”

OpenCL on Altera FPGAs

On 15 November 2011 Altera announced support for OpenCL. The time between announcements for having/getting OpenCL-support and getting to see actually working SDKs takes always longer than expected, so to get this working on FPGAs I did not expect anything before 2013. Good news: the drivers are actually working (if you can trust the demos at presentations).

There have been three presentations lately:

In this article I share with you what you should not have missed on these sheets, and add some personal notes to it.

Is OpenCL the key that finally makes FPGAs not tomorrow’s but today’s technology?

Continue reading “OpenCL on Altera FPGAs”

How to get full CMake support for AMD HIP SDK on Windows – including patches

Written by Máté Ferenc Nagy-Egri and Gergely Mészáros

Disclaimer: if you’ve stumbled across this page in search of fixing up the ROCm SDK’s CMake HIP language support on Windows and care only about the fix, please skip to the end of this post to download the patches. If you wish to learn some things about ROCm and CMake, join us for a ride.

Finally, ROCm on Windows

The recent release of the AMD’s ROCm SDK on Windows brings a long awaited rejuvenation of developer tooling for offload APIs. Undoubtedly it’s most anticipated feature is a HIP-capable compiler. The runtime component amdhip64.dll has been shipping with AMD Software: Adrenalin Edition for multiple years now, and with some trickery one could consume the HIP host-side API by taking the API headers from GitHub (or a Linux ROCm install) and creating an export lib from the driver DLL. Feeding device code compiled offline and given to HIP’s Module API  was attainable, yet cumbersome. Anticipation is driven by the single-source compilation model of HIP borrowed from CUDA. That is finally available* now!

[*]: That is, if you are using Visual Studio and MSBuild, or legacy HIP compilation atop CMake CXX language support.

Continue reading “How to get full CMake support for AMD HIP SDK on Windows – including patches”

Faster Development Cycles for FPGAs

normal-vs-opencl-fpga-flow
The time-difference between the normal and OpenCL flow is large. The final product is as fast and efficient.

VHDL and Verilog are not the right tools when it comes to developing on FPGAs fast.

  • It is time-consuming. If the first cycle takes 3 months, then each subsequent cycle easily takes 2 weeks. Time is money.
  • Porting or upgrading a design from one FPGA device to another is also time-consuming. This makes it essential to choose the final FPGA vendor and family upfront.
  • Dual-platform development on CPU and FPGA needs synchronisation. The code works on either the CPU or the FPGA, which makes the functional tests made for the CPU-version less trustworthy.

Here is where OpenCL comes in.

  • Shorter development cycles. Programming in OpenCL is normally much faster than in VHDL or Verilog. If you are porting C/C++ code onto FPGA the development cycles will be dramatically shorter. Think weeks instead of months – as this news article explains. This means a radically reduced investment as well as providing time for architectural exploration.
  • OpenCL works on both CPUs and FPGAs, so functional tests can be run on either. As a bonus the code can be optimised for GPUs, within a short time-frame.
  • The performance is equal to VHDL and Verilog, unless FPGA-specific optimisations are used, such as vector-widths not equal to a power of two.
  • Vendor Agnostic solution. Porting to other FPGAs takes considerably less time and the compiler solves this problem for you.
  • Both Xilinx and Altera have OpenCL compilersAltera were the first to come out with an OpenCL offering and have a full SDK, which is an add-on to Quartus II. Xilinx also have a stand-alone OpenCL development environment solution called SDAccel.

Support for OpenCL is strong by both Altera and Xilinx

Both vendors suggest OpenCL to overcome existing FPGA design problems. Altera suggest to use OpenCL to speed-up the process for existing developers. So OpenCL is not a third party tool, you need to trust separately.

OpenCL allows a user to abstract away the traditional hardware FPGA development flow for a much faster and higher level software development flow – Altera

Xilinx suggests that OpenCL can enable companies without the needed developer resources to start working with FPGAs.

Teams with limited or no FPGA hardware resources, however, have found the transition to FPGAs challenging due to the RTL (VHDL or Verilog) development expertise needed to take full advantage of these devices. OpenCL eases this programming burden – Xilinx

Why choose StreamHPC?

There are several reasons to choose letting us to do the porting and protoyping of your product.

  • We have the right background, as our team consists of CPU, GPU and FPGA developers. Our code is therefore designed with easy porting in mind.
  • Our costs are lower than having the product done in Verilog/VHDL.
  • We give guarantees and support for our products on all platforms the product is ported on.
  • We can port the final OpenCL code to Verilog/VHDL, keeping the same performance. In case you don’t trust a high-level language, we have you covered.
  • Optionally you can get both the code and a technical report with a detailed explanation of how we did it. So you can learn from this and modify the code yourself.
  • You get free advice on when (and not) to use OpenCL for FPGAs.

There are three ways to get in contact quickly:

Contact - call call: +31 854865760 (European office hours)

Contact - mail e-mail: contact@streamhpc.com

Fill in this form – mention when you want to be called back (possible outside normal office hours):

[contact_form]

Want to read more?

We wrote about OpenCL-on-FPGAs on our blog in the previous years.

N-Queens project from over 10 years ago

Why you should just delve into porting difficult puzzles using the GPU, to learn GPGPU-languages like CUDA, HIP, SYCL, Metal or OpenCL. And if you did not pick one, why not N-Queens? N-Queens is a truly fun puzzle to work on, and I am looking forward to learning about better approaches via the comments.

We love it when junior applicants have a personal project to show, even if it’s unfinished. As it can be scary to share such unfinished project, I’ll go first.

Introduction in 2023

Everybody who starts in GPGPU, has this moment that they feel great about the progress and speedup, but then suddenly get totally overwhelmed by the endless paths to more optimizations. And ofcourse 90% of the potential optimizations don’t work well – it takes many years of experience (and mentors in the team) to excel at it. This was also a main reason why I like GPGPU so much: it remains difficult for a long time, and it never bores. My personal project where I had this overwhelmed+underwhelmed feeling, was with N-Queens – till then I could solve the problems in front of me.

I worked on this backtracking problem as a personal fun-project in the early days of the company (2011?), and decided to blog about it in 2016. But before publishing I thought the story was not ready to be shared, as I changed the way I coded, learned so many more optimization techniques, and (like many programmers) thought the code needed a full rewrite. Meanwhile I had to focus much more on building the company, and also my colleagues got better at GPGPU-coding than me – this didn’t change in the years after, and I’m the dumbest coder in the room now.

Today I decided to just share what I wrote down in 2011 and 2016, and for now focus on fixing the text and links. As the code was written in Aparapi and not pure OpenCL, it would take some good effort to make it available – I decided not to do that, to prevent postponing it even further. Luckily somebody on this same planet had about the same approaches as I had (plus more), and actually finished the implementation – scroll down to the end, if you don’t care about approaches and just want the code.

Note that when I worked on the problem, I used an AMD Radeon GPU and OpenCL. Tools from AMD were hardly there, so you might find a remark that did not age well.

Introduction in 2016

What do 1, 0, 0, 2, 10, 4, 40, 92, 352, 724, 2680, 14200, 73712, 365596, 2279184, 14772512, 95815104, 666090624, 4968057848, 39029188884, 314666222712, 2691008701644, 24233937684440, 227514171973736, 2207893435808352 and 22317699616364044 have to do with each other? They are the first 26 solutions of the N-Queens problem. Even if you are not interested in GPGPU (OpenCL, CUDA), this article should give you some background of this interesting puzzle.

An existing N-Queen implementation in OpenCL took N=17 took 2.89 seconds on my GPU, while Nvidia-hardware took half. I knew it did not use the full potential of the used GPU, because bitcoin-mining dropped to 55% and not to 0%. 🙂 I only had to find those optimizations be redoing the port from another angle.

This article was written while I programmed (as a journal), so you see which questions I asked myself to get to the solution. I hope this also gives some insight on how I work and the hard part of the job is that most of the energy goes into resultless preparations.

Continue reading “N-Queens project from over 10 years ago”

NVIDIA ended their support for OpenCL in 2012

If you are looking for the samples in one zip-file, scroll down. The removed OpenCL-PDFs are also available for download.

This sentence “NVIDIA’s Industry-Leading Support For OpenCL” was proudly used on NVIDIA’s OpenCL page last year. It seems that NVIDIA saw a great future for OpenCL on their GPUs. But when CUDA began borrowing the idea of using LLVM for compiling kernels, NVIDIA’s support for OpenCL slowly started to fade instead. Since with LLVM CUDA-kernels can be loaded in OpenCL and vice versa, this could have brought the two techniques more together.

What is the cause for this decreased support for OpenCL? Did they suddenly got aware LLVM would decrease any advantage of CUDA over OpenCL and therefore decreased support for OpenCL? Or did they decide so long ago, as their last OpenCL-conformant product on Windows is from July 2010? We cannot be sure, but we do know NVIDIA does not have an official statement on the matter.

The latest action demonstrating NVIDIA’s reduced support of OpenCL is the absence of the samples in their GPGPU-SDK. NVIDIA removed them without notice or clear statement on their position on OpenCL. Therefore we decided to start a petition to get these OpenCL samples back. The only official statement on the removal of the samples was on LinkedIn:

All of our OpenCL code samples are available at http://developer.nvidia.com/opencl, and the latest versions all work on the new Kepler GPUs.
They are released as a separate download because developers using OpenCL don’t need the rest of the CUDA Toolkit, which is getting to be quite large.
Sorry if this caused any alarm, we’re just trying to make life a little easier for OpenCL developers.

Best regards,

Will.

William Ramey
Sr. Product Manager, GPU Computing
NVIDIA Corporation

Continue reading “NVIDIA ended their support for OpenCL in 2012”

Altera published their OpenCL-on-FPGA optimization guide

Altera-doc

Altera has just released their optimisation guide for OpenCL-on-FPGAs. It does not go into the howto’s of OpenCL, but assumes you have knowledge of the technology. Niether does it provide any information on the basics of Altera’s Stratix V or other FPGA.

It is the first public optimisation document, so it is appreciated to send feedback directly. Not aware what OpenCL can do on an FPGA? Watch the below video.

https://www.youtube.com/watch?v=p25CVFMc-dk

Subjects

The following subjects and optimisation tricks are discussed:

  • FPGA Overview
  • Pipelines
  • Good Design Practices
  • Avoid Pointer Aliasing
  • Avoid Expensive Functions
  • Avoid Work-Item ID-Dependent Backward Branching
  • Aligned Memory Allocation
  • Ensure 4-Byte Alignment for All Data Structures
  • Maintain Similar Structures for Vector Type Elements
  • Optimization of Data Processing Efficiency
  • Specify a Maximum Work-Group Size or a Required Work-Group Size
  • Loop Unrolling
  • Resource Sharing
  • Kernel Vectorization
  • Multiple Compute Units
  • Combination of Compute Unit Replication and Kernel SIMD Vectorization
  • Resource-Driven Optimization
  • Floating-Point Operations
  • Optimization of Memory Access Efficiency
  • General Guidelines on Optimizing Memory Accesses
  • Optimize Global Memory Accesses
  • Perform Kernel Computations Using Constant, Local or Private Memory
  • Single Work-Item Execution

Carefully compare these with CPU and GPU optimisation guides to be able to write more generic OpenCL code.

Download

You can download the document here.

If you have any question on OpenCL-on-FPGAs, OpenCL, generic optimisations or Altera FPGAs, feel welcomed to contact us.

The 12 latest Twitter Poll Results of 2018

Via our Twitter channel we have various polls. Not always have we shared the full background of these polls, so we’ve taken the polls of the past half year and put them here. The first half of the year there were no polls, in case you wanted to know.

As inclusive polls are not focused (and thus difficult to answer), most polls are incomplete by design. Still insights can be given. Or comments given.

Below’s polls have given us insight and we hope they give you insights too how our industry is developing. It’s sorted on date from oldest first.

It was very interesting that the percentage of votes per choice did not change much after 30 votes. Even when it was retweeted by a large account, opinions had the same distribution.

Is HIP (a clone of CUDA) an option?

Continue reading “The 12 latest Twitter Poll Results of 2018”

Privacy Policy

Who we are

We are a group of companies, based in the Netherlands, Hungary and Spain. We help our customers get their code run fast by optimizing the computations and using accelerators. We do this since 2010.

Comments

When visitors leave comments on the site we collect the data shown in the comments form, and also the visitor’s IP address and browser user agent string to help spam detection.

An anonymised string created from your email address (also called a hash) may be provided to the Gravatar service to see if you are using it. The Gravatar service Privacy Policy is available here: https://automattic.com/privacy/. After approval of your comment, your profile picture is visible to the public in the context of your comment.

Forms

Form-data is sent to self-hosted software and is not read by any third-party party.

Tracking

We use anonymized tracking to find out:

  • Which pages are visited how often
  • Which subjects are popular
  • Which pages are clicked through
  • From which countries or states the visitors are

During a visit/session, you get a random ID.

Cookies

If you leave a comment on our site you may opt in to saving your name, email address and website in cookies. These are for your convenience so that you do not have to fill in your details again when you leave another comment. These cookies will last for one year.

Tracking cookies last for 24 hours.

Embedded content from other websites

Articles on this site may include embedded content (e.g. videos, images, articles, etc.). Embedded content from other websites behaves in the exact same way as if the visitor has visited the other website.

These websites may collect data about you, use cookies, embed additional third-party tracking, and monitor your interaction with that embedded content, including tracking your interaction with the embedded content if you have an account and are logged in to that website.

Who we share your data with

None of the data is shared with any third party. Marketing reports don’t contain any personal data.

How long we retain your data

If you leave a comment, the comment and its metadata are retained indefinitely. This is so we can recognize and approve any follow-up comments automatically instead of holding them in a moderation queue.

Anonymous tracking data is not thrown away, to find trends over the years.

What rights you have over your data

If you have left comments, you can request to receive an exported file of the personal data we hold about you, including any data you have provided to us. You can also request that we erase any personal data we hold about you. This does not include any data we are obliged to keep for administrative, legal, or security purposes.

Where your data is sent

Visitor comments and forms are checked through an automated spam detection service, ReCAPTCHA and Akismet.

Reporting problems

We are not in the business of monetizing user data, and believe in finding new customers through content.

As software and plugins change after updates, we are sometimes surprised that more is collected than we configured.

If anything is incorrect or not legal, please email to privacy@streamhpc.com. If you have generic questions, go to the contact page or email to info@streamhpc.com.

Altera

Altera-logoAltera makes FPGAs:

  • Cyclone (smallest, low-end)
  • Arria (‘V’ and ’10’ series available now)
  • Stratix (‘V’ series available now, ’10’ in 2017)

Each FPGA can optionally have a built-in ARM CPU (SoC). With the acquisition of Altera by Intel, FPGAs will also get combined with a Xeon.

For HPC the Arria 10 is currently used most.

For low-latency all sizes

Altera FPGA board

ALTERA CORPORATION LOGO

[infobox type=”information”]

Need a FPGA OpenCL programmer? Hire us!

[/infobox]

A LOT OF NEW INFO HAS GONE PUBLIC. STAY TUNED WHILE WE COLLECT THE DATA.

Altera has an Early Access Program for existing customers/partners to their OpenCL SDK. Another route is contacting Bittware.

There is not much information on the exact hardware and software. It could exist of:

The compiler does not generate VHDL or alike, but a FPGA-layout. So no VHDL-knowledge is needed, only OpenCL. Know FPGA dev-boards are quite pricey generally.
For more information, read the sheets from a recent keynote by Deshanand Singh.

 

Demo: cartoonizer on an Altera Arria 10 FPGA

It takes quite some effort to program FPGAs using VHDL or Verilog. Since several years Intel/Altera has OpenCL-drivers, with the goal to reduce this effort. OpenCL-on-FPGAs reduced the required effort to a quarter of the time, while also making it easier to alter the specifications during the project. Exactly the latter was very beneficiary when creating the demo, as the to-be-solved problem was vaguely defined. The goal was to make a video look like a cartoon using image filters. We soon found out that “cartoonized” is a vague description, and it took several iterations to get the right balance between blur, color-reduction and edge-detection. Continue reading “Demo: cartoonizer on an Altera Arria 10 FPGA”

OpenCL Videos of AMD’s AFDS 2012

AFDS was full of talks on OpenCL. You missed them, just like me? Then you will be happy that they put many videos on Youtube!

Enjoy watching! As all videos are around 40 minutes, it is best to take a full day for watching them all. The first part is on openCL itself, second is on tools, third on OpenCL usages, fourth on other subjects.

Continue reading “OpenCL Videos of AMD’s AFDS 2012”

Mobile Processor OpenCL drivers (Q3 2013) + rating

saveFor your convenience: an overview of all ARM-GPUs and their driver-availability. Please let me know if something is missing.

I’ve added a rating, to friendly push the vendors to get to at least an 7. Vendors can contact me, if they think the rating does not reflect reality.

ZiiLabs

SDK-page@StreamHPC

Drivers can be delivered by Creative, when you pledge to order ZMS-40 processors. Mail us for a contact at Creative. Minimum order size is unknown.

This device can therefore only be used for custom devices.

[usr=4]

Vivante

SDK-page@StreamHPC

They are found on public devices. Android-drivers that work on FreeScale processors are openly available and can be found here.

[usr=8]

Even though the processors are not that powerfull, Vivante/FreeScale offers the best support.

Qualcomm

SDK-page@StreamHPC

Drivers are not shipped on devices, according various sources. Android-drivers are in the SDK-drivers though, which can be found here.

[usr=7]

Rating will go up, when drivers are publicly shipped on phones/tablets.

ARM MALI

Samsung SDK-page@StreamHPC

There are lots of problems around the drivers for Exynos, which only seem to work on the Arndale-board when the LCD is also ordered.Android-drivers can be downloaded here.

[usr=5]

All is in execution – half-baked drivers don’t do it. It is unclear whom to blame, but it certainly has had influence on creating a new version of Exynos 5, the octa.

Imagination Technologies

SDK-page@StreamHPC

TI only delivers drivers under NDA. Samsung has one board coming up with OpenCL 1.1 EP drivers.

[usr=5]

Rating will go up, when drivers from TI come available without obstacles, or Samsung delivers what they failed to do with the previous Exynos 5.

Exciting times coming up

Mostly because of a power-struggle between Google and the GPU-vendors, there is some hesitation to ship OpenCL drivers on phones and tablets. Unfortunately, Google’s answer to OpenCL RenderScript Compute, does not provide the needs wanted by developers. Google’s official answer is that it does not want fragmentation nor code that is optimised for a certain GPU. The interpreted answer is that Google wants vendor-lockin and therefore blocks the standard. Whatever the reason is, OpenCL is used as sword to show teeth who has a say about the future of Android – only the advertisement-company Google or also the group of named processor-makers and various phone/tablet-vendors?

In H2 2014 Nvidia will ship CUDA-drivers with their Tegra 5 GPUs, making the soap complete.

There are rumours Apple will intervene and will make OpenCL available on iOS. This would explain why there is put so much effort in showing OpenCL-results by Imagination and Qualcomm

And always keep a close watch on POCL, the vendor-independent OpenCL implementation.

[bordered_box border_color=” background_color=’#C1DAD6′]

Need a programmer for any of the above devices? Hire us!

[/bordered_box]

The 13 application areas where OpenCL and CUDA can be used

visitekaartje-achter-2013-V
Did you find your specialism in the list? The formula is the easiest introduction to GPGPU I could think of, including the need of auto-tuning.

Which algorithms map is best to which accelerator? In other words: What kind of algorithms are faster when using accelerators and OpenCL/CUDA?

Professor Wu Feng and his group from VirginiaTech took a close look at which types of algorithms were a good fit for vector-processors. This resulted in a document: “The 13 (computational) dwarves of OpenCL” (2011). It became an important document here in StreamHPC, as it gave a good starting point for investigating new problem spaces.

The document is inspired by Phil Colella, who identified seven numerical methods that are important for science and engineering. He named “dwarves” these algorithmic methods. With 6 more application areas in which GPUs and other vector-accelerated processors did well, the list was completed.

As a funny side-note, in Brothers Grimm’s “Snow White” there were 7 dwarves and in Tolkien’s “The Hobbit” there were 13.

Continue reading “The 13 application areas where OpenCL and CUDA can be used”

OpenCL at SC13

Unluckily I am not at SC13, so I’ll enjoy from a distance. Luckily I don’t miss one of the most beautiful 15km runs in the Netherlands. When there is more news, I’ll add to this post – below is mostly taken from the Khronos website and the SC2013 website.

OpenCL Booth

Meet members of the OpenCL workgroup in Booth #4137 to get hot news from the OpenCL experts and an OpenCL reference card. Also learn about next year’s plans for IWOCL (International Workshop on OpenCL). Be sure not to miss the BOF on OpenCL 2.0 (in bold in the schedule).

Schedule

Sunday, 17 November

8:30 – 17:00 Tutorials Structured Parallel Programming with Patterns Michael McCool, James Reinders, Arch Robison, Michael Hebenstreit 302
8:30 – 17:00 Tutorials OpenACC: Productive, Portable Performance on Hybrid Systems Using High-Level Compilers and Tools Luiz DeRose, Alistair Hart, Heidi Poxon, James Beyer 401

Monday, 18 November

8:30 – 17:00 Tutorials OpenCL: A Hands-On Introduction Tim Mattson, Alice Koniges, Simon McIntosh-Smith 403

Tuesday, 19 November

11:00 & 15:00 ACM Student Research Competition Poster Reception Introduction to OpenCL on FPGAs AcceleWare Altera booth
17:15 – 19:00 Short presentation local_malloc: malloc for OpenCL __local memory [Poster] John Kloosterman Mile High Pre-Function

Wednesday, 20 November

11:00 & 15:00 Short presentation Introduction to OpenCL on FPGAs AcceleWare Altera booth
16:00 Case Study Accelerating Full Waveform Inversion via OpenCL on AMD GPUs AcceleWare AMD booth
17:30 – 19:00 BOF OpenCL: Version 2.0 and Beyond Tim Mattson, Ben Bergen, Simon McIntosh-Smith 405/406/407

Thursday, 21 November

11:30 – 12:00 Exhibitor Forum OpenCL 2.0: Unlocking the Power of Your Heterogeneous Platform Tim Mattson 501/502
10:30 – 11:00 Papers General Transformations for GPU Execution of Tree Traversals Michael Goldfarb, Youngjoon Jo, Milind Kulkarni 205/207
11:00 – 11:30 Papers A Large-Scale Cross-Architecture Evaluation of Thread-Coarsening Alberto Magni, Christophe Dubach, Michael F.P. O’Boyle 205/207
11:30 – 12:00 Papers Semi-Automatic Restructuring of Offloadable Tasks for Many-Core Accelerators Nishkam Ravi, Yi Yang, Tao Bao, Srimat Chakradhar 205/207
11:30 – 12:00 Short presentation Introduction to OpenCL on FPGAs AcceleWare Altera booth
16:00 – 16:30 Paper Accelerating Sparse Matrix-Vector Multiplication on GPUs using Bit-Representation-Optimized Schemes Wai Teng Tang, Wen Jun Tan, Rajarshi Ray, Yi Wen Wong, Weiguang Chen, Shyh-hao Kuo, Rick Siow Mong Goh, Stephen John Turner, Weng-Fai Wong 401/402/403

There are other interesting SC13-events around OpenCL, so be sure to check the schedule carefully.

image001

Khronos Members Exhibiting at SC13

Complete floor plan is available here.

Below links are updated from the company-homepages to special SC13 landing-pages.

  • Altera Corporation – Booth 4332.
  • AMD – Booth 1113. See this list for a schedule.
  • ARM – Booth 3141.
    • OpenCL on MALI demos at their booth
    • AccelerEyes (#310) showcases ArrayFire (with OpenCL-on-ARM backend) running on Mali T604.
    • Many more – just look for it.
  • Khronos OpenCL – Booth 4137.
  • IBM – Booth 126, 2713. OpenCL on IBM PowerLinux 7R2 and IBM Flex System. Also collaboration with Altera.
  • Intel – Booth 2501, 2701. OpenCL on their CPUs and GPUs.
  • NEC Corporation – 3109. Vector supercomputer – Khronos probably hints to OpenCL running on this machine – see for yourself.
  • NVIDIA – Booth 613. Have fun hearing them say: “Do you use CUDA or are you locked-in to OpenCL?” and variations on this.
  • Texas Instruments – Booth 3725. OpenCL on DSP demo.
  • XilinX – To schedule a private appointment (for an OpenCL-demo) visit Xilinx at the Convey booth (#3547) or the Alpha Data booth (#4237).

The floorplan can be downloaded here or here (mirrored on 15-Nov).

At SC13 and saw great OpenCL demos or news?

Share this info and photos in the comments, for others to pick up.

Big Data

Big_Bang_Data_exhibit_at_CCCB_17Big data is a term for data so large or complex that traditional processing applications are inadequate. Challenges include:

  • capture, data-curation & data-management,
  • analysis, search & querying,
  • sharing, storage & transfer,
  • visualization, and
  • information privacy.

The term often refers simply to the use of predictive analytics or certain other advanced methods to extract value from data, and seldom to a particular size of data set. Accuracy in big data may lead to more confident decision making, and better decisions can result in greater operational efficiency, cost reduction and reduced risk.

At StreamHPC we’re focused on optimizing (predictive) analytic and data-handling software, as these tend to be slow. We solved Big Data problems at two aspects: real-time pre-processing (filtering, structuring, etc) and analytics (including in-memory search on a GPU).

GPU-developers, work for StreamHPC and friends

applyBored at work? Go start working for one of the anti-boring GPU-expert companies: StreamHPC (Netherlands, EU), Appilo (Israel) or AccelerEyes (Georgia, US).

We all look for people who know how to code GPUs. Experience is key, so you need to have at least one year of experience in GPU-programming.

Submit your CV now and don’t give up on your dream leaving behind the boring times.

StreamHPC

Amsterdam-based StreamHPC is a young startup with a typical Dutch, open atmosphere (gezellig). Projects are always different from the previous as we work in various fields. Most work is in parallisation and OpenCL-coding. We are quite demanding of your knowledge on the various hardware-platforms OpenCL works on (GPUs, mobile GPUs, array-processors, DSPs, FPGAs).

See the jobs-page for more information and how to apply.

Appilo

North Israel based Appilo is seeking GPU-programmers for project-basis contracts. Depending on the project, most of the work could usually be performed remotely.

Use the mail on the contact-page at Appilo to send your CV.

AccelerEyes

Atlanta-based AccelerEyes delivers products which are used to accelerate C, C++, and Fortran codes on CUDA GPUs and OpenCL devices. They look for people who believe in their products and want to help make them better.

See their jobs-page for more information and how to apply.

StreamComputing exists 5 years!

5yearsSCIn January 2010 I created the first steps of StreamComputing (redacted: rebranded to StreamHPC in 2017), by registering the website and writing a hello-world article. About 4 months of preparations and paperwork later the freelance-company was registered. Then 5 years later it got turned into a small company with still the strong focus on OpenCL, but with more employees and more customers.

I would like to thank the following people:

  • My parents and grand-mother for (financially) supporting me, even though they did not always understand why I was taking all those risks.
  • My friends, for understanding I needed to work in the weekends and evenings.
  • My good friend Laura for supporting me during the hard times of 2011 and 2012.
  • My girlfriend Elena for always being there for me.
  • My colleagues and OpenCL-experts Anca, Teemu and Oscar, who have done the real work the past year.
  • My customers for believing in OpenCL and trusting StreamComputing.

Without them, the company would never even existed. Thank you! Continue reading “StreamComputing exists 5 years!”