OpenCL CPU/GPU Training (4 days)
21 Aug 2017
If you need to learn solid GPU programming, this is the training you should attend. This is a public training with trainees from various companies – get in contact if you want to learn more about our in-company trainings.
Schedule
The training consists of two sessions of two days each: Monday and Tuesday, then Thursday and Friday, with Wednesday as a resting day. Around 40% of the time is spent on lectures, 50% on lab-sessions and 10% on discussions.
Important:
- The training is guaranteed to take place. If you are the only one, you’ll simply get personal training.
- The schedule below is indicative when it comes to lab-sessions: some lab-sessions can be (partly) skipped or replaced if time runs short.
Day 1: OpenCL Foundations (For beginners only)
This is close to our standard OpenCL crash course. We start with the basic concepts, write our first OpenCL program, discuss the architectures and the difficulties of GPU-programming, compare OpenCL to CUDA and C++, and end with writing simple code that runs on a laptop (CPU or GPU).
- OpenCL model
- OpenCL language
- Memory objects
- General hardware overview
- Task-parallelism and data-parallelism
- Mapping OpenCL to CPUs and GPUs
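As a rough illustration of the data-parallel idea in the list above, the sketch below emulates a one-dimensional range of work-items with a plain C++ loop. It is not OpenCL code and not part of the course material: `scale_kernel` and `run_scale` are hypothetical names, and the loop index stands in for what `get_global_id(0)` would return inside an OpenCL kernel.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical "kernel": the work done by a single work-item.
// In OpenCL C, gid would be obtained with get_global_id(0).
inline void scale_kernel(std::size_t gid, const float* in, float* out, float gain) {
    out[gid] = in[gid] * gain;
}

// CPU emulation of launching a 1D range of work-items: one call per index.
// No iteration depends on another, which is what makes this data-parallel.
inline std::vector<float> run_scale(const std::vector<float>& in, float gain) {
    std::vector<float> out(in.size());
    for (std::size_t gid = 0; gid < in.size(); ++gid) {
        scale_kernel(gid, in.data(), out.data(), gain);
    }
    return out;
}
```

On a GPU the iterations of this loop would be spread over many hardware threads; on a CPU they are typically spread over cores and vector lanes.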
The lab-sessions are all done on your own laptops.
Lab-sessions:
- Clinfo
- ColorBalance
Day 2: Optimise a program from scratch
During the day, we will increase the level of requirements and touch all important aspects of OpenCL-programming.
- As we have several GPU-servers available for developing, we can provide you with login-credentials, a git-account and a short how-to for using the extra GPUs from NVidia and AMD, and optionally Intel. This way you can use different graphics cards to find out which optimisations work and which don’t.
Optimisations we discuss during this day:
- Host-code
- Data-flow
- Memory handling
- Data transfer speed increase
- Memory alignments
- Scheduling
- Parallelism increase
- Latency reduction
- The most important kernel-optimisations from the different vendors
Lab-sessions:
- Matrix-multiplication
- Convolution
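When optimising a kernel such as matrix-multiplication, it usually helps to keep a slow but obviously correct reference to compare against. The following is a minimal sketch of such a reference in plain C++, assuming row-major storage; `matmul_ref` and `all_close` are hypothetical helper names, not taken from the lab material.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Naive reference: C = A * B, all matrices row-major.
// A is m x k, B is k x n, C is m x n.
inline std::vector<float> matmul_ref(const std::vector<float>& a,
                                     const std::vector<float>& b,
                                     std::size_t m, std::size_t k, std::size_t n) {
    std::vector<float> c(m * n, 0.0f);
    for (std::size_t i = 0; i < m; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            float acc = 0.0f;
            for (std::size_t p = 0; p < k; ++p) {
                acc += a[i * k + p] * b[p * n + j];
            }
            c[i * n + j] = acc;
        }
    }
    return c;
}

// Element-wise comparison with an absolute tolerance, since an optimised
// kernel may sum in a different order and round slightly differently.
inline bool all_close(const std::vector<float>& x,
                      const std::vector<float>& y, float tol) {
    if (x.size() != y.size()) return false;
    for (std::size_t i = 0; i < x.size(); ++i) {
        if (std::fabs(x[i] - y[i]) > tol) return false;
    }
    return true;
}
```

The output of each optimised version can then be checked with `all_close` against `matmul_ref` before its timing is trusted.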
Day 3: Pause day
We found that most people need to have a break after two days, to let the information sink in.
For the Amsterdam training we offer an optional tour through the city in the afternoon.
Day 4: Memory optimisations
We start with discussing what we learned and did not learn. We focus on a few subjects of day 2 that were not clear enough.
Today we focus on creating and optimising more advanced software:
- Histogram
- Contrast Stretching
- Frame correlation
As each of these takes about 3 hours, we try to start on day 2 already, or continue on day 5 if needed.
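To give an idea of the first two subjects, here is a minimal CPU sketch in C++ for 8-bit greyscale data. It assumes one common definition of contrast stretching (a linear min-max stretch to the range 0 to 255); the exact variants used in the labs may differ, and the function names are hypothetical.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Count how often each 8-bit intensity occurs.
inline std::array<std::uint32_t, 256> histogram(const std::vector<std::uint8_t>& img) {
    std::array<std::uint32_t, 256> bins{};  // zero-initialised
    for (std::uint8_t p : img) {
        ++bins[p];
    }
    return bins;
}

// Linear min-max stretch (assumed definition): map [lo, hi] onto [0, 255].
inline std::vector<std::uint8_t> contrast_stretch(const std::vector<std::uint8_t>& img) {
    if (img.empty()) return {};
    auto mm = std::minmax_element(img.begin(), img.end());
    const int lo = *mm.first;
    const int hi = *mm.second;
    if (lo == hi) return img;  // flat image: nothing to stretch
    std::vector<std::uint8_t> out(img.size());
    for (std::size_t i = 0; i < img.size(); ++i) {
        out[i] = static_cast<std::uint8_t>((img[i] - lo) * 255 / (hi - lo));
    }
    return out;
}
```

The histogram is the harder one to parallelise, because many work-items may want to increment the same bin at the same time; the stretch itself is independent per pixel once the minimum and maximum are known.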
Day 5: Tools, special subjects and final project
There are various tools you need to understand to get code that runs well on GPUs. This day you will learn to use various vendor-provided and open source tools to help you analyse your code.
- We have used tools from one vendor and are updating the materials to use open source tools that work on various
The tools are discussed and used along the following four subjects:
- Software correctness: data-races.
- Profiling: timing and finding hot-spots.
- Reporting: let tools create useful reports.
- Debugging: finding bugs and learning what actually happens.
Special subjects:
- Splitting work over CPU&GPU: Running different kernels on CPU and GPU, to make maximum use of the whole computer.
- GL-CL interop: Understanding how interoperability with OpenGL works. With this the results can be shown on the screen with minimal latency.
- Optimising data-throughput when using multiple kernels.
Lab-sessions:
- Fixing non-optimal and broken code
For the final project you will need to use all you’ve learnt and try to write the fastest code in the class.
Prerequisites
We will send a questionnaire to understand the needs of each trainee. For larger groups, we also send a separate questionnaire to the representative.
Laptop
Attendees need to bring their own laptops for the lab sessions. The only requirement is that the laptop has an OpenCL-capable CPU or GPU and that the OpenCL drivers are correctly installed. A complete list of OpenCL-compliant devices can be found here. Laptops need to have the following software installed:
- cmake 3.1 or higher: lab-sessions are provided as cmake projects, so almost all IDEs are supported.
- An IDE or text-editor with coding-support.
- One or more OpenCL SDKs, for each OpenCL-device in the computer.
- A C/C++ compiler suite. Examples of supported suites are Microsoft Visual Studio, Apple Xcode and GNU GCC/G++. We will send a small project in advance, which can be used to test compilers.
- ssh/putty: optional. Needed when working on StreamHPC’s GPU servers.
- git: optional. Needed when easy transfer of lab-work between several computers/servers is required. Lab-sessions can also be downloaded as zip-files.
Skills
Attendees are required to have intermediate programming experience and good C/C++ knowledge. This means that you should at least be able to write an application in C/C++ from scratch, debug it with GDB and be very (!) comfortable working with pointers.
Info Amsterdam training
Costs
The costs are €3000 for the whole week. Coffee, tea, snacks, fruit, lunch, Friday-drinks and the Amsterdam tour are included in the price. Extra days for personal training and consultancy are excluded – please ask us for a separate quote.
Nearby Hotels
Within walking distance there are various hotels. In random order:
We have good experience with the Amsterdam ID Aparthotel, which is nearby, affordable and provides a kitchen in each room. Holiday Inn Express is next to the noisy railway station, so select it only as a last resort. There are three restaurants in the area, but many more in the city centre.
If you prefer a hotel in the city centre (central station is only 6 minutes by train) or want us to book one of the hotels, give us a call to provide your preferences and payment details, and we’ll arrange everything for you. The cost for full handling is €100.
Reserve your spot today!
Contact us today to reserve your spot.
Email: info@streamhpc.com
Phone: +31 854865760