GPU Day 2026

At Stream HPC, we enjoy opportunities to connect with the HPC and accelerator community, exchange ideas, and learn from engineers and researchers working across the GPU ecosystem. Later this month, several members of our team will be attending GPU Day 2026 in Budapest, Hungary.

Now in its 16th edition, GPU Day has become an established annual conference focused on massively parallel computing in science and industrial applications. Organized by the Wigner Scientific Computation Laboratory, the event brings together researchers, developers, students, and industry experts to discuss technologies spanning GPUs, compilers, machine learning, visualization, and emerging accelerator platforms.

For Stream HPC, events like GPU Day are a natural fit. We work with clients across a wide range of hardware platforms and software ecosystems, helping optimize and accelerate applications where performance matters. Conferences like this provide an opportunity to share practical experiences from real-world projects and contribute to the broader HPC community.

This year, two Stream HPC engineers will present their work at the conference:

1 – Manual AMDGCN Assembly Analysis & Optimization

Presenter: Nara Prasetya

Performance optimization on GPUs often starts with profiling and identifying memory bottlenecks. But sometimes performance limitations originate elsewhere, such as in compiler decisions and the generated machine code itself.

In this presentation, Nara explores optimization techniques that go beyond conventional profiling workflows. By analyzing AMDGCN assembly directly, reducing register pressure, and investigating the impact of compiler changes, the talk demonstrates how manual low-level analysis can recover performance that would otherwise remain hidden behind generated code.

2 – Evaluating the AdaptiveCpp Single-Pass (SSCP) SYCL compiler for GROMACS on Modern AMD Accelerators

Presenter: Bálint Soproni

SYCL supports multiple implementation strategies, including both Single-Source Multiple Compiler Passes (SMCP) and Single-Source Single Compiler Pass (SSCP) approaches. AdaptiveCpp’s SSCP JIT compiler has previously shown promising performance gains, but its impact on large production applications has remained relatively unexplored.

Bálint presents work evaluating AdaptiveCpp’s SSCP compiler using GROMACS, a widely used molecular dynamics package with a mature SYCL backend targeting AMD GPUs. The results show performance improvements in the range of 10–25% for certain workload configurations and increased peak throughput across modern AMD accelerators, demonstrating that SSCP advantages can extend beyond smaller benchmark applications into real production workloads.

GPU Day is only a few weeks away, and we are looking forward to meeting fellow developers, researchers, and industry colleagues in Budapest. If you’ll be attending, come say hello. We’d be happy to talk GPUs, performance optimization, SYCL, compilers, and HPC.