NVIDIA’s answer to FirePro S9000: the TESLA K20

Two months ago I wrote about the FirePro S9000, AMD’s answer to the K10, and was already looking forward to the K20. In the gaming world it hardly matters which card you buy to get good performance, but in the compute world it does. AMD presented 3.230 TFLOPS with 6GB of memory, and now we are going to see what the K20 can do.

The K20 is very different from its predecessor, the K10. The biggest differences are its much higher double precision capability and the fact that it is more powerful while using only a single GPU. To keep power usage low, it is clocked at only 705MHz, compared to 1GHz for the K10. I could not find information on the memory bandwidth.

ECC is turned on by default, which is why the card is presented as having 5GB. There is no information yet on whether this card also has ECC on the local memories and caches.

Performance comparison

As both cards are single-GPU, the comparison is easy this time.


| Functionality | TESLA K20 (K20X) | FirePro S9000 |
|---|---|---|
| GPU-Processor count | 1 | 1 |
| Architecture | Kepler GK110 | Graphics Core Next |
| Memory per GPU-processor | 6 GB GDDR5, 5 GB w/ ECC | 6 GB GDDR5, 5 GB w/ ECC |
| Memory bandwidth per GPU-processor | 200 GB/s | 264 GB/s |
| Performance (single precision, per GPU-proc.) | 3.52 TFLOPS (3.95 TFLOPS) | 3.230 TFLOPS |
| Performance (double precision, per GPU-proc.) | 1.17 TFLOPS (1.31 TFLOPS) | 0.806 TFLOPS |
| Max power usage per GPU-processor | 225 Watt (235 Watt) | 225 Watt |
| Greenness (SP) | 15.6 GFLOPS/Watt (16.8 GFLOPS/Watt) | 14.35 GFLOPS/Watt |
| Bus Interface | PCIe 3.0 x16 | PCIe 3.0 x16 |
| Price (per GPU-processor) | $3199 (?) | $2500 |
| Price per GFLOPS (SP) | $0.90 | $0.77 |
| Price per GFLOPS (DP) | $2.73 | $3.10 |
| Cooling | Passive | Passive |

Sources for this table are below.
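
The derived rows in the table (greenness and price per GFLOPS) follow directly from the raw specifications. A minimal Python sketch of that arithmetic is shown here; the dictionary layout and the function name `derived_metrics` are made up for this illustration, and the K20X price is left out because the table does not list one.

```python
# Spec values copied from the comparison table; the K20 price is the
# uncertain "$3199 (?)" figure and no K20X price is listed.
cards = {
    "Tesla K20":     {"sp_tflops": 3.52,  "dp_tflops": 1.17,  "watt": 225, "price_usd": 3199},
    "Tesla K20X":    {"sp_tflops": 3.95,  "dp_tflops": 1.31,  "watt": 235, "price_usd": None},
    "FirePro S9000": {"sp_tflops": 3.230, "dp_tflops": 0.806, "watt": 225, "price_usd": 2500},
}


def derived_metrics(spec):
    """Return GFLOPS/Watt (SP) and, when a price is known, USD per GFLOPS."""
    sp_gflops = spec["sp_tflops"] * 1000
    dp_gflops = spec["dp_tflops"] * 1000
    metrics = {"gflops_per_watt_sp": sp_gflops / spec["watt"]}
    if spec["price_usd"] is not None:
        metrics["usd_per_gflops_sp"] = spec["price_usd"] / sp_gflops
        metrics["usd_per_gflops_dp"] = spec["price_usd"] / dp_gflops
    return metrics


for name, spec in cards.items():
    rounded = {key: round(value, 2) for key, value in derived_metrics(spec).items()}
    print(name, rounded)
```

Rounded to two decimals this gives $0.91 per GFLOPS (SP) for the K20 and 14.36 GFLOPS/Watt for the S9000, so the table values of $0.90 and 14.35 appear to be truncated rather than rounded. The other derived values match.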

Conclusion

The most important improvement is found in the double precision performance: from 0.095 TFLOPS of the K10 to a whopping 1.170 TFLOPS. The S9000 cannot compete here.

The memory bandwidth is not known yet.

Tesla K20 is a clear winner in these categories:

  • GFLOPS/Watt (single precision)
  • Single precision performance with a single GPU
  • Double precision performance with a single GPU

The only disadvantage is that the price has increased a lot, because NVIDIA charges a premium for its double precision cores. As silicon is silicon, I am not sure why they do this, though marketing-wise it is a smart move.

Where AMD always had the advantage in double precision, NVIDIA now kicks in hard and makes it very difficult for AMD to come up with an answer to this GPU. Know that a dual-GPU version of the K20 is the logical next step, and that the 705MHz clock leaves room to overclock it by 40%, if you want to take the risk. This raises the bar for both Intel and AMD to come up with a faster accelerator under 225 Watt for the rest of 2013. I am looking forward to seeing their answers.
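
The 40% overclocking headroom mentioned above can be put into numbers with a back-of-the-envelope sketch. It assumes peak throughput scales linearly with the core clock, which only holds for compute-bound code, and it ignores that power draw will rise above the 225 Watt rating. The function name `scaled_peak_tflops` is made up for this illustration.

```python
def scaled_peak_tflops(base_tflops, base_mhz, target_mhz):
    """Estimate peak TFLOPS at a new clock, assuming linear scaling with clock."""
    return base_tflops * target_mhz / base_mhz


base_mhz = 705                 # stock K20 clock from the article
target_mhz = base_mhz * 1.40   # the 40% overclock discussed above

print("clock (MHz):", round(target_mhz))
print("SP TFLOPS:", round(scaled_peak_tflops(3.52, base_mhz, target_mhz), 2))
print("DP TFLOPS:", round(scaled_peak_tflops(1.17, base_mhz, target_mhz), 2))
```

Under that assumption, a 40% overclock lands at about 987MHz, close to the 1GHz of the K10, with roughly 4.9 TFLOPS single precision and 1.6 TFLOPS double precision. Memory-bound kernels would gain far less.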

Best & worst – NVIDIA vs AMD

Even though the cards are very comparable on most specifications, each one is the clear winner in one category and the clear loser in another.

  • K10: best in GFLOPS/Watt (SP), worst in everything double precision.
  • K20: best in single-GPU performance, worst in price per GFLOPS (SP).
  • S9000: best memory bandwidth, worst in GFLOPS/Watt (SP).
  • S10000: best in performance SP&DP (dual-GPU), worst in power-usage/cooling.

I leave it to you to decide what your priorities are when buying compute cards, but do make a list. Also take a look at the Intel Xeon Phi 5110P.
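
One way to turn such a priority list into a decision is a simple weighted score. The sketch below is only an illustration: the weights are hypothetical, the scoring scheme (each card's value relative to the best card on that criterion) is one choice among many, and only the two single-GPU cards from the table are included.

```python
# Values for the two single-GPU cards, copied from the comparison table.
cards = {
    "Tesla K20":     {"sp_tflops": 3.52,  "dp_tflops": 1.17,  "bandwidth_gbs": 200, "price_usd": 3199},
    "FirePro S9000": {"sp_tflops": 3.230, "dp_tflops": 0.806, "bandwidth_gbs": 264, "price_usd": 2500},
}
LOWER_IS_BETTER = {"price_usd"}


def score(cards, weights):
    """Sum weight * (value relative to the best card) over all criteria."""
    totals = {name: 0.0 for name in cards}
    for criterion, weight in weights.items():
        values = {name: spec[criterion] for name, spec in cards.items()}
        lower_wins = criterion in LOWER_IS_BETTER
        best = min(values.values()) if lower_wins else max(values.values())
        for name, value in values.items():
            # The best card on this criterion gets 1.0, the others less.
            relative = best / value if lower_wins else value / best
            totals[name] += weight * relative
    return totals


# Hypothetical priority lists; replace them with your own.
dp_heavy = {"dp_tflops": 0.6, "price_usd": 0.2, "bandwidth_gbs": 0.2}
bandwidth_heavy = {"bandwidth_gbs": 0.6, "price_usd": 0.3, "sp_tflops": 0.1}

for label, weights in [("DP-heavy", dp_heavy), ("bandwidth-heavy", bandwidth_heavy)]:
    totals = score(cards, weights)
    print(label, "->", max(totals, key=totals.get), totals)
```

With these example weights the K20 comes out ahead for double-precision-heavy work and the S9000 for bandwidth-bound work on a tighter budget, which shows how much the outcome depends on the list you make.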

As usual, use the comments to voice points I missed or anything you agree or disagree with. The comments are unmoderated, but be polite.

Sources

NVIDIA TESLA K20:

http://www.techpowerup.com/173846/Tesla-K20-GPU-Compute-Processor-Specifications-Released.html
http://semiaccurate.com/2012/11/02/nvidia-tesla-k20-specs-gives-hints-about-28nm-yields/
http://vr-zone.com/articles/nvidia-s-top-end-kepler-unveiled-tesla-k20-comes-with-disappointing-specs-performance/17458.html
http://www.heise.de/newsticker/meldung/Finale-Spezifikationen-von-Nvidias-Tesla-K20-mit-GK110-GPU-enthuellt-1730408.html

AMD FirePro S9000:

http://www.amd.com/us/products/workstation/graphics/firepro-remote-graphics/S9000/Pages/S9000.aspx
http://www.amd.com/us/Documents/FirePro_S9000_Data_Sheet.pdf
http://www.amd.com/us/Documents/SDI-tech-brief.pdf
