Recently AMD announced their new FirePro GPUs to be used in servers: the S9000 (shown at the right) and the S7000. They use passive cooling, as server-racks are actively cooled already. AMD partners for servers will have products ready Q1 2013 or even before. SuperMicro, Dell and HP will probably be one of the first.
What does this mean? We finally get a very good alternative to TESLA: servers with probably 2 (1U) or 4+ (3U) FirePro GPUs giving 6.46 to up to 12.92 TFLOPS or more theoretical extra performance on top of the available CPU. At StreamHPC we are happy with that, as AMD is a strong OpenCL-supporter and FirePro GPUs give much more performance than TESLAs. It also outperforms the unreleased Intel Xeon Phi in single precision and is close in double precision.
Edit: About the multi-GPU configuration
A multi-GPU card has various advantages as it uses less power and space, but does not compare to a single GPU. As the communication goes via the PCI-bus still, the compute-capabilities between two GPU cards and a multi-GPU card is not that different. Compute-problems are most times memory-bound and that is an important factor that GPUs outperform CPUs, as they have a very high memory bandwidth. Therefore I put a lot of weight on memory and cache available per GPU and core.