We knew NVIDIA's new GH200 Grace Hopper processor was fast, but early benchmark results show exactly how fast it is. GPTshop.ai, which has built an incredibly powerful desktop computer based on the Grace Hopper processor, gave Phoronix access to the system for comparison testing.
The NVIDIA GH200 combines the 72-core Grace CPU with the H100 Tensor Core GPU. It supports up to 480 GB of LPDDR5X memory and either 96 GB of HBM3 or 144 GB of HBM3e memory. The Grace CPU is built on Arm Neoverse-V2 cores, each with 1 MB of L2 cache, and has a total of 117 MB of L3 cache.
The NVIDIA GH200 runs standard AArch64 Linux distributions. For testing, Phoronix used Ubuntu 23.10 with Linux 6.5, which gave an up-to-date view of the GH200's Linux performance against Intel Xeon Scalable, AMD EPYC, and Ampere Altra Max processors.
CPU performance
The GPTshop.ai GH200 system was tested with its 72-core Grace CPU, a Quanta S74G motherboard, 480 GB of RAM, and 960 GB and 1920 GB Samsung SSDs. All of the server processors tested ran at their maximum memory frequencies and with the maximum number of supported memory channels in use.
Initial benchmarks focused on CPU performance, with GPU benchmarks to follow. Unfortunately, there are no power consumption figures yet, because the NVIDIA GH200 does not currently expose any interface under Linux for reading its power or energy usage. Even so, the raw initial CPU performance numbers are promising and show the GH200 to be a beast among processors.
The full test results are available on the Phoronix site, and an average of all the results can be seen in the chart below. Although the EPYC 9754 came out ahead in some respects, the GH200 took first place in some of the tests.
Summing up, Phoronix says: “On a geometric mean basis across all benchmarks conducted, the performance of the GH200 Grace CPU nearly matched the Intel Xeon Platinum 8592+ Emerald Rapids processor. The Arm Neoverse-V2-based Grace CPU tended to be much faster than the 128-core Ampere Altra Max AArch64 server. It will be interesting to see how AmpereOne can compete, although there is no hardware available to test yet. (Unfortunately, there is also no AMD MI300A hardware to test at this time.) NVIDIA ARM CPU performance has certainly come a long way since the early days of NVIDIA Tegra benchmarking for ARM performance.”
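A cross-benchmark summary of this kind is normally a geometric mean rather than a plain arithmetic average, so that a single test with an outsized score cannot dominate the overall figure. The sketch below shows how such a summary could be computed. The CPU names and scores are hypothetical placeholders chosen for easy arithmetic, not Phoronix results, and the helper function is illustrative rather than the tooling Phoronix actually uses.

```python
import math


def geometric_mean(values):
    """Return the geometric mean of a non-empty sequence of positive numbers."""
    if not values:
        raise ValueError("need at least one value")
    if any(v <= 0 for v in values):
        raise ValueError("all values must be positive")
    # Average the logarithms and exponentiate, which avoids overflow
    # from multiplying many scores together directly.
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical relative scores per test (baseline = 1.0, higher is better).
# These numbers are made up for illustration only.
hypothetical_scores = {
    "cpu_a": [1.0, 1.0, 1.0],
    "cpu_b": [2.0, 0.5, 1.0],
    "cpu_c": [4.0, 1.0, 2.0],
}

summary = {
    name: geometric_mean(scores)
    for name, scores in hypothetical_scores.items()
}

for name, score in summary.items():
    print(f"{name}: {score:.3f}")
```

In this made-up data, cpu_b wins one test by 2x and loses another by 2x, so its geometric mean equals the baseline, whereas an arithmetic mean would rate it about 17% ahead.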
More CPU benchmark numbers are available in the accompanying results file, and further benchmarks from some of the preliminary testing have also been published.