> Each AMD EPYC 9v64H CPU physically have 96 Zen 4 cores
> 8 Zen 4 cores per EPYC 9v64H CPU
The two consecutive lines of text give two different core counts. I know initial reporting on this CPU has been unclear, with everyone initially saying 88 cores, then updating to 96. But the author could have spent a couple of words on what the extra 8 cores are used for (best I could find is "used as overhead").
I think 88 of the 96 cores per CPU are assigned to the VM and the rest are assigned to the underlying hypervisor. I remember seeing that somewhere.
When you're dedicating a whole system to a single VM, you need to have some spares for the underlying OS to keep it happy.
The OS needs cores and RAM to be able to keep the system up. Everything from network cards to services needs some spare capacity, otherwise things go very wrong and the experience for the tenant becomes very bad.
Or you need separate dedicated hardware for the hypervisor, like AWS has with Nitro.
The article says single-tenant, so that would be a waste of 32 cores for just one VM? Seems like a lot.
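To put that arithmetic in one place, here's a minimal sketch (assuming the figures quoted above: 96 physical cores per socket, 88 exposed to the guest, 4 sockets per system; these are the thread's numbers, not official ones):

    # Back-of-envelope: cores held back from the guest, assuming the
    # 96-physical / 88-exposed split and 4 sockets discussed above.
    sockets = 4
    physical_per_socket = 96
    exposed_per_socket = 88

    reserved_per_socket = physical_per_socket - exposed_per_socket  # 8
    reserved_total = sockets * reserved_per_socket                  # 32
    exposed_total = sockets * exposed_per_socket                    # 352

    print(f"reserved for host/hypervisor: {reserved_total} cores")
    print(f"visible to the single-tenant VM: {exposed_total} cores")

So roughly 8% of the machine, if the 88/96 split is right.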
That hypervisor, in the case of Azure, does more than usual... AWS seems to offload a LOT more to extra hardware. I've only ever seen MS mention hardware offload for networking... could be wrong, mind you...
Ok, I see, maybe these cores are managing things like I/O and hardware monitoring. Thanks for the explanation.
Well, Windows has a lot of svchost.exe processes running. /s
Isn't it 8 chiplets of 12 cores each?
That should have stated: yes, but one core per chiplet (or maybe an entire chiplet) is not used. I can speculate about why it's set aside, but some official information, or even a more educated guess, would have been very informative.
I wonder what inference on big LLMs might look like with that much cache and memory bandwidth. Not trivial to get a benchmark for that, but I wonder.
For LLMs you'd want to use the MI300X variant and benchmarks should already be available.
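For a very rough feel: decode on big LLMs tends to be memory-bandwidth-bound, so a crude upper bound is bandwidth divided by the bytes of weights streamed per token. A sketch under made-up assumptions (the 70B model size and fp16 weights are purely illustrative; the 6900 GB/s figure is the aggregate HBM number mentioned elsewhere in this thread):

    # Crude bandwidth-bound decode estimate: assume every generated token
    # streams all model weights from memory once. Everything here is an
    # illustrative assumption, not a measurement.
    hbm_bandwidth_gb_s = 6900      # aggregate HBM bandwidth cited in the thread
    model_params_billion = 70      # hypothetical 70B-parameter model
    bytes_per_param = 2            # fp16/bf16 weights

    weights_gb = model_params_billion * bytes_per_param   # ~140 GB of weights
    tokens_per_s_upper_bound = hbm_bandwidth_gb_s / weights_gb

    print(f"~{tokens_per_s_upper_bound:.0f} tokens/s (bandwidth-only upper bound)")

Real numbers would come in lower once compute, KV-cache traffic, and cross-socket NUMA effects enter the picture, but it gives a sense of why the HBM is interesting for CPU inference.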
I'd love to see the benchmarks for this with SMT on: 96x2x4 = 768 logical CPUs in one system, along with 512 GB of HBM at 6900 GB/s memory bandwidth, and then DDR5 on top.
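If someone does get access, a quick way to check what the guest actually exposes against that 96x2x4 = 768 figure (a Linux/x86-specific sketch, relying on the standard /proc/cpuinfo fields):

    import os

    # Logical CPUs the guest sees (SMT threads included).
    print("logical CPUs visible:", os.cpu_count())

    # Count unique (physical id, core id) pairs to separate SMT siblings
    # from physical cores. x86 Linux layout of /proc/cpuinfo assumed.
    cores = set()
    physical_id = core_id = None
    with open("/proc/cpuinfo") as f:
        for line in f:
            if line.startswith("physical id"):
                physical_id = line.split(":")[1].strip()
            elif line.startswith("core id"):
                core_id = line.split(":")[1].strip()
            elif not line.strip():           # blank line ends one CPU entry
                if physical_id is not None and core_id is not None:
                    cores.add((physical_id, core_id))
                physical_id = core_id = None
    if physical_id is not None and core_id is not None:
        cores.add((physical_id, core_id))

    print("physical cores visible:", len(cores))

(Per the 88-cores-per-socket discussion above, a guest might see 352 physical / 704 logical rather than the full 768, but that's exactly the sort of thing this would confirm.)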
That's a monster of a CPU, wow.