The NVIDIA H200, announced in late 2023, is a refresh of the H100 built on the same Hopper architecture but with a significantly expanded memory subsystem. With 141 GB of HBM3e and roughly 43% more memory bandwidth than the H100, it makes it easier to fit very large models on a single GPU and delivers significant inference gains, especially in long-context scenarios. NVIDIA pitched it as a 'do the same work with fewer GPUs' upgrade, aimed particularly at LLM inference workloads. Until the Blackwell generation (B100/B200) ramps up, the H200 is the most important refresh in the backbone of frontier-model serving.
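The 'fits on a single GPU' claim can be checked with simple arithmetic. The sketch below is a minimal back-of-envelope calculator, not an official sizing tool: the only figure taken from the entry above is the 141 GB capacity, while the model shape (80 layers, 8 grouped-query KV heads, head dimension 128, as in Llama-2-70B-class models) and the FP16/INT8 byte sizes are illustrative assumptions.

```python
# Back-of-envelope memory sizing for a single H200 (141 GB HBM3e).
# Illustrative assumptions: FP16 = 2 bytes/element, INT8 = 1 byte/element.

def weights_gb(params_billion: float, bytes_per_param: int = 2) -> float:
    """Memory for model weights in GB (default: FP16, 2 bytes/param)."""
    return params_billion * 1e9 * bytes_per_param / 1e9

def kv_cache_gb(layers: int, kv_heads: int, head_dim: int,
                context_len: int, batch: int = 1,
                bytes_per_elem: int = 2) -> float:
    """KV-cache memory in GB: 2 tensors (K and V) per layer,
    each of shape [batch, kv_heads, context_len, head_dim]."""
    return (2 * layers * kv_heads * head_dim
            * context_len * batch * bytes_per_elem) / 1e9

# Hypothetical 70B-parameter model, Llama-2-70B-like shape:
# 80 layers, 8 KV heads (GQA), head_dim 128, 32k-token context.
w_fp16 = weights_gb(70)                          # 140.0 GB
w_int8 = weights_gb(70, bytes_per_param=1)       # 70.0 GB
kv_32k = kv_cache_gb(80, 8, 128, 32_768)         # ~10.7 GB per sequence

# FP16 weights alone nearly fill the 141 GB card, leaving no room
# for the KV cache; with INT8 weights, the model plus several
# 32k-context sequences fits comfortably on one H200.
print(f"FP16 weights: {w_fp16:.1f} GB, INT8 weights: {w_int8:.1f} GB")
print(f"KV cache @32k, batch 1: {kv_32k:.1f} GB")
print(f"INT8 + batch-4 @32k total: {w_int8 + kv_cache_gb(80, 8, 128, 32_768, batch=4):.1f} GB")
```

Under these assumptions the INT8 configuration with a batch of four 32k-context sequences lands around 113 GB, inside the H200's 141 GB but beyond the H100's 80 GB, which is the long-context serving case the entry highlights.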
MEVZU N°124 · ISTANBUL · YEAR I, VOL. III
Glossary · Beginner · 2024
NVIDIA H200
A memory-expanded H100 refresh, optimised for long-context and very large models.
- EN — English term
- NVIDIA H200
- TR — Turkish term
- NVIDIA H200