NVIDIA's Blackwell architecture has entered full production ramp, with the GB200 NVL72 — a rack-scale system combining 72 Blackwell GPUs and 36 Grace CPUs — now powering AI training clusters at Google, AWS, and Microsoft Azure.
The Performance Leap
The improvement over the previous Hopper architecture is substantial. NVIDIA claims a 4–5x increase in training throughput for large language models at comparable power draw, driven by architectural changes to the Tensor Core design, an NVLink interconnect running at 1.8 TB/s per GPU, and a new memory subsystem providing 192 GB of HBM3e per GPU.
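The per-GPU figures above can be aggregated into rough rack-level totals. A minimal back-of-envelope sketch, using only the numbers quoted in this article and treating the results as upper bounds (real systems reserve memory and bandwidth for overheads):

```python
# Back-of-envelope aggregates for a GB200 NVL72 rack, built only
# from the per-GPU figures quoted above. Totals are rough upper
# bounds, not measured system capabilities.

GPUS_PER_RACK = 72
HBM3E_PER_GPU_GB = 192      # HBM3e capacity per Blackwell GPU
NVLINK_PER_GPU_TBPS = 1.8   # NVLink bandwidth per GPU

total_hbm_tb = GPUS_PER_RACK * HBM3E_PER_GPU_GB / 1000
total_nvlink_tbps = GPUS_PER_RACK * NVLINK_PER_GPU_TBPS

print(f"Aggregate HBM3e capacity: {total_hbm_tb:.1f} TB")      # ~13.8 TB
print(f"Aggregate NVLink bandwidth: {total_nvlink_tbps:.1f} TB/s")  # ~129.6 TB/s
```

The aggregate memory figure is one reason rack-scale systems matter: a single NVL72 domain can hold model states that would otherwise have to be sharded across multiple interconnect hops.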
The Power Problem
The power figures are striking. The full GB200 NVL72 rack-scale system draws approximately 120 kW, and individual GB200 modules approach 1,200 W under sustained AI workloads. For data centre operators, this means building or retrofitting facilities with new power distribution and liquid cooling infrastructure — a capital project that adds lead time and cost to every AI cluster deployment.
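To put the 120 kW figure in operating-cost terms, a quick sketch of one rack's annual energy footprint. The electricity price and PUE (power usage effectiveness) below are illustrative assumptions, not figures from this article:

```python
# Rough annual energy footprint of one 120 kW NVL72 rack running
# continuously. Electricity price and PUE are illustrative
# assumptions, not figures from the article.

RACK_POWER_KW = 120        # sustained draw quoted for the GB200 NVL72
HOURS_PER_YEAR = 24 * 365
PUE = 1.2                  # assumed facility overhead (cooling, distribution)
PRICE_PER_KWH = 0.10       # assumed industrial rate, USD

annual_kwh = RACK_POWER_KW * HOURS_PER_YEAR * PUE
annual_cost = annual_kwh * PRICE_PER_KWH

print(f"Annual energy: {annual_kwh / 1000:.0f} MWh")        # ~1261 MWh
print(f"Annual electricity cost: ${annual_cost:,.0f}")      # ~$126,144
```

Even under these conservative assumptions, electricity alone runs to six figures per rack per year, before the cooling and power-distribution capital spend the article describes.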
The power requirements are a genuine constraint on how quickly the industry can deploy Blackwell at scale. New data centre construction with the required electrical and cooling infrastructure takes 18–36 months from planning to operation.
Effect on Consumer Hardware
From a PC hardware perspective, the Blackwell ramp has direct market effects. HBM3e production capacity is near its ceiling, with AI accelerators claiming the majority of output from Samsung, SK Hynix, and Micron. This constrains supply for gaming GPUs that use GDDR7 and GDDR6X, since those memory types draw on the same DRAM fabrication capacity at the same manufacturers.
NVIDIA's consumer Blackwell products — the RTX 50-series — share the architecture in heavily scaled-down form. The gaming variants use GDDR7 rather than HBM and operate at power levels consistent with enthusiast desktop use.