
Google splits its TPU line into dedicated training (8t) and inference (8i) chips at Cloud Next

2026-04-28 01:06

Announced at Google Cloud Next on April 22, Google's eighth-generation TPUs mark the first time the company has split its flagship accelerator into two specialized chips. The TPU 8t targets model training, scaling to 9,600-chip superpods with 2 petabytes of shared HBM and 121 ExaFLOPS of compute; the TPU 8i targets low-latency inference, with 384 MB of on-chip SRAM (3× the prior generation) optimized for serving agentic workloads. Google claims 80% better price-performance for the 8i and up to a 2.8× gain for the 8t over the prior Ironwood generation. Both chips are slated for availability later in 2026 through Google's AI Hypercomputer.
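For a sense of per-chip scale, the announced pod-level 8t figures can be divided down. This is a rough sketch assuming decimal units (1 PB = 10^6 GB) and even distribution across the pod; Google has not published official per-chip numbers here.

```python
# Back-of-envelope per-chip figures derived from the announced
# 8t superpod specs (9,600 chips, 2 PB shared HBM, 121 ExaFLOPS).
# Assumes decimal units and uniform distribution across chips.
pod_chips = 9_600
shared_hbm_gb = 2e6        # 2 PB expressed in GB
pod_exaflops = 121

hbm_per_chip_gb = shared_hbm_gb / pod_chips          # ~208 GB per chip
pflops_per_chip = pod_exaflops * 1e3 / pod_chips     # ~12.6 PFLOPS per chip

print(f"HBM per chip: {hbm_per_chip_gb:.0f} GB")
print(f"Compute per chip: {pflops_per_chip:.1f} PFLOPS")
```

These derived figures are illustrative only; the FLOPS number in particular depends on the (unstated) precision the 121 ExaFLOPS headline refers to.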
