
Google splits its TPU line into dedicated training (8t) and inference (8i) chips at Cloud Next

2026-04-28 01:06

Announced at Google Cloud Next on April 22, Google's eighth-generation TPUs mark the first time the company has split its flagship accelerator into two specialized chips. The TPU 8t targets model training, scaling to 9,600-chip superpods with 2 petabytes of shared HBM and 121 ExaFLOPS of compute; the TPU 8i targets low-latency inference, with 384 MB of on-chip SRAM (3× the prior generation) optimized for serving agentic workloads. Google claims 80% better price-performance for the 8i and up to a 2.8× gain for the 8t over the prior Ironwood generation. Both chips are slated for availability later in 2026 through Google's AI Hypercomputer.
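For a sense of per-chip scale, the announced pod-level 8t figures can be divided down. This is a rough sketch assuming decimal units (1 PB = 10^6 GB) and even distribution across the pod; Google has not published official per-chip numbers here.

```python
# Back-of-envelope per-chip figures derived from the announced
# 8t superpod specs (9,600 chips, 2 PB shared HBM, 121 ExaFLOPS).
# Assumes decimal units and uniform distribution across chips.
pod_chips = 9_600
shared_hbm_gb = 2e6        # 2 PB expressed in GB
pod_exaflops = 121

hbm_per_chip_gb = shared_hbm_gb / pod_chips          # ~208 GB per chip
pflops_per_chip = pod_exaflops * 1e3 / pod_chips     # ~12.6 PFLOPS per chip

print(f"HBM per chip: {hbm_per_chip_gb:.0f} GB")
print(f"Compute per chip: {pflops_per_chip:.1f} PFLOPS")
```

These derived figures are illustrative only; the FLOPS number in particular depends on the (unstated) precision the 121 ExaFLOPS headline refers to.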
