Alan's PKB

May 03, 2026

One memory access costs 1,000 multiply-accumulates. That single fact shapes every chip, kernel, and serving system in production. These notes trace the connections — from transistors to tokens.

sections

cornered chips — custom silicon for workloads GPUs can’t touch
hardware — where all 92 billion transistors actually live
inference — your $30K GPU spends most of its time waiting
systems — the 10-50x gap between a naive kernel and an expert one
semiconductors — when leading-edge silicon costs more per transistor, not less
context — every AI chip bet has been made before

start here

systolic arrays — the computational primitive behind every AI accelerator
breaking down blackwell — NVIDIA’s B200, from transistors to inference cost
DIY TPU v1 — reverse-engineering Google’s first AI chip from scratch

tools

interactive visualizations — charts, calculators, and die maps
about

© 2026
