About

AI/ML Deep Dives

A personal research blog dedicated to making frontier AI & ML papers approachable — with interactive visualizations, annotated source code, and the kind of depth that connects the math to the implementation.

What this blog is

Most research papers reward careful reading but punish first-time readers with unexplained notation and buried intuitions. This blog exists to close that gap. Each post picks a single paper or a tightly related cluster of ideas, digs into the mechanism at the level of code and equations, and pairs the explanation with live, in-browser visualizations so you can build the right mental model rather than just follow the prose.

Posts are written to be standalone and deep — not newsletter summaries. The goal is that after reading, you could reimplement the core idea from scratch.

Topics covered

Diffusion Language Models
Speculative Decoding
Knowledge Distillation
LLM Post-Training & RLVR
Inference Efficiency
KV-Cache & Quantization
Linear & Recurrent Attention
Pretraining & Scaling Laws
Multi-Token Prediction
LLM Serving & Systems

Who writes it

I'm Tianhao Zhou. I read a lot of ML papers and write up the ones I find genuinely interesting or underexplained. My focus is on the intersection of model architecture, training efficiency, and inference speed — the things that determine what actually ships in production.

If a post has an interactive component, that component was written from scratch (no chart libraries, just SVG and a little JavaScript), so the visualization is exactly as precise as the explanation requires.