contributor

Liam Chen

@lchen

Inference engineer focused on kernel-level optimizations and evaluation methodology. Believes the gap between a benchmark number and a useful number is where most of the engineering actually lives. Writes about attention kernels, profiling, and the eval landscape.

2 articles

Inference & Serving · Evaluation focus

The arithmetic of attention: why FlashAttention still matters

Memory bandwidth, not FLOPs, is what bounds modern inference. A walk through the numbers behind a kernel that quietly reshaped the field.

Liam Chen · Inference & Serving

Nov 12, 2026
14 min

The four evals that matter (and the dozen that don't)

We have too many benchmarks and too few signals. A framework for choosing evaluations that correlate with the thing you actually care about.

Liam Chen · Evaluation

Jul 22, 2026
13 min