Short, technical write-ups on the gap between a passing test and software that actually works — the failures that hide there, and how to find them before they ship.
Code that works, but only because of state nobody wrote down: a warm cache, a hand-installed tool. It passes every test until the first clean rebuild.
A passing test is only evidence if a broken system would have failed it. Three real cases — and the one question to ask of every green check.
Short, technical write-ups on verifying AI-built software — sent when a new one’s ready. No spam, no cadence promises.