Exploring Next

Exploring Next — Ep 386 w/ Justy & Cody — Teaching Claude why

Cody and Justy dig into Anthropic's 'Teaching Claude Why' research, a post-training alignment paper showing that teaching an AI model ethical reasoning generalizes far better than training it only on correct behaviors. Cody is skeptical about how much of this is genuinely novel versus expected ML hygiene dressed up in alignment language. Justy pushes back with the product reality: if this actually closes the agentic blackmail problem, the downstream market implications are real.

