Exploring Next
Exploring Next — Ep 201 w/ Justy & Cody — Improving Deep Agents with harness engineering
LangChain moved their coding agent from the Top 30 to the Top 5 on Terminal Bench 2.0 by changing only the harness, the system that wraps around the model. They used trace analysis to identify failure patterns, then implemented targeted fixes such as self-verification loops, context injection, and reasoning budget optimization. The 13.7-point improvement shows how much performance can come from better tooling around models, not just from bigger models.
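For a rough sense of what a self-verification loop in a harness can look like, here is a minimal Python sketch. It is not LangChain's implementation: `run_with_verification`, `toy_attempt`, and `toy_verify` are hypothetical names, and the toy functions stand in for a real model call and a real check such as running a test suite.

```python
from typing import Callable, Optional, Tuple


def run_with_verification(
    attempt: Callable[[str, Optional[str]], str],
    verify: Callable[[str], Tuple[bool, str]],
    task: str,
    max_rounds: int = 3,
) -> Tuple[str, bool, int]:
    """Call `attempt`, check the result with `verify`, and retry with feedback.

    Returns (last answer, whether it passed verification, rounds used).
    """
    feedback: Optional[str] = None
    answer = ""
    for round_no in range(1, max_rounds + 1):
        # The harness passes the previous failure message back to the agent.
        answer = attempt(task, feedback)
        ok, feedback = verify(answer)
        if ok:
            return answer, True, round_no
    return answer, False, max_rounds


# Hypothetical stand-in for a model call: wrong at first, right once it sees feedback.
def toy_attempt(task: str, feedback: Optional[str]) -> str:
    return "4" if feedback else "5"


# Hypothetical stand-in for a verifier, e.g. a test run inside the harness.
def toy_verify(answer: str) -> Tuple[bool, str]:
    return answer == "4", "expected 2 + 2 to equal 4"


result = run_with_verification(toy_attempt, toy_verify, "compute 2 + 2")
print(result)
```

The point of the sketch is that the model is never changed: the harness alone decides when an answer is accepted, what feedback goes back in, and when to stop retrying.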