Ep 241 tool 5:08 w/ Justy & Cody

AI Coding Assistants Haven’t Sped up Delivery Because Coding Was Never the Bottleneck

Agoda's analysis of AI coding assistants reveals they boost individual developer output but don't speed up project delivery because coding was never the real bottleneck. The constraint has shifted upstream to specification and verification, fundamentally changing how engineering teams should be structured and what work humans focus on.

Script: Sonnet 4.5 Voice: ElevenLabs

Transcript

Izzo Your AI coding assistant just cranked out three hundred lines in ten minutes, but your feature still shipped two weeks late.

Izzo Welcome back to Exploring Next, I'm Izzo. This is episode two forty-one with Boone, and we're digging into why AI coding tools aren't actually speeding up software delivery.

Boone Yeah, and this isn't just vibes. Agoda published some fascinating analysis: on teams with high AI adoption, developers are completing twenty-one percent more tasks, but project velocity? Basically flat.

Izzo Which feels like every engineering team's experience right now. You're generating more code than ever, but you're still missing deadlines. What's going on here?

Boone The bottleneck shifted. Turns out coding was never the constraint. It was specification and verification. Classic systems-thinking mistake: speeding up a stage that wasn't limiting the flow.

Izzo Right, and this connects to something bigger happening across the industry. Teams are restructuring around AI tools without understanding where the actual work moved.

Boone Exactly. Leonardo Stern at Agoda breaks this down beautifully. He's basically rediscovering Fred Brooks' 'No Silver Bullet' argument: the hard part of software is deciding what to build, not typing it in. So when you optimize the coding step, the pressure just moves elsewhere in the pipeline.

Izzo Break down the numbers for me, because this data is wild.

Boone Faros AI analyzed over ten thousand developers across twelve hundred teams. High AI adoption teams completed twenty-one percent more tasks, merged ninety-eight percent more pull requests—

Izzo Ninety-eight percent more PRs? That's insane.

Boone But here's the kicker — PR review time increased by ninety-one percent. The throughput gains got eaten by review bottlenecks.

Izzo So you're merging nearly twice as many PRs, but each one spends nearly twice as long in review. The constraint just moved downstream.
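A toy model of the bottleneck effect described here, offered as a sketch only. The stage names and capacities are hypothetical and do not come from the Agoda or Faros AI data; the one assumption is that work passes through the stages in series, so sustained delivery is capped by the slowest stage.

```python
# Toy model: a serial pipeline delivers no faster than its slowest stage.
# Stage names and capacities (features per week) are hypothetical.

def delivery_rate(capacity):
    """Return the sustained throughput of a serial pipeline."""
    return min(capacity.values())


def bottleneck(capacity):
    """Return the name of the stage that limits throughput."""
    return min(capacity, key=capacity.get)


before = {"specify": 4.5, "code": 5.0, "review": 4.0}
after = dict(before, code=10.0)  # assume AI doubles coding capacity only

print(delivery_rate(before), bottleneck(before))  # 4.0 review
print(delivery_rate(after), bottleneck(after))    # 4.0 review
```

Doubling the "code" stage leaves the delivery rate unchanged in this model, because review was already the limiting stage.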

Boone And upstream. Because now you need way more precise specifications. When a human writes code, they fill in gaps with context and judgment. AI agents need everything explicit.
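One way to make "everything explicit" concrete is a structured spec template that reports which sections are still empty before work is handed to an agent. This is a minimal sketch; the class name FeatureSpec, its section names, and the example values are all hypothetical and are not taken from Stern's analysis.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class FeatureSpec:
    """Hypothetical spec template; every section should be filled in."""
    intent: str = ""
    inputs: List[str] = field(default_factory=list)
    edge_cases: List[str] = field(default_factory=list)
    out_of_scope: List[str] = field(default_factory=list)
    acceptance_checks: List[str] = field(default_factory=list)


def missing_sections(spec):
    """Return the names of sections that are still empty, in template order."""
    return [f.name for f in fields(spec) if not getattr(spec, f.name)]


draft = FeatureSpec(
    intent="Let users export their booking history as CSV",
    inputs=["user_id", "date_range"],
)
print(missing_sections(draft))
# ['edge_cases', 'out_of_scope', 'acceptance_checks']
```

The empty sections are the gaps a human developer would normally fill with context and judgment, which is why the check flags them rather than letting the draft through.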

Izzo This is hitting team structure too. Stern argues that if specification and architectural alignment become the highest-value work, small teams win differently than we thought.

Boone Yeah, traditionally small teams were about reducing communication overhead. But if communication IS the work — if you need genuine alignment on intent and edge cases — then it's not overhead anymore.

Izzo Five people can actually align around specifications in ways that fifteen people can't. It's not about reducing coordination, it's about achieving shared understanding faster.

Boone Exactly. And this connects to how engineers should relate to AI-generated code. Stern has this three-stance taxonomy that's really practical.

Izzo Walk me through it.

Boone White box means you read and review every line. Doesn't scale when agents produce thousands of lines per hour. Black box is shipping whatever the AI generates with minimal verification.

Izzo Which is terrifying for production systems.

Boone Right. So he advocates for grey box — humans stay accountable at two critical points: writing specifications precise enough for correct execution, and verifying results against evidence rather than inspecting implementation.
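The second grey-box checkpoint, verifying results against evidence, could be sketched roughly as below. This is an illustration under stated assumptions, not Stern's method: the names AcceptanceCriteria, Evidence, and verify are hypothetical, as are the choice of metrics and the threshold values.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class AcceptanceCriteria:
    """Hypothetical thresholds a team might agree on in the spec."""
    min_coverage: float        # fraction of lines covered, 0.0 to 1.0
    max_p95_latency_ms: float


@dataclass(frozen=True)
class Evidence:
    """Hypothetical measurements collected from CI for one change."""
    tests_failed: int
    coverage: float
    p95_latency_ms: float


def verify(criteria: AcceptanceCriteria, evidence: Evidence) -> List[str]:
    """Return a list of unmet criteria; an empty list means the evidence passes."""
    problems = []
    if evidence.tests_failed > 0:
        problems.append(f"{evidence.tests_failed} test(s) failed")
    if evidence.coverage < criteria.min_coverage:
        problems.append(
            f"coverage {evidence.coverage:.0%} is below {criteria.min_coverage:.0%}"
        )
    if evidence.p95_latency_ms > criteria.max_p95_latency_ms:
        problems.append(
            f"p95 latency {evidence.p95_latency_ms:.0f} ms exceeds "
            f"{criteria.max_p95_latency_ms:.0f} ms"
        )
    return problems


criteria = AcceptanceCriteria(min_coverage=0.80, max_p95_latency_ms=250.0)
evidence = Evidence(tests_failed=0, coverage=0.72, p95_latency_ms=180.0)
print(verify(criteria, evidence))  # ['coverage 72% is below 80%']
```

A gate like this checks outcomes without reading the implementation; it informs the engineer's decision on the merge request but does not replace it.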

Izzo That's a fundamental shift in where engineers add value. You're not writing code, you're defining intent and governing outcomes.

Boone And crucially, accountability doesn't shift to the AI. The engineer who guides the agent and approves the merge request remains fully responsible for what ships.

Izzo I'm giving this framework an A-minus, Boone. It's the first practical approach I've seen for actually working with AI code generation at scale.

Boone It reminds me of that recent InfoQ piece on Spec-Driven Development. Same conclusion from a different angle: specifications become the executable source of truth, and generated code becomes a downstream artifact.

Izzo So human authority is migrating up the abstraction stack. From writing code to defining and governing intent. That's a massive shift in what engineering expertise looks like.

Boone And it explains why teams that jumped on AI coding tools early aren't seeing the velocity gains they expected. They optimized the wrong part of the system.

Izzo Alright, what should people actually go build with this insight? First, start experimenting with specification templates. Document your team's current spec-to-implementation process and identify where context gets lost. Build some structured templates for capturing architectural decisions and edge cases. Second, try evidence-based code review. Instead of line-by-line inspection, define what good outcomes look like — performance benchmarks, test coverage thresholds, security s