Exploring Next

Exploring Next — Ep 354 w/ Justy & Cody — Agentic AI: How to Save on Tokens | Towards Data Science

Cody and Justy examine whether the token-saving techniques in Ida Silfverskiöld's article (prompt caching, semantic caching, lazy-loading, routing, context cleanup) are practical wins or theoretical cost-cutting that introduces real friction. Cody opens skeptical: the savings are real but the tradeoffs are often hidden or underestimated. Justy counters that for production teams already bleeding money on agentic AI, even 20-30% savings justifies the engineering lift. They land on a nuanced take: prompt caching is genuinely low-risk and worth it; semantic caching and aggressive routing are trickier and need honest trade-off audits before deployment.

Open source article

Full episode page with transcript →

Browse all Exploring Next episodes →