Why AI Engineers Are Moving Beyond LangChain to Native Agent Architectures | Towards Data Science
Justy and Cody unpack why teams are moving from LangChain-style frameworks toward native agent architectures once LLM apps hit production pressure.
Script: GPT-5.5 Voice: OpenAI TTS
Transcript
Justy This is Exploring Next, episode 356. LangChain got teams moving fast, but production agents are making people ask who really owns the wiring.
Cody Yeah, this one feels very current because the pain is familiar. You ship the clean demo, everyone nods, and then three weeks later the agent skips a step and the logs say what happened, but not why.
Justy [sighs] That is the part real users never see, but totally feel. The support rep just gets a weird answer. The analyst waits longer. The product team gets a ticket that says, basically, your smart thing is acting haunted.
Cody And the source gives LangChain real credit, which I liked. Around early 2023, a decent engineer could wire a RAG pipeline in under an hour: vector store, retriever, prompt template, memory, model call. Before that, the same pipeline could take a week or two of plumbing.
Justy That matters for adoption. If you're a startup testing a customer support copilot, or a data team building an internal research assistant, speed is the whole pitch. You don't want to spend the first sprint debating memory architecture while the market question is still, does anyone even want this?
Cody Right. The trouble starts when that prototype becomes load-bearing. Framework abstraction hides execution details so you can move quickly, but production wants the opposite. It wants the exact prompt, exact context, exact tool call order, exact fallback path, and exact state changes.
Justy So the adoption barrier for native agent architectures is pretty obvious: more code upfront, more design meetings, and fewer magical blocks you can drag into place. Also, managers hear “rewrite the orchestration layer” and quietly start checking the calendar.
Cody Totally fair. But the examples in the article are the kinds of things that burn time. A memory module silently drops context between steps. The fix takes four minutes, but finding it takes half a day because you're reading framework internals. LangSmith can help with traces, but you're still seeing the system through the spans the framework exposes.
Justy Where I get twitchy, Cody, is multi-agent stuff. One agent plans, one executes, another verifies. That sounds great on a slide. In a real workflow, who owns the truth?
Cody That's the core issue. Shared state becomes the product. One agent writes memory, another reads an older version, and the coordinator makes a decision from stale context. Add four to six model calls per request and the small overheads from serialization, callbacks, validation, and routing start showing up in p95 and p99 latency.
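One way to make the stale-read problem Cody describes visible is a version check on every write. This is a minimal sketch, not something from the source article: `SharedState`, `StaleWriteError`, and the planner/verifier variable names are all hypothetical, and a real system would likely keep this state in a database row instead of an in-process object.

```python
from dataclasses import dataclass, field


class StaleWriteError(Exception):
    """Raised when a writer's view of the state is older than the stored one."""


@dataclass
class SharedState:
    # Hypothetical shared store: a version counter plus a plain dict.
    version: int = 0
    data: dict = field(default_factory=dict)

    def read(self) -> tuple:
        # Return a copy so callers cannot mutate the store behind its back.
        return self.version, dict(self.data)

    def write(self, expected_version: int, updates: dict) -> int:
        # Reject the write if another agent has committed since this one read.
        if expected_version != self.version:
            raise StaleWriteError(
                f"expected v{expected_version}, store is at v{self.version}"
            )
        self.data.update(updates)
        self.version += 1
        return self.version


state = SharedState()

# Both agents read the same starting version.
planner_version, _ = state.read()
verifier_version, _ = state.read()

# The planner commits first, so the store moves to version 1.
state.write(planner_version, {"plan": "refund order"})

# The verifier still holds version 0, so its write is rejected, not silently merged.
try:
    state.write(verifier_version, {"verdict": "ok"})
    stale_detected = False
except StaleWriteError:
    stale_detected = True
```

The point of the design is that a stale agent fails loudly at the write, which is easier to trace than a coordinator quietly deciding from old context.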
Justy I do want to push back on one thing, though. “Native” can become a prestige move. Like, the team had a working LangChain app, then rebuilt it just to feel more serious. That is not automatically better.
Cody [chuckles] Yeah, replacing a simple thing with a hand-rolled simple thing plus three extra weeks of maintenance is not engineering maturity. If three people use it internally and nobody's sleep depends on it, keep the framework. Learn. Ship. Change your mind later.
Justy The market split is basically prototype versus product. If you're still proving value, a framework buys speed. If you have real customers, SLAs, audits of what happened, or a workflow where mistakes cost time, the buyer is going to care about control and traceability.
Cody Native agent architecture just means the orchestration is regular code you own. State is an explicit object or table. Tools are plain functions you can unit test. Memory is stored and retrieved by code you wrote. Model calls are wrapped where you can log prompts, token counts, tool inputs, and business events.
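The four pieces Cody lists can be sketched in a few dozen lines of plain Python. Everything named here is an assumption for illustration: `AgentState`, `lookup_order`, `call_model`, and `fake_model` are hypothetical, and `fake_model` stands in for a real client call so the sketch runs without network access.

```python
import json
import logging
from dataclasses import dataclass, field
from typing import Callable, Optional

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("agent")


@dataclass
class AgentState:
    # Explicit state: everything the agent knows lives in this object.
    question: str
    tool_results: dict = field(default_factory=dict)
    answer: Optional[str] = None


def lookup_order(order_id: str) -> dict:
    # A tool is a plain function, unit-testable with no framework or model involved.
    return {"order_id": order_id, "status": "shipped"}


def call_model(prompt: str, model_fn: Callable[[str], str]) -> str:
    # Single choke point for model calls, so the exact prompt and reply are always logged.
    log.info(json.dumps({"event": "model_call", "prompt": prompt}))
    reply = model_fn(prompt)
    log.info(json.dumps({"event": "model_reply", "reply": reply}))
    return reply


def fake_model(prompt: str) -> str:
    # Stand-in for a real model client (assumption, not a real API).
    return "Your order has shipped." if "shipped" in prompt else "I am not sure."


def run(state: AgentState) -> AgentState:
    # Orchestration is ordinary code: call the tool, build the prompt, call the model.
    state.tool_results["order"] = lookup_order("A-100")
    prompt = f"Question: {state.question}\nOrder data: {state.tool_results['order']}"
    state.answer = call_model(prompt, fake_model)
    return state


final = run(AgentState(question="Where is my order?"))
```

Because the model function is passed in, swapping `fake_model` for a real client changes one argument and leaves the logging and state handling untouched.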
Justy So fewer mystery boxes, more boring code. My favorite category of enterprise software, terrifyingly named boring code. [laughs]
Cody Honestly, boring code is great when the alternative is “the agent did something somewhere.” The clever part is that complex workflows fit better too: parallel tasks, conditional branches, async jobs, human review steps, retries. Event-driven code handles that more naturally than a synchronous chain pretending everything is one neat line.
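Three of the workflow shapes Cody mentions (parallel tasks, a conditional branch, and retries) fit in a short `asyncio` sketch. The function names and the "more than two tickets goes to human review" rule are made up for illustration; `asyncio.sleep(0)` stands in for real I/O.

```python
import asyncio


async def fetch_account(user_id: str) -> dict:
    await asyncio.sleep(0)  # stand-in for a network or database call
    return {"user_id": user_id, "tier": "gold"}


async def fetch_tickets(user_id: str) -> list:
    await asyncio.sleep(0)
    return ["T-1", "T-2", "T-3"]


async def with_retry(make_call, attempts: int = 3):
    # Call a coroutine factory up to `attempts` times; re-raise the last failure.
    for attempt in range(1, attempts + 1):
        try:
            return await make_call()
        except ConnectionError:
            if attempt == attempts:
                raise
            await asyncio.sleep(0)  # a real system would back off here


calls = {"n": 0}


async def flaky_summarize() -> str:
    # Fails on the first call and succeeds on the second, to exercise the retry.
    calls["n"] += 1
    if calls["n"] < 2:
        raise ConnectionError("transient failure")
    return "summary ready"


async def handle_request(user_id: str) -> dict:
    # Parallel step: both lookups run concurrently.
    account, tickets = await asyncio.gather(
        fetch_account(user_id), fetch_tickets(user_id)
    )
    # Conditional branch: hypothetical rule routing busy accounts to a human.
    route = "human_review" if len(tickets) > 2 else "auto_reply"
    # Retried step.
    summary = await with_retry(flaky_summarize)
    return {"route": route, "summary": summary, "tier": account["tier"]}


result = asyncio.run(handle_request("u-1"))
```

Each branch and retry is an ordinary line of code, so a debugger or a log statement can sit exactly where the decision is made.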
Justy Build Next: take one existing LangChain workflow, not your whole app, and rebuild only the orchestration. Same model, same prompts, same data. Then compare latency, trace quality, and how long it takes to explain a bad output.
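For the latency half of that comparison, a small harness is enough. This is a rough sketch under assumptions: `measure` and `percentile` are hypothetical helpers, and the two lambdas are placeholders for the framework and native versions of the same workflow.

```python
import statistics
import time


def measure(fn, runs: int = 50) -> list:
    # Time `fn` repeatedly and return per-call latency in milliseconds.
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000)
    return samples


def percentile(samples: list, pct: int) -> float:
    # statistics.quantiles with n=100 returns 99 cut points;
    # index pct - 1 is the pct-th percentile.
    return statistics.quantiles(samples, n=100)[pct - 1]


# Placeholders: swap in the framework workflow and the rebuilt one.
framework_samples = measure(lambda: sum(range(2000)))
native_samples = measure(lambda: sum(range(1000)))

report = {
    name: {"p50": percentile(s, 50), "p95": percentile(s, 95), "p99": percentile(s, 99)}
    for name, s in (("framework", framework_samples), ("native", native_samples))
}
```

With only 50 runs the p99 figure is mostly extrapolation, so a real comparison would want many more samples against production-like inputs.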
Cody For a weekend solo build, run `uv init native-agent-spike`, then add `openai`, `pydantic`, `pytest`, and `opentelemetry-sdk`. Make a tiny agent with a typed state object, two tool functions, a verifier step, and a SQLite table for traces. If you want comparison points, inspect LangChain, LangGraph, and LlamaIndex examples, but keep your spike as plain Python so you feel the trade-off directly.
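The spike Cody outlines might start from something like the sketch below. It uses only the standard library so it runs as-is: a dataclass stands in for the `pydantic` model, the tools and verifier are toy functions with made-up names, and no real model is called. The `traces` table layout is an assumption, not a recommended schema.

```python
import json
import sqlite3
from dataclasses import dataclass, field


@dataclass
class SpikeState:
    # Typed state object; a pydantic model could replace this dataclass.
    query: str
    facts: list = field(default_factory=list)
    draft: str = ""
    verified: bool = False


def search_docs(query: str) -> list:
    # Tool 1 (toy): pretend retrieval.
    return [f"doc about {query}"]


def draft_answer(facts: list) -> str:
    # Tool 2 (toy): pretend generation.
    return "Based on: " + "; ".join(facts)


def verify(state: SpikeState) -> bool:
    # Verifier step: pass only if there are facts and each one appears in the draft.
    return bool(state.facts) and all(f in state.draft for f in state.facts)


def record(conn: sqlite3.Connection, step: str, payload: dict) -> None:
    # One row per step, so a bad output can be replayed in order.
    conn.execute(
        "INSERT INTO traces (step, payload) VALUES (?, ?)",
        (step, json.dumps(payload)),
    )


def run(query: str, conn: sqlite3.Connection) -> SpikeState:
    state = SpikeState(query=query)
    state.facts = search_docs(query)
    record(conn, "search_docs", {"query": query, "facts": state.facts})
    state.draft = draft_answer(state.facts)
    record(conn, "draft_answer", {"draft": state.draft})
    state.verified = verify(state)
    record(conn, "verify", {"verified": state.verified})
    conn.commit()
    return state


conn = sqlite3.connect(":memory:")  # use a file path to keep traces between runs
conn.execute("CREATE TABLE traces (id INTEGER PRIMARY KEY, step TEXT, payload TEXT)")
state = run("refund policy", conn)
steps = [row[0] for row in conn.execute("SELECT step FROM traces ORDER BY id")]
```

Explaining a bad output then becomes a `SELECT` over `traces` for that request, which is the trace-quality comparison Justy asks for.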
Justy That’s it for Exploring Next. Own the wiring when the wiring becomes the product. Cody, I’ll call that practical enough.