Ep 448 tool 5:59 w/ Justy & Cody

Memory OS — Hermes Agent Memory Operating System

Two friends debate Memory OS, a seven-layer local memory stack for Hermes Agent. Justy is excited about the promise of a finally-sane agent memory layer; Cody pokes at the stack of SQLite, Qdrant, and 16 tools, and whether it's solving a problem that already has solutions.

Script: Mistral Small 4 119B 2603 Voice: Murf.AI Gen2

Transcript

Justy Okay — this is going to sound insane on a podcast, but someone finally built a memory system for Hermes Agent that might actually work.

Cody Right. Because we all know how well the last six memory plugins for Hermes did.

Justy Exactly — they all forget between sessions, they all need you to hand-curate your facts, and half of them are cloud-locked.

Cody Mm-hm. So what’s this one?

Justy It’s Memory OS from ClaudioDrews — a seven-layer local memory OS that runs entirely on your machine and remembers every single conversation you’ve ever had in Hermes.

Cody Wait, local only? No cloud?

Justy No cloud subscription, no vendor lock-in, works with any LLM provider Hermes supports — OpenRouter, OpenAI, Anthropic, Ollama, local models, whatever.

Cody Sure. And the seven layers are…?

Justy Layer one is flat markdown files that get injected into the system prompt every single turn — MEMORY.md, USER.md, CREATIVE.md, whatever you call them.
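
A minimal sketch of the layer-one idea Justy describes: flat markdown files appended to the system prompt on each turn. The file names come from the episode; the function name, the section headings, and the directory layout are assumptions for illustration, not the project's actual code.

```python
from pathlib import Path

# File names as mentioned in the episode; the real project may use others.
MEMORY_FILES = ["MEMORY.md", "USER.md", "CREATIVE.md"]


def build_system_prompt(base_prompt: str, memory_dir: Path) -> str:
    """Append every non-empty memory file found in memory_dir to the base prompt.

    Hypothetical helper: missing or blank files are skipped silently.
    """
    sections = [base_prompt.strip()]
    for name in MEMORY_FILES:
        path = memory_dir / name
        if path.is_file():
            body = path.read_text(encoding="utf-8").strip()
            if body:
                # Label each block so the model can tell the files apart.
                sections.append(f"## {name}\n{body}")
    return "\n\n".join(sections)
```

Because this runs on every turn, the files have to stay small, which is presumably why the later layers exist.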

Cody Classic. So it’s still just prompt engineering?

Justy Except layer two is sessions stored in a SQLite database with full-text search across your entire conversation history.
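
For readers who want a picture of "sessions in SQLite with full-text search", here is a rough sketch using SQLite's FTS5 extension from Python's standard library. The table name, column names, and helper functions are hypothetical, and it assumes the local SQLite build includes FTS5; the episode does not say how Memory OS actually structures this.

```python
import sqlite3


def open_session_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a store with one FTS5 table of messages."""
    conn = sqlite3.connect(path)
    # session_id and role are stored but not indexed; only content is searchable.
    conn.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS messages "
        "USING fts5(session_id UNINDEXED, role UNINDEXED, content)"
    )
    return conn


def add_message(conn: sqlite3.Connection, session_id: str, role: str, content: str) -> None:
    conn.execute(
        "INSERT INTO messages (session_id, role, content) VALUES (?, ?, ?)",
        (session_id, role, content),
    )
    conn.commit()


def search(conn: sqlite3.Connection, query: str, limit: int = 5):
    """Return (session_id, content) pairs, best match first."""
    rows = conn.execute(
        "SELECT session_id, content FROM messages "
        "WHERE messages MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return rows.fetchall()
```

FTS5 ranks results with BM25 by default, so a keyword search like this is also one plausible shape for the "BM25 fallback" mentioned later in the episode.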

Cody Okay, that’s different. What’s layer three?

Justy Layer three is structured facts in another SQLite table with trust scoring and entity resolution. The system adjusts those trust scores over time based on your feedback.
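
One way a fact table with trust scoring could look, as a sketch only. The schema, the 0.5 starting trust, the 0.2 update rate, and the name normalization standing in for entity resolution are all assumptions; the episode gives no details on how Memory OS computes trust.

```python
import sqlite3


def open_fact_store(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS facts ("
        " id INTEGER PRIMARY KEY,"
        " entity TEXT NOT NULL,"
        " claim TEXT NOT NULL,"
        " trust REAL NOT NULL DEFAULT 0.5,"  # assumed neutral starting trust
        " UNIQUE(entity, claim))"
    )
    return conn


def normalize_entity(name: str) -> str:
    """Crude stand-in for entity resolution: ignore case and extra spaces."""
    return " ".join(name.lower().split())


def add_fact(conn: sqlite3.Connection, entity: str, claim: str) -> int:
    """Insert a fact if it is new and return its row id either way."""
    entity = normalize_entity(entity)
    conn.execute(
        "INSERT OR IGNORE INTO facts (entity, claim) VALUES (?, ?)",
        (entity, claim),
    )
    conn.commit()
    row = conn.execute(
        "SELECT id FROM facts WHERE entity = ? AND claim = ?", (entity, claim)
    ).fetchone()
    return row[0]


def record_feedback(conn: sqlite3.Connection, fact_id: int,
                    confirmed: bool, rate: float = 0.2) -> float:
    """Move trust a fraction of the way toward 1.0 (confirmed) or 0.0 (rejected)."""
    (trust,) = conn.execute(
        "SELECT trust FROM facts WHERE id = ?", (fact_id,)
    ).fetchone()
    target = 1.0 if confirmed else 0.0
    trust = trust + rate * (target - trust)
    conn.execute("UPDATE facts SET trust = ? WHERE id = ?", (trust, fact_id))
    conn.commit()
    return trust
```

With these assumed numbers, one confirmation lifts a new fact from 0.5 to 0.6, and a later rejection drops it to 0.48, so trust always stays between 0 and 1.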

Cody Uh huh.

Justy Layer four is the Fabric Icarus fork — sixteen tools for recall, write, briefing, all cross-session.

Cody Of course there’s a fork. Sixteen tools for recall? You haven’t even opened the repo yet, have you?

Justy I glanced! Layer five is Qdrant as the vector database: 4096-dimensional vectors compared by cosine similarity, a BM25 fallback, weekly decay scans, and semantic deduplication whenever cosine similarity is over point nine two.
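
The deduplication and decay ideas in that line can be sketched in plain Python without Qdrant. The 0.92 threshold is from the episode; the keep-the-first-item policy, the exponential decay formula, and the 30-day half-life are assumptions made up for this example.

```python
import math


def cosine(a, b) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def deduplicate(items, threshold: float = 0.92):
    """Drop any (text, vector) item whose similarity to a kept item exceeds threshold.

    Items are scanned in order, so the earliest member of each
    near-duplicate group is the one that survives.
    """
    kept = []
    for text, vec in items:
        if all(cosine(vec, kept_vec) <= threshold for _, kept_vec in kept):
            kept.append((text, vec))
    return kept


def decayed_score(score: float, age_days: float, half_life_days: float = 30.0) -> float:
    """Halve a memory's score every half_life_days (assumed decay rule)."""
    return score * 0.5 ** (age_days / half_life_days)
```

This pairwise scan is quadratic in the number of kept items, which is fine for a sketch; a vector database would answer the same "is anything too similar?" question with a nearest-neighbour query instead.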

Cody Surprise — it uses Qdrant.

Justy And layer six is an auto-curating LLM wiki that just writes itself from your conversations.

Cody Which is exactly how every auto-curating wiki project starts — in over its head.

Justy Okay okay — but the bottom line is you finally get a Hermes Agent that remembers your projects, decisions, and reasoning without paying a memory subscription.

Cody Assuming you can keep the Docker container alive and Qdrant from eating your SSD.

Justy That’s on you. The repo is five days old, but the pitch is ‘your agent finally stops forgetting.’

Cody So who should actually care? And does it change anything practical?

Justy Power users who live in Hermes Agent every day and have already hit the wall where every session starts with ‘Remind me what we did last time.’

Cody That’s like three people.

Justy Or anyone who’s paid for a memory add-on that still flakes.

Cody Sure, but local-only memory stacks are their own kind of lock-in. You still need Hermes Agent, Docker, Qdrant, Redis, an ARQ worker, and Python 3.11.

Justy Independent research, no subscriptions — that’s the pitch.

Cody Right. And the vector search over every conversation ever had … that’s table stakes in twenty twenty-six. BM25 plus dense embeddings isn’t exactly novel.

Justy It is if the alternative is pasting your entire history into a prompt and praying the context window holds.

Cody Or using one of the three cloud memory services that already do this.

Justy With monthly fees and NDAs.

Cody Point taken.

Cody Three hours in Charlotte. Managed to eat a biscuit the size of my fist and still miss my gate.

Justy Classic. Anyway — Memory OS, or at least the idea of not paying monthly for an agent’s memory anymore.