
Your Obsidian Vault Is Already an Agent Memory System

The biggest challenge with AI agents is context. Every session starts from zero. The agent doesn’t know who you are, what you’re working on, or what you told it yesterday. It discovers things, helps you think, and then forgets everything when the window closes.

A whole category of tools has emerged to solve this. Letta (formerly MemGPT) treats the context window like operating system memory — a hierarchy of core memory (pinned in the system prompt, editable by the agent), recall memory (searchable conversation history), and archival memory (processed knowledge in external databases). Zep builds temporal knowledge graphs where facts have timestamps, so the system knows when something was true and when it changed. Mem0 combines vector search with graph memory to preserve how information connects across time.

These systems share a core insight: the agent needs permission and tooling to write to its own memory, not just read from a static store. Letta’s “sleep-time agents” run between sessions to consolidate fragmented memories, deduplicate, prune outdated information, and reorganize what’s left. The agent doesn’t just remember — it maintains what it remembers.

Karpathy’s recent “LLM Wiki” proposal describes a similar architecture from the opposite direction: structured markdown files maintained by an LLM as persistent memory. His framing — “you never write the wiki yourself” — makes sense for people starting from zero. But if you already have a second brain, the opportunity is different. The vault becomes a shared surface where both you and the agent write, not an archive the agent maintains for you. I wrote about what that distinction means in practice — the short version is that a wiki the LLM writes for you looks backward, but a vault you both work in looks forward.

That’s where I started. I already had the second brain — thousands of notes in an Obsidian vault, wikilinked together, semantically searchable. A knowledge graph built by hand over years. The question wasn’t how to build an LLM Wiki. It was why my agent couldn’t use the one I already had.

The vault wasn’t the problem. The problem was that my agent didn’t know how to use it.

What was failing

Claude Code can read files in my vault. It has semantic search. It can follow wikilinks. But in practice, it almost never did any of this unprompted. I’d ask a question about my family and the agent would either hallucinate an answer or ask me to provide context it was sitting on top of.

The issue was orientation. The vault has thousands of files. The agent had no sense of which ones mattered, which ones to read first, or how to navigate from a summary to the details. It was like handing someone the keys to a library with no catalog.

The fix: a thin orientation layer

The solution borrows directly from Letta’s memory architecture, but maps it onto what already exists.

Letta has three memory tiers. Core memory is small, always in context, and editable. Recall memory is conversation history. Archival memory is everything else, searchable on demand. An Obsidian vault already has the equivalent of recall (daily notes, session logs) and archival (the full vault, searchable). What was missing was core memory — a small, always-loadable layer that tells the agent who I am and where to look.

I created context files. They live in a dedicated folder and follow a strict format: terse markdown bullets, one fact per line, around 40 lines max per file. The frontmatter is minimal — just a title and dates. No routing metadata, no pointers to related files. Depth comes from wikilinks inline.

The early version tried to be clever about routing. Each file had a load-when field in frontmatter — a tag like “Personal context, planning, coaching” that the agent was supposed to match against the current conversation. I also tried a see-also field pointing to deeper vault files the agent could follow. In practice, neither worked. The agent never matched load-when unprompted: doing so meant globbing the folder, reading each file’s frontmatter, and deciding which tags applied. Too many steps before it could even start helping. And see-also was redundant, because the context files already contained inline wikilinks that served the same purpose.

The fix was simpler: list all five context files directly in CLAUDE.md with one-line summaries. Since CLAUDE.md is always in context, the agent sees the full menu immediately and loads what’s relevant without any intermediate steps. I eventually stripped load-when and see-also from the frontmatter entirely — dead weight that made the files look more sophisticated than they were.

Five context files cover everything the agent needs for orientation:

  • CTX-aboutme.md — Identity, family, friends, values, patterns
  • CTX-now.md — Current phase, location, daily life, scheduling
  • CTX-Work.md — Career, consulting, professional context
  • CTX-project-index.md — Active builds, experiments, on-hold work, repo paths
  • CTX-systems.md — Tools, services, automation, dev environment

Total: about 200 lines across five files. Down from 700+ lines across seven files that the agent mostly ignored.
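Concretely, the menu in CLAUDE.md can be a short markdown section like this (a sketch, not my literal file; the `Context/` folder name is an assumption):

```markdown
## Context files (load before answering)

Context lives in `Context/`. Load only what this conversation needs:

- [[CTX-aboutme]] — identity, family, friends, values, patterns
- [[CTX-now]] — current phase, location, daily life, scheduling
- [[CTX-Work]] — career, consulting, professional context
- [[CTX-project-index]] — active builds, experiments, on-hold work, repo paths
- [[CTX-systems]] — tools, services, automation, dev environment

To follow a wikilink, Glob for `Note Name.md` and check frontmatter aliases.
```

Because this section rides along in every session, the routing decision costs the agent zero extra file reads.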

  1. Every conversation starts here. CLAUDE.md is always in context — the agent sees it before reading anything else. It lists the five context files with one-line summaries.
  2. The agent picks what it needs for this conversation. If the topic is personal, it loads CTX-aboutme. If it's a build question, CTX-project-index. Most sessions need only one or two files.
  3. Then it digs deeper when it needs details. Wikilinks inside context files resolve to full notes — friend profiles, project files, repo paths. The vault provides the depth; the context files provide the map.

How the agent navigates depth

The context files are summaries. They don’t duplicate what’s in the vault — they point to it. My about-me file lists friends by name as wikilinks. Each link resolves to a note in a Personal CRM folder. If I ask “which friends haven’t I reached out to recently?”, the agent reads the friend list from the context file, follows the wikilinks to each person’s CRM note, checks the last dated entry, and answers.

The same pattern works everywhere. The work context file mentions a consulting client with key contacts listed as wikilinks. If the conversation goes deeper — upcoming meetings, project specifics — the agent follows the link to the full project file. The context file gives it enough to orient; the vault gives it the depth.

CLAUDE.md includes instructions for following wikilinks (use Glob to find the file, resolve aliases). This sounds trivial, but it wasn’t default behavior — the agent would see a [[wikilink]] and stop rather than follow it. Explicit instructions changed that.
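The resolution those instructions describe can be sketched as a small shell function (a hand-rolled illustration, not anything Claude Code ships; the alias check is a deliberate simplification that greps for a `- Alias` list entry rather than parsing YAML):

```shell
#!/bin/sh
# Resolve an Obsidian [[wikilink]] to a file path:
# match on filename first, then fall back to frontmatter aliases.
resolve_wikilink() {
  vault="$1"
  target="${2%%|*}"                          # [[Note|display]] -> "Note"
  hit=$(find "$vault" -name "$target.md" | head -n1)
  if [ -n "$hit" ]; then
    echo "$hit"
    return 0
  fi
  # Fallback: any note whose frontmatter lists the target as an alias.
  # Simplified: greps for a "- Alias" line anywhere in the file.
  grep -rl --include='*.md' -e "- $target" "$vault" | head -n1
}
```

The same two-step order (filename, then aliases) is what the CLAUDE.md instructions ask the agent to do with its own Glob and Read tools.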

The maintenance loop: /sleep

Context files go stale. Ages change, projects finish, situations shift. Letta solves this with sleep-time agents that consolidate memory asynchronously. I built a simpler version: a /sleep command that reviews and prunes the context files.

When you run /sleep, the agent:

  1. Reads every context file and the CLAUDE.md Quick Start section
  2. Checks for stale information, verbosity, contradictions between files, and gaps
  3. Reports what it found in a structured format
  4. Asks questions about anything it’s uncertain about before making changes
  5. Edits the files, tightens the prose, updates last-reviewed dates

It follows retention rules borrowed from Letta’s best practices: identity and preferences never expire, situational context gets reviewed when stale, dated events get pruned after they pass, and emotional or inner-landscape information doesn’t get changed without asking.

The key principle is prune over append. The goal is to keep the context files small, not to let them grow into the same bloated state that made the agent ignore them in the first place.
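Claude Code custom slash commands are plain markdown prompt files, so the whole loop above fits in one file. A sketch of what such a command file could look like (the path, wording, and timestamp location are illustrative assumptions, not my actual command):

```markdown
<!-- .claude/commands/sleep.md -->
Review and prune the context files in `Context/`:

1. Read every `CTX-*.md` file and the CLAUDE.md Quick Start section.
2. Flag stale facts, verbosity, contradictions between files, and gaps.
3. Report findings in a structured list: file, line, issue, proposed change.
4. Ask me about anything you are uncertain of before editing.
5. Apply edits, tighten prose, and update `last-reviewed` in frontmatter.

Retention rules: identity and preferences never expire; dated events are
pruned after they pass; never change emotional or inner-landscape lines
without asking. Prefer pruning over appending. When finished, run
`touch ~/.claude/last-sleep` to record the run.
```

Keeping the retention rules inside the command file means they travel with the prompt instead of living in my head.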

Remembering to sleep

The /sleep process can’t be automated. It requires judgment calls — the agent asks whether ages are still correct, whether a project has shipped, whether a stale line should be updated or removed. Someone has to answer those questions. That makes it a manual process, which means it has the failure mode of every manual process: you forget to do it, and the context files drift until the agent starts giving you slightly wrong answers.

The fix was building the nudge into the tool I’m already staring at. Claude Code has a configurable status line — a single shell command whose output renders at the bottom of every session. I added a sleep indicator that shows how many days since the last run. When I open a vault session and see 💤 12d ago, the reminder is ambient. No calendar event, no recurring task — just a number ticking up in the corner of my terminal until it bothers me enough to run /sleep.

The implementation is almost embarrassingly simple. The /sleep skill touches a timestamp file when it finishes. The status line reads that file’s modification time and computes the difference. If the file doesn’t exist, it shows “never” — which is its own kind of nudge.
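The whole thing could be a shell function like this (a sketch; the `~/.claude/last-sleep` path and the emoji are my conventions, and hooking it up via the status line setting in Claude Code's settings file is assumed, not shown):

```shell
#!/bin/sh
# Sleep indicator for the Claude Code status line:
# prints days since the /sleep skill last touched the timestamp file.
sleep_indicator() {
  stamp="$1"
  if [ ! -f "$stamp" ]; then
    echo "💤 never"            # never run — its own kind of nudge
    return 0
  fi
  now=$(date +%s)
  mtime=$(stat -c %Y "$stamp" 2>/dev/null || stat -f %m "$stamp")  # Linux / macOS
  echo "💤 $(( (now - mtime) / 86400 ))d ago"
}

sleep_indicator "$HOME/.claude/last-sleep"
```

All state lives in one file's modification time, which is why the /sleep command only needs a `touch` to update it.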

What this actually looks like

CLAUDE.md carries a few lines of essential identity — always in context, zero extra file reads needed:

- Owner: [Name]. Partner: [Name]. Daughter: [Name], 12.
- Currently in [City, Country]. Based in [Home City].

Below that, all five context files are listed with one-line summaries — the agent sees this every session without reading any files. Instructions tell it to load the relevant ones before answering, follow wikilinks for depth, and note new facts for /sleep to consolidate later.

The context files themselves look like compressed dossiers:

## Parents
- Dad: [Name], xx — [health conditions]
- Mom: [Name], xx — [health conditions]
- Both in [City]. Doctors are John Doe and Jane Doe.

Every line is hand-editable. No proprietary format, no database, no external service. Just markdown files in a folder you can open in Obsidian and fix with your thumbs.

What I haven’t figured out yet

The agent doesn’t proactively update context files during a conversation. If I mention that a project shipped or a friend moved cities, the agent acknowledges it but doesn’t write it back. Right now, /sleep is the only write path — it’s manual, not automatic. Letta’s approach of triggering memory updates every N conversation turns is more robust. Whether that’s worth building or whether periodic manual consolidation is good enough, I don’t know yet.

There’s also a question about how many context files is the right number. Five feels right now — it grew from three as I realized the agent needed separate files for projects and systems/tooling to avoid overloading the others. The Letta people recommend 15-25 small files with deep hierarchy. That feels like overkill for personal context, but I haven’t tested the boundary.

The deeper question is whether the agent’s orientation should be mostly top-down (context files tell it where to look) or bottom-up (semantic search surfaces what’s relevant). Right now it’s top-down with search as fallback. But the vault already has semantic search that works well. Maybe the context files will eventually shrink to almost nothing — just enough to bootstrap the agent’s sense of who it’s talking to — and everything else comes through search. I’m not sure yet.

The hardest part of maintaining context files turned out to be remembering to do it at all. Turning that from a calendar reminder into an ambient signal (the sleep indicator ticking up in the status line) made the difference between running /sleep weekly and forgetting for months.
