Saturday, April 25, 2026
Own Your Data: Organise It Into a Database for AI Agents
If AI agents are going to mediate more of your work, you should own the context they read from.
That is the whole argument.
The useful thing an agent has is a context window. If the model provider owns the memory system that fills that window, they own too much of your working context.
The better pattern is to own a local context engine that agents can read from and write back to.
The context window is the whole game
People keep talking about AI memory as if the model is forming a stable human-like relationship with them.
That is not what is happening.
There is a context window. The system decides what to put in it. The model responds based on that context.
That means the durable asset is not the chat interface. It is not even the model. It is the corpus of context that gets retrieved and placed in front of the model at the right time.
Your projects. Your research. Your beliefs. Your decisions. Your source material. Your questions. Your mistakes. Your relationships. The state of your work.
That should not be trapped inside one provider's hidden memory feature.
Provider-owned memory is too small
ChatGPT and Claude remembering things about you is useful. I am not pretending otherwise.
But it is the wrong place to stop.
Those memory systems are shallow compared with the full texture of your work. They decide what matters. They store it in their format. They surface it inside their product. If you switch tools, the context does not cleanly come with you.
That is a problem if you believe agents will become a serious interface for your work.
The agent layer should be able to change. The context layer should remain yours.
Files are a good instinct, but not enough
The local-first instinct is correct.
You should be suspicious of handing your working context to a cloud product by default. A folder of markdown files is a much better starting point than a closed SaaS memory box.
But once agents become the primary user, files and folders start to show their limits.
Agents need to search, retrieve, update, and connect things. They need to know what a thing is, where it came from, and how it relates to everything else you have stored. They need to handle long source material without loading the whole world into context.
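To make the last point concrete, here is a minimal sketch of one common way to keep long material out of the context window: split it into overlapping chunks so an agent can pull back only the pieces it needs. The function name `chunk_text` and the default sizes are assumptions for illustration, not anything RA-H prescribes.

```python
def chunk_text(text: str, max_chars: int = 800, overlap: int = 100) -> list[str]:
    """Split text into overlapping windows of at most max_chars characters."""
    if max_chars <= 0 or not 0 <= overlap < max_chars:
        raise ValueError("need max_chars > 0 and 0 <= overlap < max_chars")
    chunks = []
    step = max_chars - overlap  # how far each new window advances
    for start in range(0, len(text), step):
        chunks.append(text[start:start + max_chars])
        if start + max_chars >= len(text):
            break  # this window already reached the end of the text
    return chunks


# Hypothetical usage: a long transcript becomes a handful of retrievable pieces.
transcript = "".join(str(i % 10) for i in range(2000))
pieces = chunk_text(transcript)
```

Character windows are the crudest option. Splitting on paragraphs or speaker turns would usually give cleaner chunks, at the cost of more code.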
You can build all of that on top of a file system.
But at that point you are basically building a database, slowly and badly.
Why SQLite works
SQLite is boring in the best way.
It is local. It is portable. It is inspectable. It supports schema, indexes, full-text search, and ordinary queries. You can store it as one file on your machine, but still give agents a much cleaner interface than a folder of notes.
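As a rough illustration of how little it takes, the sketch below uses Python's built-in `sqlite3` module to create a table, an ordinary index, and a full-text index, then runs a search. The table and column names are made up for the example, and it assumes your SQLite build includes the FTS5 extension (most Python distributions ship with it, but that is not guaranteed).

```python
import sqlite3

# ":memory:" keeps the example self-contained; a file path would give one portable file.
conn = sqlite3.connect(":memory:")

conn.executescript("""
CREATE TABLE notes (
    id    INTEGER PRIMARY KEY,
    kind  TEXT NOT NULL,
    title TEXT NOT NULL,
    body  TEXT NOT NULL
);
CREATE INDEX idx_notes_kind ON notes(kind);
-- Full-text index over the same rows (requires FTS5).
CREATE VIRTUAL TABLE notes_fts USING fts5(title, body);
""")

rows = [
    ("decision", "Use SQLite for context",
     "We chose a local database over a folder of markdown files."),
    ("source", "Podcast on agent memory",
     "Transcript discussing how prompts get assembled."),
]
for kind, title, body in rows:
    cur = conn.execute(
        "INSERT INTO notes (kind, title, body) VALUES (?, ?, ?)", (kind, title, body)
    )
    # Keep the full-text row aligned with the base row via rowid.
    conn.execute(
        "INSERT INTO notes_fts (rowid, title, body) VALUES (?, ?, ?)",
        (cur.lastrowid, title, body),
    )
conn.commit()


def search(query: str) -> list[str]:
    """Return titles of notes matching a full-text query, best match first."""
    sql = """
        SELECT n.title
        FROM notes_fts
        JOIN notes AS n ON n.id = notes_fts.rowid
        WHERE notes_fts MATCH ?
        ORDER BY rank
    """
    return [row[0] for row in conn.execute(sql, (query,))]
```

Calling `search("markdown")` finds the decision note by a word in its body, which is the kind of lookup a folder of files only gives you through grep.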
RA-H uses that boring foundation for a context graph.
The basic schema is simple:
- nodes: the atomic units of context
- edges: explicit relationships between nodes
- source: the material the node came from or preserves
- indexes and embeddings: retrieval surfaces for exact and semantic search
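One way to picture that list is as a handful of SQLite tables. The sketch below is an assumption about how such a schema could look, not RA-H's actual schema: every table name, column name, and the `open_graph` helper are hypothetical.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS nodes (
    id          INTEGER PRIMARY KEY,
    kind        TEXT NOT NULL,   -- e.g. 'podcast', 'insight', 'project'
    title       TEXT NOT NULL,
    description TEXT NOT NULL,   -- plain statement of what the node is
    source      TEXT,            -- original material, transcript, link, or summary
    created_at  TEXT NOT NULL DEFAULT (datetime('now'))
);
CREATE TABLE IF NOT EXISTS edges (
    id          INTEGER PRIMARY KEY,
    from_id     INTEGER NOT NULL REFERENCES nodes(id) ON DELETE CASCADE,
    to_id       INTEGER NOT NULL REFERENCES nodes(id) ON DELETE CASCADE,
    explanation TEXT NOT NULL    -- how the two nodes relate
);
CREATE TABLE IF NOT EXISTS chunks (
    id        INTEGER PRIMARY KEY,
    node_id   INTEGER NOT NULL REFERENCES nodes(id) ON DELETE CASCADE,
    position  INTEGER NOT NULL,  -- order of the chunk within the source
    body      TEXT NOT NULL,
    embedding BLOB               -- optional vector for semantic search
);
CREATE INDEX IF NOT EXISTS idx_edges_from ON edges(from_id);
CREATE INDEX IF NOT EXISTS idx_edges_to ON edges(to_id);
CREATE INDEX IF NOT EXISTS idx_chunks_node ON chunks(node_id, position);
"""


def open_graph(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a context graph and make sure the tables exist."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
    conn.executescript(SCHEMA)
    return conn


conn = open_graph()
```

Storing the explanation on the edge itself is a design choice in this sketch: it lets an agent read why two nodes are linked without opening either one.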
This is enough to do a lot.
An agent can find a source, search inside its chunks, create an insight node, link the insight back to the source, and then use that new node in future work.
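Here is that sequence as a small, self-contained sketch. The tables are a stripped-down, hypothetical version of a node/edge/chunk schema, the search uses a plain `LIKE` match to keep the example short, and the helper names `add_node` and `insights_from` are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE nodes  (id INTEGER PRIMARY KEY, kind TEXT, title TEXT, description TEXT);
CREATE TABLE edges  (from_id INTEGER, to_id INTEGER, explanation TEXT);
CREATE TABLE chunks (node_id INTEGER, position INTEGER, body TEXT);
""")


def add_node(kind: str, title: str, description: str) -> int:
    cur = conn.execute(
        "INSERT INTO nodes (kind, title, description) VALUES (?, ?, ?)",
        (kind, title, description),
    )
    return cur.lastrowid


# Seed one source with two chunks (made-up content).
source_id = add_node(
    "podcast",
    "Interview on local-first software",
    "Podcast transcript about keeping working data on your own machine.",
)
conn.executemany("INSERT INTO chunks VALUES (?, ?, ?)", [
    (source_id, 0, "The host introduces the guest and the topic."),
    (source_id, 1, "A single SQLite file is easy to back up and move between tools."),
])

# 1. Find the source.
found = conn.execute(
    "SELECT id FROM nodes WHERE kind = 'podcast' AND title LIKE ?", ("%local-first%",)
).fetchone()[0]

# 2. Search inside its chunks instead of loading the whole transcript.
hits = conn.execute(
    "SELECT position, body FROM chunks WHERE node_id = ? AND body LIKE ? ORDER BY position",
    (found, "%SQLite%"),
).fetchall()

# 3. Create an insight node from what was found.
insight_id = add_node(
    "insight",
    "Portability comes from the single file",
    "Insight: keeping context in one SQLite file makes it easy to move between tools.",
)

# 4. Link the insight back to the source it came from.
conn.execute(
    "INSERT INTO edges VALUES (?, ?, ?)",
    (insight_id, found, "Derived from chunk 1 of the podcast transcript."),
)
conn.commit()


# 5. In future work, the insight is reachable from the source.
def insights_from(source_node_id: int) -> list[str]:
    sql = """
        SELECT n.title FROM edges AS e
        JOIN nodes AS n ON n.id = e.from_id
        WHERE e.to_id = ? AND n.kind = 'insight'
    """
    return [row[0] for row in conn.execute(sql, (source_node_id,))]
```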
That is the read/write loop that matters.
Nodes need to be explicit
The quality of the graph depends on the quality of the nodes.
Each node should have a clear title and an extremely explicit description. Do not make the model infer what something is from a cute title or a raw excerpt. Tell it plainly.
If it is a podcast, say it is a podcast. If it is an insight, say what the insight is. If it is a project, say what the project is and why it matters now.
The source can be the original material, a transcript, a link, a note, or a concise summary. The edge explains how this node connects to another node.
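To show the difference between a vague node and an explicit one, here is a rough sketch of a check you could run before saving. The `Node` shape, the `explicitness_problems` helper, and the two heuristics (a minimum word count, and the description naming its own kind) are all assumptions for illustration. They are a crude stand-in for judgment, not a rule from RA-H.

```python
from dataclasses import dataclass


@dataclass
class Node:
    kind: str
    title: str
    description: str


def explicitness_problems(node: Node, min_words: int = 8) -> list[str]:
    """Return reasons a node might be too vague for an agent to use."""
    problems = []
    if not node.kind.strip():
        problems.append("kind is empty")
    if not node.title.strip():
        problems.append("title is empty")
    words = node.description.split()
    if len(words) < min_words:
        problems.append(
            f"description has {len(words)} words, expected at least {min_words}"
        )
    # Heuristic: the description should say plainly what kind of thing this is.
    if node.kind.strip() and node.kind.lower() not in node.description.lower():
        problems.append(f"description never says this is a {node.kind}")
    return problems


vague = Node("podcast", "The big one", "Great episode.")
explicit = Node(
    "podcast",
    "Interview on agent memory",
    "Podcast episode in which the guest argues that retrieval, not the model, "
    "determines what an agent appears to remember.",
)
```

The vague node fails both heuristics, while the explicit one passes with no problems reported.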
This is how you make a graph usable for agents.
It is not about hoarding data. It is about making the data legible.
Do not outsource the writeback
Reading context is only half of it.
The more important behavior is writeback.
When an agent learns something important, makes a decision, resolves a contradiction, finds a useful source, or creates an insight, that should not die in the chat transcript.
It should become durable context.
That is the design philosophy behind RA-H: agents should continually read from and write back to a context database you own.
Not everything belongs in the graph. Most chat should disappear. But the important stuff should be captured as explicit nodes and edges.
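A minimal sketch of that filter, assuming the agent emits a list of events at the end of a session and that each event carries a `kind`. The event format, the `DURABLE_KINDS` set, and the `write_back` function are hypothetical. The only point is that most of the session is dropped and the rest becomes nodes and edges.

```python
import sqlite3

# Assumed set of event kinds worth keeping; everything else is discarded.
DURABLE_KINDS = {"decision", "insight", "source", "resolved_contradiction"}


def write_back(conn: sqlite3.Connection, events: list[dict], related_to=None) -> list[int]:
    """Persist only durable events as nodes; optionally link each to an existing node."""
    new_ids = []
    with conn:  # commits on success, rolls back if anything raises
        for event in events:
            if event["kind"] not in DURABLE_KINDS:
                continue  # ordinary chat is allowed to disappear
            cur = conn.execute(
                "INSERT INTO nodes (kind, title, description) VALUES (?, ?, ?)",
                (event["kind"], event["title"], event["description"]),
            )
            new_ids.append(cur.lastrowid)
            if related_to is not None:
                conn.execute(
                    "INSERT INTO edges (from_id, to_id, explanation) VALUES (?, ?, ?)",
                    (cur.lastrowid, related_to,
                     event.get("why", "Captured during work on this node.")),
                )
    return new_ids


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE nodes (id INTEGER PRIMARY KEY, kind TEXT, title TEXT, description TEXT);
CREATE TABLE edges (from_id INTEGER, to_id INTEGER, explanation TEXT);
""")
project_id = conn.execute(
    "INSERT INTO nodes (kind, title, description) VALUES (?, ?, ?)",
    ("project", "Newsletter relaunch", "Project: relaunch the newsletter this quarter."),
).lastrowid

session = [
    {"kind": "chatter", "title": "Greeting", "description": "Small talk."},
    {"kind": "decision", "title": "Publish weekly",
     "description": "Decision: publish once a week, not daily.",
     "why": "Decided while planning the relaunch."},
    {"kind": "insight", "title": "Short issues get read",
     "description": "Insight: past short issues had the most replies."},
]
saved = write_back(conn, session, related_to=project_id)
```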
The point is not the app
The app is just one interface.
The durable thing is the context substrate.
You should be able to inspect it through a UI, query it through MCP, update it with an agent, and eventually move it between tools. The point is not to lock yourself into RA-H. The point is to stop locking your working context into model-provider memory systems.
That is why the open-source repo matters.
It gives people a way to try the architecture directly: local SQLite, graph-shaped context, MCP tools, hybrid retrieval, and explicit writeback.
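"Hybrid retrieval" can mean several things. One common approach, shown here purely as an illustration and not as a description of how the repo does it, is reciprocal rank fusion: run a keyword search and a semantic search separately, then merge the two ranked lists. The node ids below are made up, and `k = 60` is the constant conventionally used with this technique.

```python
def reciprocal_rank_fusion(rankings: list[list[int]], k: int = 60) -> list[int]:
    """Merge several ranked lists of node ids into one list, best first."""
    scores: dict[int, float] = {}
    for ranking in rankings:
        for rank, node_id in enumerate(ranking, start=1):
            # Each list contributes more for items it ranks higher.
            scores[node_id] = scores.get(node_id, 0.0) + 1.0 / (k + rank)
    # Sort by score descending, breaking ties by id for a stable order.
    return sorted(scores, key=lambda node_id: (-scores[node_id], node_id))


keyword_hits = [3, 1, 7]   # hypothetical ids from full-text search
semantic_hits = [1, 9, 3]  # hypothetical ids from embedding search
fused = reciprocal_rank_fusion([keyword_hits, semantic_hits])
```

Nodes that appear in both lists rise to the top, so a result that is both an exact match and semantically close beats one that is only either.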
What to do next
Start with one project.
Add the important sources. Create explicit nodes for the ideas, decisions, and people that matter. Link them carefully. Then make your agent read from the graph before it works and write back when something worth preserving happens.
Do not try to capture everything.
Capture the things that would make the next interaction smarter.
That is the whole system.
If you want the quick path, use the Mac app. If you want to run it yourself, clone the open-source repo.
Your data should be organised for the agents you are actually going to use.
And it should belong to you.