Multi-Provider Chats

May 16, 2026

Your codebase, chats, and instruction files. Embedded once, queryable forever.

Most "AI for code" tools re-embed your repo every session, charge you per token to do it, and throw the index away when the chat ends. You pay for the same context window, over and over, and the model still doesn't remember what you discussed last Tuesday.

The RAG Index is the layer that fixes this. It maintains persistent embeddings across three sources that normally live in completely different places: your code (AST-aware, not just text chunks), your chat history (ingested from Claude Code, Copilot, and other CLI agents), and the instruction files generated for each package in your repo.
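To make "embedded once" concrete, here is a minimal sketch of a persistent store that skips anything it has already embedded. It is an illustration under stated assumptions, not the RAG Index's actual implementation: `PersistentIndex`, `toy_embed`, the SQLite schema, and the content-hash keying are all hypothetical, and the embedding function is a hashed bag of words standing in for a real model.

```python
import hashlib
import json
import sqlite3

# The three sources named above; the string labels are assumptions.
SOURCES = ("code", "chat", "instructions")


def toy_embed(text: str, dims: int = 8) -> list[float]:
    """Hypothetical stand-in for a real embedding model: hashed bag of words."""
    vec = [0.0] * dims
    for word in text.lower().split():
        bucket = int(hashlib.sha256(word.encode()).hexdigest(), 16) % dims
        vec[bucket] += 1.0
    return vec


class PersistentIndex:
    """Stores one embedding per unique (source, text) pair in SQLite."""

    def __init__(self, path: str = ":memory:"):
        # Pass a file path instead of ":memory:" to keep the index across sessions.
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS chunks ("
            "digest TEXT PRIMARY KEY, source TEXT, text TEXT, vector TEXT)"
        )
        self.embed_calls = 0  # counts how often we actually pay for an embedding

    def add(self, source: str, text: str) -> bool:
        """Embed and store a chunk; return False if it was already indexed."""
        if source not in SOURCES:
            raise ValueError(f"unknown source: {source}")
        digest = hashlib.sha256((source + ":" + text).encode()).hexdigest()
        row = self.db.execute(
            "SELECT 1 FROM chunks WHERE digest = ?", (digest,)
        ).fetchone()
        if row is not None:
            return False  # unchanged content is never re-embedded
        self.embed_calls += 1
        self.db.execute(
            "INSERT INTO chunks VALUES (?, ?, ?, ?)",
            (digest, source, text, json.dumps(toy_embed(text))),
        )
        self.db.commit()
        return True

    def count(self) -> int:
        """Number of chunks currently stored."""
        return self.db.execute("SELECT COUNT(*) FROM chunks").fetchone()[0]
```

Keying each chunk on a hash of its content means a second pass over an unchanged repo costs a lookup per chunk rather than an embedding call, which is the property the paragraph above describes. AST-aware chunking of code is out of scope for this sketch.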

Retrieval happens on demand. Agents query the index when they actually need context, not preemptively at the start of every session. That keeps token costs predictable and keeps the agent grounded in your decisions rather than its best guess at what's in a file.
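On-demand retrieval can be sketched as a lookup the agent calls only when a turn needs context. Everything here is hypothetical and simplified: `Index`, `answer`, and the `needs_context` flag are illustrative names, and word-count cosine similarity stands in for real embedding search.

```python
import math
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for an embedding: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class Index:
    """Holds pre-embedded chunks from all sources and counts queries."""

    def __init__(self, chunks: list[tuple[str, str]]):
        self.entries = [(source, text, toy_embed(text)) for source, text in chunks]
        self.queries = 0

    def retrieve(self, question: str, k: int = 2) -> list[tuple[str, str]]:
        """Return the k chunks most similar to the question."""
        self.queries += 1
        q = toy_embed(question)
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[2]), reverse=True)
        return [(source, text) for source, text, _ in ranked[:k]]


def answer(index: Index, question: str, needs_context: bool) -> dict:
    """Query the index only when the turn needs context (assumed agent policy)."""
    context = index.retrieve(question, k=1) if needs_context else []
    return {"question": question, "context": context}


index = Index([
    ("code", "def connect(): open the sqlite database"),
    ("chat", "we decided to use sqlite instead of postgres"),
    ("instructions", "run tests with pytest before committing"),
])
small_talk = answer(index, "hello", needs_context=False)
grounded = answer(index, "why did we use sqlite instead of postgres", needs_context=True)
```

In this sketch the greeting never touches the index, while the design question pulls back the earlier chat decision as its top match. How a real agent decides that a turn needs context is left open here; the boolean flag is only a placeholder for that decision.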