Architecture
OmniData is a layered system. The directory bundle sits at the bottom. Everything else builds on top.
┌─────────────────────────────────────────────────┐
│ Surfaces │
│ Chrome Extension │ CLI │ MCP Server │ App │
├─────────────────────────────────────────────────┤
│ Runtime │
│ Adapters │ Chunker │ Embedder │ Search │
├─────────────────────────────────────────────────┤
│ Directory Bundle │
│ .omnidata/ (v0.2.0) │
│ │
│ manifest.json identity + schema version │
│ index.db resources, chunks, FTS5, │
│ embeddings, queue, kv │
│ memory.db collections, edges, tags, │
│ memory records │
│ adapters.json adapter registry + state │
│ ingress.log append-only ingest log │
│ blobs/ content-addressed files │
│ (SHA-256 fanout dirs) │
└─────────────────────────────────────────────────┘
Three layers
1. Directory bundle (the contract)
The .omnidata directory is a bundle containing multiple files, each with a focused responsibility:
| File | Purpose |
|---|---|
| manifest.json | Identity: who owns this container, what hat it belongs to, schema version |
| index.db | Search layer: resources, chunks with embeddings, FTS5 indexes, queue, key-value store |
| memory.db | Graph layer: collections, edges, tags, memory records with emotional salience |
| blobs/ | Content-addressed filesystem storage with SHA-256 fanout |
| adapters.json | What adapters feed this instance, their state and configuration |
| ingress.log | Append-only log of every ingest operation for debugging and replay |
The two databases have distinct concerns. index.db handles everything needed to find content: resource metadata, text chunks, vector embeddings, and full-text search indexes. memory.db handles everything needed to organize and relate content: hierarchies, graphs, and structured knowledge.
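As a rough illustration of treating the bundle layout as a contract, the sketch below checks a directory for the entries listed above. The entry names come from this page; the function name `missing_entries` is hypothetical and not part of any OmniData API, and the check says nothing about the contents or schema of each file.

```python
from pathlib import Path

# Entry names taken from the bundle layout described above.
EXPECTED_FILES = ("manifest.json", "index.db", "memory.db", "adapters.json", "ingress.log")
EXPECTED_DIRS = ("blobs",)


def missing_entries(bundle: Path) -> list[str]:
    """Return the expected bundle entries that are absent from `bundle`.

    Hypothetical helper for illustration; it only checks presence,
    not whether each file is valid.
    """
    missing = [name for name in EXPECTED_FILES if not (bundle / name).is_file()]
    missing += [name + "/" for name in EXPECTED_DIRS if not (bundle / name).is_dir()]
    return missing
```

An empty result means every entry in the table is present; anything else names what a tool would need to create or repair before using the bundle.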
2. Runtime (the engine)
The runtime is a library that operates on .omnidata bundles:
- Adapter orchestration: Scheduling syncs, managing watermarks, enqueuing new content
- Text chunking: Splitting documents into segments (sliding window for text, tree-sitter for code)
- Embedding: Generating vector representations via models like Nomic embed-text
- RRF search: Combining vector similarity and full-text search with Reciprocal Rank Fusion
- Blob management: Content-addressed filesystem storage with SHA-256 naming and fanout directories
- Pipeline promotion: Moving resources from bronze (raw) to silver (chunked) to gold (searchable)
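The RRF step in the list above can be sketched in a few lines. This is a generic Reciprocal Rank Fusion, not OmniData's actual implementation: the function name `rrf_fuse` is hypothetical, and the constant `k = 60` is the value commonly used in the RRF literature, assumed here rather than taken from the runtime.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse several ranked lists of ids with Reciprocal Rank Fusion.

    Each id scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k = 60 is an assumed default.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties broken by id so the order is deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Illustrative chunk ids: one list from vector search, one from full-text search.
vector_hits = ["c3", "c1", "c2"]
fts_hits = ["c1", "c4", "c3"]
fused = rrf_fuse([vector_hits, fts_hits])
# Fused order: c1, c3, c4, c2
```

Because RRF uses only ranks, it needs no score normalization between the vector and full-text result lists, which is what makes it a convenient way to combine the two.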
3. Surfaces (the interfaces)
Surfaces are applications that use the runtime:
- CLI: omnidata init, omnidata search, omnidata ingest
- MCP server: Exposes OmniData as tools for AI agents
- Chrome extension: Browser capture into the active .omnidata instance
- Desktop app: Native macOS/Windows application
- API server: Thin HTTP layer for consumers that can’t access the filesystem
Data flow
Source (browser, voice, filesystem, messages)
│
▼
Adapter (discovers new/changed items)
│
▼
Queue (index.db — pending work items)
│
▼
Worker (dequeues, reads content, chunks, embeds)
│
├──▶ blobs/ (raw binary → filesystem, SHA-256 named)
├──▶ index.db (resource metadata, text chunks + embeddings, FTS5)
├──▶ memory.db (collections, edges, relationships)
└──▶ ingress.log (append-only operation record)
│
▼
Search (RRF: vector + FTS5 → fused results)
Binary content flows to the filesystem as content-addressed blobs. Metadata and searchable text flow to index.db. Organizational structure and relationships flow to memory.db. Every operation is recorded in ingress.log.
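To make the blob path concrete, here is a minimal sketch of content-addressed storage with SHA-256 naming. The fanout shape (two directory levels of two hex characters each) is an assumption, since this page does not specify the depth, and `blob_path` and `store_blob` are hypothetical names rather than runtime functions.

```python
import hashlib
from pathlib import Path


def blob_path(blobs_dir: Path, digest: str) -> Path:
    """Map a hex digest to its file path.

    Assumed fanout: blobs/ab/cd/abcd..., which keeps any single
    directory from holding too many files.
    """
    return blobs_dir / digest[:2] / digest[2:4] / digest


def store_blob(blobs_dir: Path, data: bytes) -> Path:
    """Write `data` under its SHA-256 digest and return the path."""
    digest = hashlib.sha256(data).hexdigest()
    target = blob_path(blobs_dir, digest)
    if not target.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        # Write to a temporary name, then rename, so a crash never
        # leaves a partial file under the final digest name.
        tmp = target.with_name(target.name + ".tmp")
        tmp.write_bytes(data)
        tmp.replace(target)
    return target
```

Storing the same bytes twice returns the same path and writes nothing the second time, so deduplication falls out of the naming scheme.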
Instance isolation
Each .omnidata bundle is fully independent. A person may have many:
~/.local/share/eidosomni/instances/
├── director-of-ai.omnidata/
│ ├── manifest.json
│ ├── index.db
│ ├── memory.db
│ ├── blobs/
│ ├── adapters.json
│ └── ingress.log
├── eidos.omnidata/
├── greenmark.omnidata/
├── health.omnidata/
├── financial-manager.omnidata/
└── builder.omnidata/
Each instance has its own adapters, search index, blob store, and content. Moving to a new machine means copying the directories. Because .omnidata is a bundle, Finder (macOS) treats each one as a single item – drag, drop, zip, or back up as one unit.
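Since an instance is just a directory, listing and copying instances needs nothing beyond the standard library. The sketch below assumes the instances root shown above and that no process has the bundle open while it is copied; `list_instances` and `copy_instance` are hypothetical helper names.

```python
import shutil
from pathlib import Path


def list_instances(root: Path) -> list[str]:
    """Return the names of the .omnidata bundles directly under `root`."""
    return sorted(p.name for p in root.glob("*.omnidata") if p.is_dir())


def copy_instance(bundle: Path, dest_root: Path) -> Path:
    """Copy one bundle, with everything inside it, into `dest_root`.

    Assumes the bundle is not being written to during the copy.
    Raises FileExistsError if the destination bundle already exists.
    """
    dest_root.mkdir(parents=True, exist_ok=True)
    target = dest_root / bundle.name
    shutil.copytree(bundle, target)
    return target
```

For example, `list_instances(Path.home() / ".local/share/eidosomni/instances")` would return the bundle names shown in the tree above on a machine laid out that way.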
Federation (future)
A future Meta-Omni layer will enable searching across multiple .omnidata instances. Each bundle must work alone first. Federation is additive, never required.