Resources
The omnidata_resources table is the central registry of an OmniData container. It lives in index.db inside the .omnidata bundle. Every piece of content — a web page, a file, an email, a voice recording, a screenshot — gets exactly one row in this table, identified by its URI.
Schema
-- index.db
CREATE TABLE omnidata_resources (
    id TEXT PRIMARY KEY,
    uri TEXT NOT NULL UNIQUE,
    source TEXT NOT NULL,
    resource_type TEXT NOT NULL,
    title TEXT,
    content_hash TEXT,
    byte_size INTEGER,
    mime_type TEXT,
    resource_at TEXT,
    pipeline_state TEXT NOT NULL DEFAULT 'bronze',
    metadata TEXT DEFAULT '{}',
    created_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ', 'now')),
    updated_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ', 'now')),
    deleted_at TEXT
);
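As a quick way to see the column defaults in action, the following sketch loads the schema into an in-memory SQLite database (a stand-in for index.db) using Python's standard sqlite3 module and inserts a row that supplies only the required columns. The sample URI and adapter name are illustrative values, not part of the specification.

```python
import sqlite3
import uuid

# Schema as given in this section.
SCHEMA = """
CREATE TABLE omnidata_resources (
    id TEXT PRIMARY KEY,
    uri TEXT NOT NULL UNIQUE,
    source TEXT NOT NULL,
    resource_type TEXT NOT NULL,
    title TEXT,
    content_hash TEXT,
    byte_size INTEGER,
    mime_type TEXT,
    resource_at TEXT,
    pipeline_state TEXT NOT NULL DEFAULT 'bronze',
    metadata TEXT DEFAULT '{}',
    created_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ', 'now')),
    updated_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ', 'now')),
    deleted_at TEXT
);
"""

# An in-memory database stands in for index.db inside the bundle.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.executescript(SCHEMA)

# Supply only the four columns that have no default and are NOT NULL.
resource_id = str(uuid.uuid4())
conn.execute(
    "INSERT INTO omnidata_resources (id, uri, source, resource_type) "
    "VALUES (?, ?, ?, ?)",
    (resource_id, "file:///path/to/doc.pdf", "filesystem", "document"),
)

row = conn.execute(
    "SELECT * FROM omnidata_resources WHERE id = ?", (resource_id,)
).fetchone()

# Columns with defaults are filled in by SQLite; the rest stay NULL.
for column in row.keys():
    print(f"{column:15} {row[column]!r}")
```

The remaining nullable columns (title, content_hash, byte_size, mime_type, resource_at, deleted_at) come back as NULL until an adapter or pipeline stage fills them in.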
Columns
id
Text (UUID v4). Primary key. Generated at creation, never reused.
uri
Text. The unique identifier for this resource in its source system. Uses source-specific URI schemes: file:///path/to/doc.pdf, tosh://+15551234567/msg-uuid, chrome-capture://2026-03-28T10:00:00Z. The URI is the natural key — if two adapters produce the same URI, they refer to the same resource.
source
Text. The name of the adapter that created this resource (e.g., "filesystem", "tosh", "chrome-capture"). Used to route read_content() calls back to the correct adapter during pipeline promotion.
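The routing described above can be sketched as a lookup from the source value to an adapter. Everything here is hypothetical: the ADAPTERS registry, the register_adapter helper, and the read_content(source, uri) signature are illustrative assumptions, since this section does not define the adapter interface.

```python
from typing import Callable, Dict

# Hypothetical registry: adapter name (the `source` column) -> reader function.
ADAPTERS: Dict[str, Callable[[str], bytes]] = {}


def register_adapter(name: str, reader: Callable[[str], bytes]) -> None:
    """Associate an adapter name with a function that reads raw content."""
    ADAPTERS[name] = reader


def read_content(source: str, uri: str) -> bytes:
    """Route a read to the adapter named in the row's `source` column."""
    try:
        reader = ADAPTERS[source]
    except KeyError:
        raise LookupError(f"no adapter registered for source {source!r}") from None
    return reader(uri)


# Illustrative adapter that returns canned bytes instead of touching disk.
register_adapter("filesystem", lambda uri: f"contents of {uri}".encode())

print(read_content("filesystem", "file:///path/to/doc.pdf"))
```

Failing loudly on an unknown source is a design choice of this sketch; a real implementation might instead skip the resource and retry later.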
resource_type
Text. A coarse classification: "document", "message", "image", "audio", "video", "webpage", "note", "code". Implementations may extend this set.
title
Text, nullable. A human-readable title for display. May be the filename, email subject, message preview, or page title.
content_hash
Text, nullable. SHA-256 hash of the resource’s raw content. Used to locate the corresponding file in the blobs/ directory (content-addressed filesystem storage). NULL if no raw content has been stored (metadata-only resources).
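A minimal sketch of content addressing, assuming the hash is stored as a lowercase hex digest and that blobs live flat at blobs/<digest>. Neither the encoding nor the directory layout is specified in this section, and the helper names are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Tuple


def blob_path(bundle: Path, digest: str) -> Path:
    # Assumption: a flat layout, blobs/<hex digest>. The real layout may differ.
    return bundle / "blobs" / digest


def store_blob(bundle: Path, raw: bytes) -> Tuple[str, int]:
    """Write raw content under its SHA-256 digest; return (content_hash, byte_size)."""
    digest = hashlib.sha256(raw).hexdigest()
    path = blob_path(bundle, digest)
    path.parent.mkdir(parents=True, exist_ok=True)
    if not path.exists():  # identical content is written only once
        path.write_bytes(raw)
    return digest, len(raw)


with tempfile.TemporaryDirectory() as tmp:
    bundle = Path(tmp) / "example.omnidata"
    digest, size = store_blob(bundle, b"hello")
    print(digest, size)
    print(blob_path(bundle, digest).read_bytes())
```

The two returned values correspond to the content_hash and byte_size columns; two resources with identical raw content would share one blob file under this scheme.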
byte_size
Integer, nullable. Size of the raw content in bytes. Used for storage accounting and display.
mime_type
Text, nullable. IANA media type of the raw content (e.g., "application/pdf", "image/png", "text/plain").
resource_at
Text, nullable (ISO 8601 UTC). The source’s own timestamp for this content — when the message was sent, when the file was last modified, when the page was captured. Distinct from created_at, which records when OmniData ingested it.
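An adapter might normalize source timestamps along these lines. The sketch assumes resource_at uses the same second-precision, Z-suffixed format as the created_at default; the section only requires ISO 8601 UTC, and the to_iso_utc helper is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def to_iso_utc(dt: datetime) -> str:
    """Format an aware datetime as e.g. '2026-03-28T10:00:00Z'."""
    if dt.tzinfo is None:
        # Guessing a zone for naive values would silently shift the time.
        raise ValueError("naive datetime; attach a timezone first")
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# A message sent at 12:00 in a UTC+2 zone is stored as 10:00 UTC.
sent = datetime(2026, 3, 28, 12, 0, 0, tzinfo=timezone(timedelta(hours=2)))
print(to_iso_utc(sent))  # 2026-03-28T10:00:00Z
```

Because the format is fixed-width and always in UTC, plain string comparison on these values orders them chronologically, which keeps ORDER BY resource_at meaningful on a TEXT column.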
pipeline_state
Text. One of "bronze", "silver", or "gold"; defaults to "bronze". Tracks how far this resource has been processed. See Pipeline States for details.
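The schema as shown has no CHECK constraint on this column, so SQLite will accept any text value and validation falls to application code. The sketch below is one way to do that. The ordering bronze < silver < gold is an assumption inferred from "how far this resource has been processed"; the actual promotion rules are defined under Pipeline States, and the helper names are hypothetical.

```python
# Assumed processing order, least to most processed.
PIPELINE_STATES = ("bronze", "silver", "gold")


def validate_state(state: str) -> str:
    """Reject values outside the documented set before they reach the database."""
    if state not in PIPELINE_STATES:
        raise ValueError(f"unknown pipeline_state {state!r}")
    return state


def has_reached(state: str, target: str) -> bool:
    """True if `state` is at least as far along as `target` (assumed ordering)."""
    return PIPELINE_STATES.index(validate_state(state)) >= PIPELINE_STATES.index(
        validate_state(target)
    )


print(has_reached("gold", "silver"))
print(has_reached("bronze", "silver"))
```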
metadata
Text (JSON object), nullable; defaults to '{}'. Adapter-specific data that doesn’t fit the core columns. Examples: email headers, message thread IDs, file permissions, capture context.
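Since the column holds a JSON object serialized as text, writers and readers need a consistent encoding step. A minimal sketch follows; the helper names and the sample keys (thread_id, headers) are hypothetical, and treating NULL like the default '{}' is a choice of this example rather than a documented rule.

```python
import json
from typing import Any, Dict, Optional


def encode_metadata(data: Dict[str, Any]) -> str:
    """Serialize adapter-specific data for the metadata column."""
    if not isinstance(data, dict):
        raise TypeError("metadata must be a JSON object (a dict)")
    # Sorted keys and compact separators give a stable, diff-friendly string.
    return json.dumps(data, sort_keys=True, separators=(",", ":"))


def decode_metadata(text: Optional[str]) -> Dict[str, Any]:
    """Parse the column value; NULL or empty text is read as an empty object."""
    return json.loads(text) if text else {}


stored = encode_metadata({"thread_id": "t-1", "headers": {"Subject": "Hello"}})
print(stored)
print(decode_metadata(stored)["headers"]["Subject"])
```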
created_at / updated_at / deleted_at
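Text (ISO 8601 UTC). Standard lifecycle timestamps. created_at and updated_at are required and default to the insertion time; because a column default applies only on INSERT, writers must set updated_at explicitly on every update. deleted_at is NULL for active records and holds the deletion time for soft-deleted records.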
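Soft deletion can be sketched as follows, using a trimmed copy of the table that keeps only the columns involved. The soft_delete and active_uris helpers are hypothetical, and bumping updated_at as part of a soft delete is an assumption of this example.

```python
import sqlite3

# Same expression the schema uses for its timestamp defaults.
NOW = "strftime('%Y-%m-%dT%H:%M:%SZ', 'now')"

conn = sqlite3.connect(":memory:")
# Trimmed table for illustration; the full schema has more columns.
conn.execute(f"""
    CREATE TABLE omnidata_resources (
        id TEXT PRIMARY KEY,
        uri TEXT NOT NULL UNIQUE,
        updated_at TEXT NOT NULL DEFAULT ({NOW}),
        deleted_at TEXT
    )
""")


def soft_delete(conn: sqlite3.Connection, resource_id: str) -> None:
    """Mark a row deleted without removing it; already-deleted rows are left alone."""
    conn.execute(
        f"UPDATE omnidata_resources SET deleted_at = {NOW}, updated_at = {NOW} "
        "WHERE id = ? AND deleted_at IS NULL",
        (resource_id,),
    )


def active_uris(conn: sqlite3.Connection) -> list:
    """URIs of rows that have not been soft-deleted."""
    rows = conn.execute(
        "SELECT uri FROM omnidata_resources WHERE deleted_at IS NULL ORDER BY uri"
    )
    return [uri for (uri,) in rows]


conn.executemany(
    "INSERT INTO omnidata_resources (id, uri) VALUES (?, ?)",
    [("a", "file:///a.txt"), ("b", "file:///b.txt")],
)
soft_delete(conn, "a")
print(active_uris(conn))
```

The row for the deleted resource stays in the table, so its URI remains reserved by the UNIQUE constraint until the row is physically removed or revived.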
Indexes
The following indexes are created by the bootstrap SQL in index.db:
CREATE INDEX idx_resources_uri ON omnidata_resources(uri);
CREATE INDEX idx_resources_source ON omnidata_resources(source);
CREATE INDEX idx_resources_pipeline ON omnidata_resources(pipeline_state);
CREATE INDEX idx_resources_content_hash ON omnidata_resources(content_hash);
CREATE INDEX idx_resources_deleted ON omnidata_resources(deleted_at);
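To check that a query actually benefits from one of these indexes, SQLite's EXPLAIN QUERY PLAN can be run against it. The sketch below uses a trimmed copy of the table with just the indexed columns; the plan_for helper is hypothetical, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Trimmed table for illustration, plus the indexes listed above.
conn.executescript("""
CREATE TABLE omnidata_resources (
    id TEXT PRIMARY KEY,
    uri TEXT NOT NULL UNIQUE,
    source TEXT NOT NULL,
    pipeline_state TEXT NOT NULL DEFAULT 'bronze',
    content_hash TEXT,
    deleted_at TEXT
);
CREATE INDEX idx_resources_uri ON omnidata_resources(uri);
CREATE INDEX idx_resources_source ON omnidata_resources(source);
CREATE INDEX idx_resources_pipeline ON omnidata_resources(pipeline_state);
CREATE INDEX idx_resources_content_hash ON omnidata_resources(content_hash);
CREATE INDEX idx_resources_deleted ON omnidata_resources(deleted_at);
""")


def plan_for(sql: str, params: tuple = ()) -> list:
    """Return the human-readable detail column of each query plan row."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]


source_plan = plan_for(
    "SELECT id FROM omnidata_resources WHERE source = ?", ("filesystem",)
)
hash_plan = plan_for(
    "SELECT id FROM omnidata_resources WHERE content_hash = ?", ("abc",)
)
print(source_plan)
print(hash_plan)
```

Each plan should name the matching index (idx_resources_source and idx_resources_content_hash respectively) rather than report a full table scan. Note that SQLite also maintains an automatic index for the UNIQUE constraint on uri, so lookups by uri are indexed regardless of idx_resources_uri.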
One row per URI
The uniqueness constraint on uri is foundational. An adapter that encounters the same item twice should update the existing row, not create a duplicate. The URI is the join point between the external world and the OmniData registry.
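One way to get update-not-duplicate behavior is SQLite's upsert clause (INSERT ... ON CONFLICT ... DO UPDATE, available since SQLite 3.24). The sketch below uses a trimmed copy of the table; the upsert_resource helper is hypothetical, and which columns get overwritten on conflict (here only title and updated_at) is an assumption made for illustration.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
# Trimmed table for illustration; the full schema has more columns.
conn.execute("""
    CREATE TABLE omnidata_resources (
        id TEXT PRIMARY KEY,
        uri TEXT NOT NULL UNIQUE,
        source TEXT NOT NULL,
        resource_type TEXT NOT NULL,
        title TEXT,
        updated_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ', 'now'))
    )
""")


def upsert_resource(conn, uri, source, resource_type, title=None):
    """Insert a resource, or update the existing row with the same URI.

    Returns the row's id, which stays stable across repeated calls.
    """
    conn.execute(
        """
        INSERT INTO omnidata_resources (id, uri, source, resource_type, title)
        VALUES (?, ?, ?, ?, ?)
        ON CONFLICT(uri) DO UPDATE SET
            title = excluded.title,
            updated_at = strftime('%Y-%m-%dT%H:%M:%SZ', 'now')
        """,
        # The fresh UUID is used only if no row with this URI exists yet.
        (str(uuid.uuid4()), uri, source, resource_type, title),
    )
    row = conn.execute(
        "SELECT id FROM omnidata_resources WHERE uri = ?", (uri,)
    ).fetchone()
    return row[0]


first = upsert_resource(conn, "file:///path/to/doc.pdf", "filesystem", "document", "doc.pdf")
second = upsert_resource(conn, "file:///path/to/doc.pdf", "filesystem", "document", "Quarterly report")
print(first == second)
```

On the second call the conflict on uri triggers the update branch, so the original id is preserved and any other tables that reference it remain valid.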