Machine-First Naming

OmniData names things for machines, not for humans saving keystrokes. Field names are verbose. URI schemes are explicit. Table names are prefixed. Metadata is self-documenting. This is a deliberate design choice.

The principle

It is 2026. AI writes the code. AI reads the configs. AI parses the metadata. The consumer of your schema is not a human typing SQL in a terminal — it is an agent that benefits from unambiguous, self-describing names and is not bothered by length.

Every naming decision in OmniData follows one rule: optimize for the reader that processes thousands of schemas, not the writer who touches this one twice.

What this looks like in practice

Table names are prefixed

All tables use the omnidata_ prefix:

omnidata_resources — not resources
omnidata_chunks — not chunks
omnidata_collections — not collections

When an agent encounters these tables in any database within the bundle, there is zero ambiguity about what they are or what system created them. A table called resources could belong to anything. A table called omnidata_resources is self-identifying.
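As a rough illustration of what the prefix buys, the sketch below lists the OmniData tables in a SQLite database by prefix alone. It assumes the bundle's databases are SQLite files. The helper name list_omnidata_tables and the demo column definitions are made up for this example and are not part of OmniData.

```python
import sqlite3

OMNIDATA_PREFIX = "omnidata_"


def list_omnidata_tables(conn: sqlite3.Connection) -> list[str]:
    """Return the names of all tables that carry the omnidata_ prefix."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    # Filter in Python so the underscore is matched literally
    # (in a SQL LIKE pattern, "_" is a single-character wildcard).
    return [name for (name,) in rows if name.startswith(OMNIDATA_PREFIX)]


# Demo on an in-memory database. The column definitions are placeholders.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE omnidata_resources (content_hash TEXT)")
conn.execute("CREATE TABLE omnidata_chunks (content_hash TEXT)")
conn.execute("CREATE TABLE resources (id INTEGER)")  # unrelated table from another system

print(list_omnidata_tables(conn))  # ['omnidata_chunks', 'omnidata_resources']
```

The unrelated resources table is skipped without any lookup of who owns it, which is the point of the prefix.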

The extension is explicit

The container uses .omnidata — not .omn, not .od, not .odata. The extension communicates exactly what the container is without requiring a lookup table or magic-byte check. File explorers, shell scripts, and AI agents all benefit from an extension that reads as a word.

Database files are self-describing

Within the .omnidata bundle, each database has a clear role:

index.db — resources, chunks, embeddings, FTS5, deltas, queue, kv
memory.db — collections, edges, tags, memory records

An agent opening the container knows exactly where to find what it needs.
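The split above can be written down as a small routing table. This is only a sketch: the labels are copied from the list above and are descriptive, not necessarily exact table names, and the names DATABASE_CONTENTS and database_for are hypothetical.

```python
# Role map transcribed from the list above. The labels describe what each
# database holds; they are not guaranteed to be literal table names.
DATABASE_CONTENTS = {
    "index.db": ("resources", "chunks", "embeddings", "FTS5", "deltas", "queue", "kv"),
    "memory.db": ("collections", "edges", "tags", "memory records"),
}


def database_for(kind: str) -> str:
    """Return the database file documented as holding the given kind of data."""
    for filename, kinds in DATABASE_CONTENTS.items():
        if kind in kinds:
            return filename
    raise KeyError(f"no database is documented as holding {kind!r}")


print(database_for("chunks"))  # index.db
print(database_for("edges"))   # memory.db
```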

Column names are descriptive

Columns say what they hold:

pipeline_state — not state or status
content_hash — not hash
embedding_model — not model
resource_at — not date or timestamp
hat_identifier — not hat_id

An agent reading the schema for the first time can infer the purpose of every column without consulting documentation.
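A minimal sketch of how an agent might read such a schema, assuming SQLite. The column names come from the list above. Placing them all on omnidata_resources and declaring them as TEXT are assumptions made for the demo, and describe_columns is a hypothetical helper.

```python
import sqlite3


def describe_columns(conn: sqlite3.Connection, table: str) -> dict[str, str]:
    """Map each column name of `table` to its declared type."""
    # pragma_table_info is SQLite's table-valued form of PRAGMA table_info,
    # which lets the table name be passed as a bound parameter.
    rows = conn.execute("SELECT name, type FROM pragma_table_info(?)", (table,))
    return dict(rows.fetchall())


# Demo schema: column names from the list above, types assumed.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE omnidata_resources ("
    " pipeline_state TEXT,"
    " content_hash TEXT,"
    " embedding_model TEXT,"
    " resource_at TEXT,"
    " hat_identifier TEXT"
    ")"
)

for column_name, declared_type in describe_columns(conn, "omnidata_resources").items():
    print(f"{column_name}: {declared_type}")
```

The names returned here are the only documentation the agent gets at this point, which is why each one is expected to carry its own meaning.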

URI schemes are namespaced

OmniData URIs use source-specific schemes: file:///, tosh://, chrome-capture://, gdrive://. The scheme tells you exactly where the content came from, so there are no ambiguous https:// links that could point to anything.
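One way a consumer could act on the scheme is sketched below with Python's standard urllib.parse. The scheme set mirrors the list above. The function name source_of and the example URIs are invented for illustration, and rejecting unlisted schemes is this sketch's choice rather than documented OmniData behavior.

```python
from urllib.parse import urlsplit

# Schemes named in the text above.
KNOWN_SCHEMES = {"file", "tosh", "chrome-capture", "gdrive"}


def source_of(uri: str) -> str:
    """Return the source scheme of a URI, rejecting schemes not listed above."""
    scheme = urlsplit(uri).scheme
    if scheme not in KNOWN_SCHEMES:
        raise ValueError(f"unrecognized source scheme in {uri!r}")
    return scheme


# Hypothetical URIs; the paths are placeholders.
print(source_of("file:///tmp/notes.txt"))       # file
print(source_of("chrome-capture://tab/1"))      # chrome-capture
```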

Metadata is JSON, not flags

Resource metadata is stored as a JSON column, not as a proliferation of nullable columns. This keeps the core schema stable while allowing adapters to attach arbitrary structured data. The metadata is self-documenting — keys are descriptive strings, not codes.
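The sketch below shows the general pattern of a single JSON column, assuming SQLite with the JSON stored as text. The column names resource_uri and metadata, the helper functions, and every metadata key are placeholders rather than the real OmniData schema.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed two-column layout: one JSON text column instead of many nullable ones.
conn.execute(
    "CREATE TABLE omnidata_resources ("
    " resource_uri TEXT PRIMARY KEY,"
    " metadata TEXT NOT NULL DEFAULT '{}'"
    ")"
)


def put_resource(conn: sqlite3.Connection, resource_uri: str, metadata: dict) -> None:
    """Store a resource with its adapter-specific metadata serialized as JSON."""
    conn.execute(
        "INSERT INTO omnidata_resources (resource_uri, metadata) VALUES (?, ?)",
        (resource_uri, json.dumps(metadata)),
    )


def get_metadata(conn: sqlite3.Connection, resource_uri: str) -> dict:
    """Load and decode the metadata for one resource."""
    row = conn.execute(
        "SELECT metadata FROM omnidata_resources WHERE resource_uri = ?",
        (resource_uri,),
    ).fetchone()
    if row is None:
        raise KeyError(resource_uri)
    return json.loads(row[0])


# Two adapters attach different keys without any schema change.
put_resource(conn, "file:///notes/todo.txt", {"file_size_bytes": 512})
put_resource(conn, "gdrive://doc/example", {"owner_email": "a@example.com", "is_shared": True})

print(get_metadata(conn, "gdrive://doc/example"))
```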

Config files are plain JSON

Container-level configuration lives in plain JSON files: manifest.json for identity and adapters.json for the adapter registry. These are readable and editable without SQLite, making the bundle inspectable by any tool.
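Reading these files needs nothing beyond a JSON parser, as the sketch below suggests. It assumes the bundle is a directory with both files at its top level. The helper load_bundle_config and every key and value written in the demo are placeholders, since the actual file contents are not specified here.

```python
import json
import tempfile
from pathlib import Path

CONFIG_FILES = ("manifest.json", "adapters.json")


def load_bundle_config(bundle: Path) -> dict[str, object]:
    """Read the container-level JSON files from a bundle directory."""
    return {
        name: json.loads((bundle / name).read_text(encoding="utf-8"))
        for name in CONFIG_FILES
    }


# Demo: build a throwaway bundle. Every key and value below is a placeholder.
with tempfile.TemporaryDirectory() as tmp:
    bundle = Path(tmp) / "demo.omnidata"
    bundle.mkdir()
    (bundle / "manifest.json").write_text(
        json.dumps({"container_name": "demo"}), encoding="utf-8"
    )
    (bundle / "adapters.json").write_text(
        json.dumps({"registered_adapters": ["file", "gdrive"]}), encoding="utf-8"
    )
    config = load_bundle_config(bundle)

print(config["manifest.json"])
print(config["adapters.json"])
```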

The cost is acceptable

Yes, omnidata_collections is longer to type than collections. The cost is a few extra characters. The benefit is a schema that any agent can reason about without external documentation. In an ecosystem where AI agents are the primary consumers of structured data, that trade-off is not close.

Brevity was a virtue when humans typed everything. Clarity is the virtue when machines parse everything.