
Filesystem as Runtime

OmniData stores blobs as regular files on the host filesystem. This is a deliberate architectural decision: instead of reimplementing compression, snapshots, and deduplication inside the application, OmniData delegates these to the operating system.

The filesystem becomes a runtime layer. What it provides depends on which filesystem the .omnidata bundle lives on.

Per-filesystem capabilities

btrfs (Linux)

btrfs is the strongest match for OmniData’s storage model.

Transparent compression. Mount with compress=zstd and every blob is compressed on write, decompressed on read. The application sees uncompressed data. A bundle storing PDFs and images can shrink by 40-60% with no changes to application code.
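As a sketch, an /etc/fstab entry enabling this might look like the following. The device path and mount point are placeholders; compress=zstd:3 selects zstd at its default level:

```
# Illustrative fstab entry for a btrfs volume holding .omnidata bundles.
# Replace /dev/sdb1 and /srv/omnidata with your own device and mount point.
/dev/sdb1  /srv/omnidata  btrfs  compress=zstd:3,noatime  0  2
```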

Copy-on-Write snapshots. btrfs subvolume snapshot creates an instant snapshot of an .omnidata bundle. The snapshot shares all blocks with the original and consumes additional space only as the two copies diverge. This enables:

  • Point-in-time recovery before risky operations
  • Branching a bundle for experimentation
  • Instant backup before a large ingest
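The "snapshot before a large ingest" step can be sketched in Python. The btrfs command itself is real; the bundle path and the @-suffixed snapshot naming convention are illustrative assumptions, and this only works if the bundle directory is itself a btrfs subvolume:

```python
import subprocess
from datetime import datetime, timezone

def snapshot_argv(bundle: str, label: str) -> list[str]:
    """Build the `btrfs subvolume snapshot` command for a bundle.

    -r makes the snapshot read-only, so the recovery point cannot
    drift while the ingest runs.
    """
    return ["btrfs", "subvolume", "snapshot", "-r",
            bundle, f"{bundle}@{label}"]

def snapshot_before_ingest(bundle: str) -> str:
    """Take a read-only snapshot labeled with the current UTC time."""
    label = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    subprocess.run(snapshot_argv(bundle, label), check=True)
    return f"{bundle}@{label}"
```

If the ingest goes wrong, the bundle can be restored by snapshotting the read-only copy back to a writable subvolume.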

Block-level deduplication. Tools like duperemove find identical blocks across files and collapse them into shared extents. Two .omnidata instances that ingested the same PDF share the blob’s disk blocks.

Checksums. btrfs checksums every data and metadata block. Bit rot is detected on read and can be self-healed from RAID mirrors. OmniData’s content-addressed naming provides a second layer: if the filename hash doesn’t match the content, the blob is corrupt regardless of what the filesystem reports.
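That second, application-level check needs only the standard library. A minimal sketch, assuming blobs are named by the bare SHA-256 hex digest of their content (a flat blobs/ directory with no file extensions is an assumption here):

```python
import hashlib
from pathlib import Path

def verify_blob(path: Path) -> bool:
    """Verify a content-addressed blob.

    The blob's filename must equal the SHA-256 hex digest of its
    bytes; any mismatch means the blob is corrupt, regardless of
    what the filesystem's own checksums report.
    """
    digest = hashlib.sha256(path.read_bytes()).hexdigest()
    return path.name == digest
```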

ZFS (Linux, FreeBSD, macOS via OpenZFS)

send/receive. zfs send streams a snapshot to another machine or pool. This enables efficient replication of .omnidata bundles across machines: after the initial full send, only changed blocks are transferred.

Scrubbing. zfs scrub reads every block on the pool and verifies checksums. Silent corruption is detected and, on mirrored or RAIDZ pools, automatically repaired.

Compression. Like btrfs, ZFS supports transparent compression (LZ4, zstd). Blobs are compressed at the block level without application involvement.

Checksums. All data and metadata are checksummed (SHA-256, fletcher4, or Skein). Combined with OmniData’s content-addressed naming, corruption is caught at two independent layers.

APFS (macOS)

Clonefile. clonefile() creates an instant copy of a file that shares all disk blocks with the original. Copying a blob within or across .omnidata bundles on the same volume is nearly free until one copy is modified.

Per-file encryption. APFS supports per-file encryption keys, which means different .omnidata bundles on the same volume can have different encryption properties when managed by FileVault or third-party tools.

Snapshots. APFS snapshots are created automatically by Time Machine and can be created manually. They provide point-in-time recovery of the entire volume, including all .omnidata bundles.

Space sharing. Multiple APFS volumes share a single container’s free space. Multiple .omnidata instances don’t need pre-allocated partitions.

ext4 (Linux)

ext4 is the baseline. It provides:

  • Journaling for crash consistency
  • Universal availability on every Linux distribution
  • Mature tooling for backup, recovery, and monitoring

ext4 does not provide transparent compression, deduplication, or snapshots. On ext4, OmniData works correctly but does not benefit from filesystem-level optimization. The application-level dedup from content addressing still applies.
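That application-level dedup is easy to picture: a content-addressed write path behaves identically on ext4 and every other filesystem. A sketch, where the flat blobs/ layout and the .tmp staging name are assumptions:

```python
import hashlib
from pathlib import Path

def ingest_blob(blobs_dir: Path, data: bytes) -> Path:
    """Write a blob into blobs/, named by its SHA-256 hex digest.

    If a blob with that digest already exists, the write is skipped
    entirely; identical content is stored once no matter how many
    times it is ingested.
    """
    digest = hashlib.sha256(data).hexdigest()
    dest = blobs_dir / digest
    if not dest.exists():
        tmp = dest.with_suffix(".tmp")
        tmp.write_bytes(data)
        tmp.rename(dest)  # atomic on POSIX: readers never see a partial blob
    return dest
```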

S3 and object storage (future)

Object storage maps well to the blob model:

  • Each blob is an S3 object, keyed by its SHA-256 hash
  • Versioning provides history without application logic
  • Lifecycle policies can tier old blobs to Glacier or Deep Archive
  • Cross-region replication provides geographic redundancy
  • Server-side encryption (SSE-S3, SSE-KMS) encrypts at rest

The index.db and memory.db files would need a different strategy (they are not object-friendly), but the blob layer maps directly. A future adapter could sync the blobs/ directory to S3 while keeping the databases local.
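The blob-to-object mapping can be sketched as a pure function. The key prefix, bucket name, and the boto3 call in the trailing comment are hypothetical, not part of OmniData today:

```python
import hashlib

def s3_blob_key(data: bytes, prefix: str = "blobs/") -> str:
    """Derive an S3 object key for a blob from its content.

    Mirrors the on-disk content addressing: identical blobs map to
    the same key, so re-uploading a duplicate is a no-op.
    """
    return prefix + hashlib.sha256(data).hexdigest()

# Hypothetical upload with boto3:
# s3.put_object(Bucket="my-bundle", Key=s3_blob_key(data), Body=data)
```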

Why this is impossible with single-file formats

When all content lives inside a single SQLite file:

  • The OS cannot see the blobs. They are opaque rows inside a B-tree. The filesystem cannot compress, deduplicate, or snapshot individual blobs.
  • Backup is all-or-nothing. Changing one blob means the entire database file has changed. rsync and Time Machine must re-read and re-copy the full file to capture a one-blob change.
  • No incremental sync. Transferring a .omnidata file to another machine means sending the whole thing, even if only one resource was added.
  • No block sharing. Two instances with identical blobs store them independently. The filesystem has no way to know the data is duplicated.
  • WAL contention. SQLite allows only one writer at a time, even in WAL mode. Ingesting blobs pushes every byte through the write-ahead log and holds the write lock, delaying metadata updates and checkpoints.

The directory bundle format makes blobs visible to the filesystem, unlocking every optimization the host provides.
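The "changed blobs only" sync story follows directly from immutable, content-addressed filenames. A minimal sketch of what a sync tool needs to compute, assuming the flat blobs/ layout described above (no fan-out subdirectories):

```python
from pathlib import Path

def blobs_to_sync(src: Path, dst: Path) -> set[str]:
    """Return blob filenames present in src's blobs/ but not dst's.

    Because blobs are content-addressed and never modified in place,
    comparing filenames alone identifies exactly what to transfer;
    no re-hashing of unchanged files is needed.
    """
    src_names = {p.name for p in (src / "blobs").iterdir()}
    dst_names = {p.name for p in (dst / "blobs").iterdir()}
    return src_names - dst_names
```

This is the same observation rsync exploits when it skips files whose names and sizes match.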

Comparison table

| Feature | Single SQLite file | Directory bundle (btrfs) | Directory bundle (ZFS) | Directory bundle (APFS) | Directory bundle (ext4) |
|---|---|---|---|---|---|
| Transparent compression | No | zstd, lzo, zlib | lz4, zstd | No | No |
| Snapshots | No | Instant, CoW | Instant, CoW | Time Machine, manual | No |
| Block-level dedup | No | duperemove | Native (RAM-heavy) | clonefile (manual) | No |
| Checksums | No | Per-block | Per-block | No | No |
| Incremental backup | Full file | Changed files only | zfs send (delta) | Time Machine (changed files) | Changed files only |
| Incremental sync (rsync) | Full file | Changed blobs only | Changed blobs only | Changed blobs only | Changed blobs only |
| Per-blob encryption | No | No (volume-level) | Per-dataset (native) | Per-file (FileVault) | No (volume-level) |
| Self-healing | No | RAID1/RAID10 | RAIDZ, mirrors | No | No |

The “No” entries for single SQLite are not limitations of SQLite itself. They are limitations of storing binary content inside any single-file database. The data is invisible to the filesystem, so the filesystem cannot optimize it.

Practical guidance

Development machines (macOS, APFS). You get clonefile block sharing, Time Machine snapshots, and FileVault encryption. No configuration needed.

Production servers (Linux, btrfs). Mount the instances directory with compress=zstd. Run duperemove periodically. Use btrfs snapshots before large ingests.

NAS / backup targets (ZFS). Use zfs send | zfs receive for efficient replication. Enable compression on the dataset. Schedule scrubs weekly.

Minimal environments (ext4). Everything works. You lose filesystem-level compression and dedup, but content-addressed storage still provides application-level dedup, and standard backup tools handle the rest.