What is the Content Exchange?

The Content Exchange is TAHO’s content-addressed storage system that identifies content by cryptographic hash rather than by location or filename. When you publish a file, TAHO generates a unique content ID that represents the exact bytes of that file. The same content always produces the same ID, regardless of filename or metadata. This approach provides:
  • Integrity verification - The content ID lets any node verify that content hasn’t been corrupted or altered
  • Location independence - Content can be retrieved from any node holding it
  • Efficient caching - Content can be safely cached and shared across nodes

Content IDs

Content IDs are cryptographic hashes that uniquely identify content. TAHO content IDs are self-describing identifiers with size and validation data woven into the hash. Nodes can inspect content properties, preallocate memory, and filter requests - all before a single byte is transferred.

Once content is published and a content ID is generated, the content is immutable. The same bytes will always produce the same content ID, and any modification creates a new content ID. This guarantees that when you fetch content by ID, you receive exactly what was published.
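To make the two properties above concrete, here is a minimal sketch of a self-describing, content-derived identifier. The `ContentId` struct, its `size` and `digest` fields, and the `within_limit` helper are hypothetical names for illustration; TAHO's actual ID layout and hash function are not specified here. FNV-1a is used only because it fits in a few lines of standard-library Rust, and it is not a cryptographic hash.

```rust
/// Hypothetical content ID: the payload length plus a digest of the bytes.
/// This is an assumed layout for illustration, not TAHO's real format.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct ContentId {
    pub size: u64,
    pub digest: u64,
}

/// FNV-1a (64-bit): a simple NON-cryptographic hash used as a stand-in.
fn fnv1a(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325; // FNV offset basis
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(0x100000001b3); // FNV prime
    }
    hash
}

/// Derive the ID from the bytes alone; filename and metadata play no part.
pub fn content_id(bytes: &[u8]) -> ContentId {
    ContentId {
        size: bytes.len() as u64,
        digest: fnv1a(bytes),
    }
}

/// A pre-transfer check: decide from the ID alone whether to fetch.
pub fn within_limit(id: &ContentId, max_bytes: u64) -> bool {
    id.size <= max_bytes
}
```

Calling `content_id` twice on the same bytes yields equal IDs, while changing a single byte yields a different one. Because the size travels with the ID in this sketch, a node could reject an oversized request with `within_limit` before any transfer starts.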

Publishing Content

Publish content to the exchange and receive a content ID:
# Publish a file
taho publish ./model.onnx
# Output: 000006VX0Q0HARDX7YPGSCFMY9J7NP1Z6AFNNFDAGKJD1X3A6H6S9E0240

# Publish any file type
taho publish ./dataset.tar.gz
taho publish ./document.pdf
taho publish ./image.png

What Happens When You Publish

  1. Hash calculation - TAHO streams the file and calculates its hash
  2. Content ID generation - The hash becomes the content ID
  3. Local storage - Content is stored locally in the content exchange
  4. Network announcement - Your node announces content availability to The Mesh via gossip protocol
  5. ID output - The content ID is printed for later retrieval
Publishing uses streaming I/O, so large files (even multi-gigabyte models) are processed without loading the whole file into memory.
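The streaming hash in step 1 can be sketched as follows. This is an illustration under assumptions, not TAHO's implementation: `StreamHasher` and `hash_stream` are hypothetical names, and the incremental FNV-1a hash is a non-cryptographic stand-in chosen so the example needs only the standard library.

```rust
use std::io::{self, Read};

/// Incremental FNV-1a (64-bit): a NON-cryptographic stand-in for the real hash.
struct StreamHasher {
    state: u64,
    len: u64,
}

impl StreamHasher {
    fn new() -> Self {
        StreamHasher { state: 0xcbf29ce484222325, len: 0 }
    }

    /// Fold one chunk into the running digest.
    fn update(&mut self, chunk: &[u8]) {
        for &b in chunk {
            self.state ^= u64::from(b);
            self.state = self.state.wrapping_mul(0x100000001b3);
        }
        self.len += chunk.len() as u64;
    }
}

/// Hash everything `reader` yields using one fixed-size buffer.
/// Returns (total bytes read, digest). Memory use is bounded by `buf_size`.
pub fn hash_stream<R: Read>(mut reader: R, buf_size: usize) -> io::Result<(u64, u64)> {
    let mut hasher = StreamHasher::new();
    let mut buf = vec![0u8; buf_size.max(1)];
    loop {
        let n = match reader.read(&mut buf) {
            Ok(0) => break, // end of input
            Ok(n) => n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        hasher.update(&buf[..n]);
    }
    Ok((hasher.len, hasher.state))
}
```

Passing a `std::fs::File` as the reader hashes a file of any size with a constant-size buffer. The digest does not depend on the buffer size, only on the bytes read.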

Content Storage Locations

TAHO stores published content locally using a multi-tier storage system:
  • Memory cache - Frequently accessed content kept in memory (100 MB default limit)
  • File storage - Content persisted to disk in node-specific directories
  • Hot promotion - Frequently accessed content automatically promoted to memory cache
Content files are stored by content ID in TAHO’s data directory, typically ~/.taho/data/content/<node-id>/.
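As a small illustration of that layout, the sketch below builds the on-disk path for one piece of content. The `content_path` helper is hypothetical, and the exact directory structure is an assumption based on the typical path above with the file named by its content ID.

```rust
use std::path::{Path, PathBuf};

/// Build `<data_dir>/content/<node-id>/<content-id>`.
/// Assumed layout for illustration; check your node's data directory.
pub fn content_path(data_dir: &Path, node_id: &str, content_id: &str) -> PathBuf {
    data_dir.join("content").join(node_id).join(content_id)
}
```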

Retrieving Content

When you need content, the Content Exchange retrieves it from local storage or automatically fetches it from remote peers.

Local Retrieval

If content exists in local storage, it’s returned immediately:
// Content is available locally, so no network fetch is needed
let content = client.get(&content_id).await?;

Remote Retrieval

If content is not available locally, TAHO automatically discovers and fetches it from The Mesh:
  1. Discovery request - Your node broadcasts a content discovery request via gossip protocol
  2. Holder announcements - Nodes holding the content respond with announcements
  3. Peer selection - TAHO selects an optimal peer based on availability
  4. Content fetch - Content is fetched directly from the holder via request-response protocol
  5. Verification - Downloaded content is verified against the content ID hash
  6. Local caching - Content is stored locally for future requests
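The retrieval flow above (local lookup, fetch from a holder, verification, local caching) can be sketched in a few lines. Everything here is a simplified model under assumptions: the `Peer` trait, `StaticPeer`, and `get_content` are hypothetical names, discovery is reduced to a ready-made list of holders, and FNV-1a stands in for the cryptographic hash. Falling back to the next holder after a mismatch is also an assumption of this sketch.

```rust
use std::collections::HashMap;

/// Stand-in content ID: a NON-cryptographic FNV-1a digest of the bytes.
pub fn content_id(bytes: &[u8]) -> u64 {
    bytes.iter().fold(0xcbf29ce484222325u64, |h, &b| {
        (h ^ u64::from(b)).wrapping_mul(0x100000001b3)
    })
}

/// Hypothetical view of a remote holder.
pub trait Peer {
    fn fetch(&self, id: u64) -> Option<Vec<u8>>;
}

/// In-memory peer for illustration; it may return wrong bytes or nothing.
pub struct StaticPeer {
    pub response: Option<Vec<u8>>,
}

impl Peer for StaticPeer {
    fn fetch(&self, _id: u64) -> Option<Vec<u8>> {
        self.response.clone()
    }
}

/// Return content by ID: local storage first, then each holder in turn.
pub fn get_content(
    local: &mut HashMap<u64, Vec<u8>>,
    holders: &[&dyn Peer],
    id: u64,
) -> Option<Vec<u8>> {
    // Local retrieval: no network round trip.
    if let Some(bytes) = local.get(&id) {
        return Some(bytes.clone());
    }
    for peer in holders {
        if let Some(bytes) = peer.fetch(id) {
            // Verification: recompute the ID and compare before storing.
            if content_id(&bytes) == id {
                local.insert(id, bytes.clone()); // cache for future requests
                return Some(bytes);
            }
            // Mismatch: discard the bytes and try the next holder.
        }
    }
    None
}
```

A peer that returns altered bytes never reaches the local store in this model, because the recomputed ID will not match the requested one.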

Content Discovery

The Content Exchange uses The Mesh’s gossip protocol for content discovery.

Content Announcements

When a node publishes content or comes online with existing content, it announces availability to The Mesh. These announcements are gossiped across the network, so all nodes maintain an index of content locations.
# Node A publishes content
taho publish ./model.onnx
# Output: 000006VX0Q0HARDX7YPGSCFMY9J7NP1Z6AFNNFDAGKJD1X3A6H6S9E0240
# Announcement gossiped: "Node A has 000006VX0Q0HARDX7YPGSCFMY9J7NP1Z6AFNNFDAGKJD1X3A6H6S9E0240"

# Node B receives the gossip
# Node B now knows Node A holds this content

Discovery Process

When a node needs content, it broadcasts a discovery request. Holders respond with announcements containing their peer information. The requesting node then initiates a direct fetch from a selected holder.

Gossip Protocol Integration

The Content Exchange leverages The Mesh’s gossipsub protocol for efficient content discovery:
  • Topic subscription - Nodes subscribe to the “content-exchange” gossip topic
  • Event broadcasting - Content events are serialized and published to the topic
  • Network-wide propagation - Events spread across The Mesh, reaching all nodes
  • Holder index maintenance - Each node maintains a local index of content holders
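The holder index in the last bullet can be pictured as a map from content ID to the set of nodes that announced it. The sketch below is a minimal model: `Announcement`, `HolderIndex`, and their fields are hypothetical names, and the real event format on the “content-exchange” topic is not specified here.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical shape of a gossiped announcement: "node X holds content Y".
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Announcement {
    pub content_id: String,
    pub node_id: String,
}

/// Local index of which nodes hold which content.
#[derive(Default)]
pub struct HolderIndex {
    holders: HashMap<String, HashSet<String>>,
}

impl HolderIndex {
    /// Record an announcement; repeated announcements are idempotent.
    pub fn apply(&mut self, event: Announcement) {
        self.holders
            .entry(event.content_id)
            .or_default()
            .insert(event.node_id);
    }

    /// Known holders of `content_id`, sorted for stable output.
    pub fn holders_of(&self, content_id: &str) -> Vec<&str> {
        let mut nodes: Vec<&str> = self
            .holders
            .get(content_id)
            .map(|set| set.iter().map(String::as_str).collect())
            .unwrap_or_default();
        nodes.sort();
        nodes
    }
}
```

Using a set per content ID means a node that re-announces the same content, for example after coming back online, does not create duplicate entries.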

Use Cases

Large File Distribution

Distribute large ML models, datasets, or media files without centralized storage:
# Publisher shares a 3.4 GB model
taho publish ./stable-diffusion-unet.onnx
# Output: 000006VX0Q0HARDX7YPGSCFMY9J7NP1Z6AFNNFDAGKJD1X3A6H6S9E0240

# Consumers automatically fetch from any holder
# Multiple consumers can fetch from different peers simultaneously

Distributed AI/ML Models

TAHO’s inference system uses the Content Exchange for storing and distributing ML models:
  • Model partitioning - Large models (larger than 2 GB) are partitioned into smaller subgraphs
  • Content-addressed partitions - Each partition gets its own content ID
  • Automatic deduplication - Common subgraphs across models are stored once
  • Distributed loading - Nodes fetch model partitions from The Mesh as needed
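Deduplication falls out of content addressing: two partitions with identical bytes map to the same ID, so they occupy one slot. The sketch below illustrates this with a hypothetical `PartitionStore`; the partitioning itself is not modeled, and FNV-1a is a non-cryptographic stand-in for the real hash.

```rust
use std::collections::HashMap;

/// Stand-in content ID: a NON-cryptographic FNV-1a digest of the bytes.
pub fn content_id(bytes: &[u8]) -> u64 {
    bytes.iter().fold(0xcbf29ce484222325u64, |h, &b| {
        (h ^ u64::from(b)).wrapping_mul(0x100000001b3)
    })
}

/// Hypothetical store of model partitions keyed by content ID.
#[derive(Default)]
pub struct PartitionStore {
    blobs: HashMap<u64, Vec<u8>>,
}

impl PartitionStore {
    /// Store each partition under its ID and return the ordered list of IDs.
    /// A partition that is already present is not stored again.
    pub fn put_model(&mut self, partitions: &[&[u8]]) -> Vec<u64> {
        partitions
            .iter()
            .map(|p| {
                let id = content_id(p);
                self.blobs.entry(id).or_insert_with(|| p.to_vec());
                id
            })
            .collect()
    }

    /// Number of distinct partitions held.
    pub fn stored_partitions(&self) -> usize {
        self.blobs.len()
    }
}
```

Two models that share one subgraph and differ in another end up with three stored partitions rather than four.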
See AI/ML Inference for more details.

Development Asset Sharing

Share development assets across team members or build machines:
# Share compiled artifacts
taho publish ./target/release/binary

# Share test fixtures
taho publish ./fixtures/test-data.json

# Share configuration snapshots
taho publish ./config/production.toml

Storage Backends

The Content Exchange supports multiple storage backends that can be composed:

Memory Store

In-memory cache with LRU eviction:
  • Fast access for hot content
  • Configurable size limit (100 MB default)
  • No persistence - cleared on restart

File Store

Persistent file-based storage:
  • Content stored as individual files named by content ID
  • Per-node isolation via subdirectories
  • No automatic eviction

Composite Store

Combines memory and file storage with hot content promotion:
  • Frequently accessed content promoted to memory cache
  • Memory evictions remain in file storage
  • Provides both speed and persistence
The default configuration uses the composite store, so hot content is served from memory while all content survives restarts on disk.
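A rough model of how the three backends compose is sketched below. It is a simplification under assumptions: `MemoryStore` and `CompositeStore` are hypothetical types, a `HashMap` stands in for files on disk, and content is promoted on its first read, whereas the real promotion policy for frequently accessed content is not specified here.

```rust
use std::collections::{HashMap, VecDeque};

/// Memory tier: a byte budget with least-recently-used eviction.
pub struct MemoryStore {
    limit_bytes: usize,
    used_bytes: usize,
    items: HashMap<String, Vec<u8>>,
    order: VecDeque<String>, // front = least recently used
}

impl MemoryStore {
    pub fn new(limit_bytes: usize) -> Self {
        MemoryStore { limit_bytes, used_bytes: 0, items: HashMap::new(), order: VecDeque::new() }
    }

    /// Mark `id` as most recently used (linear scan, fine for a sketch).
    fn touch(&mut self, id: &str) {
        self.order.retain(|k| k.as_str() != id);
        self.order.push_back(id.to_string());
    }

    pub fn get(&mut self, id: &str) -> Option<Vec<u8>> {
        let bytes = self.items.get(id)?.clone();
        self.touch(id);
        Some(bytes)
    }

    pub fn put(&mut self, id: &str, bytes: Vec<u8>) {
        if bytes.len() > self.limit_bytes {
            return; // larger than the whole budget: leave it to the file tier
        }
        if let Some(old) = self.items.remove(id) {
            self.used_bytes -= old.len();
        }
        // Evict least recently used entries until the new item fits.
        while self.used_bytes + bytes.len() > self.limit_bytes {
            match self.order.pop_front() {
                Some(victim) => {
                    if let Some(evicted) = self.items.remove(&victim) {
                        self.used_bytes -= evicted.len();
                    }
                }
                None => break,
            }
        }
        self.used_bytes += bytes.len();
        self.items.insert(id.to_string(), bytes);
        self.touch(id);
    }

    pub fn contains(&self, id: &str) -> bool {
        self.items.contains_key(id)
    }
}

/// Composite: memory cache in front of a persistent tier.
pub struct CompositeStore {
    memory: MemoryStore,
    files: HashMap<String, Vec<u8>>, // stand-in for files named by content ID
}

impl CompositeStore {
    pub fn new(memory_limit: usize) -> Self {
        CompositeStore { memory: MemoryStore::new(memory_limit), files: HashMap::new() }
    }

    /// Writes always land in the persistent tier.
    pub fn put(&mut self, id: &str, bytes: Vec<u8>) {
        self.files.insert(id.to_string(), bytes);
    }

    /// Reads try memory first, then promote from the persistent tier.
    pub fn get(&mut self, id: &str) -> Option<Vec<u8>> {
        if let Some(hit) = self.memory.get(id) {
            return Some(hit);
        }
        let bytes = self.files.get(id)?.clone();
        self.memory.put(id, bytes.clone());
        Some(bytes)
    }

    pub fn in_memory(&self, id: &str) -> bool {
        self.memory.contains(id)
    }
}
```

Eviction only ever removes the in-memory copy in this model, which mirrors the point that memory evictions remain in file storage.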

Programmatic Access

The Content Exchange can be accessed programmatically in Rust applications:
use taho_common::{NodeContext, content::Content};
use taho_content_exchange::ContentSystemService;

// The calls below use `.await` and `?`, so they must run inside an
// async function that returns a `Result`.

// Get a client for the current node
let context = NodeContext::default();
let client = ContentSystemService::client(context.clone());

// Store content; the ID is derived from the bytes before the put
let content = Content::from_slice(b"data");
let content_id = content.content_id();
client.put(content).await?;

// Retrieve content (served locally, or fetched from a remote peer)
let result = client.get(&content_id).await?;

// Check whether the content is present in local storage
let has_it = client.has(&content_id).await?;

Security and Verification

Content Integrity

Content IDs provide cryptographic verification:
  1. Hash verification - After fetching content, TAHO recalculates the hash
  2. Comparison - The calculated hash must match the content ID
  3. Rejection - Mismatched content is rejected and not stored
This ensures that the content you receive matches, byte for byte, the content ID you requested.

Immutability Guarantees

Content is immutable once published. Any attempt to modify content results in a new content ID. This provides:
  • Tamper evidence - Modified content has a different ID
  • Version clarity - Each version has a unique, permanent identifier
  • Reproducibility - Same content ID always means same content

Next Steps

  • The Mesh - Learn about TAHO’s P2P network that powers content discovery
  • Content Commands - Detailed CLI reference for content operations
  • AI/ML Inference - How inference leverages the Content Exchange