Content Exchange

What is the Content Exchange?

The Content Exchange is TAHO’s content-addressed storage system that identifies content by cryptographic hash rather than by location or filename. When you publish a file, TAHO generates a unique content ID that represents the exact bytes of that file. The same content always produces the same ID, regardless of filename or metadata. This approach provides:

Integrity verification - Content ID proves content hasn’t been corrupted
Location independence - Content can be retrieved from any node holding it
Efficient caching - Content can be safely cached and shared across nodes

Content IDs

Content IDs are cryptographic hashes that uniquely identify content. TAHO content IDs are self-describing: size and validation data are encoded in the identifier itself, so nodes can inspect content properties, preallocate memory, and filter requests before a single byte is transferred. Once content is published and a content ID is generated, the content is immutable. The same bytes always produce the same content ID, and any modification produces a new content ID. This guarantees that when you fetch content by ID, you receive exactly what was published.
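To make the idea of a self-describing ID concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the `SketchContentId` type, its size-plus-digest layout, and the hex encoding are hypothetical and are not TAHO's actual ID format. FNV-1a is used only as a non-cryptographic stand-in so the example needs nothing beyond the standard library.

```rust
/// Hypothetical self-describing ID: payload length plus a digest.
/// This is NOT TAHO's real ID layout; it only illustrates the idea.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct SketchContentId {
    pub size: u64,
    pub digest: u64,
}

/// FNV-1a 64-bit: a non-cryptographic stand-in for a real cryptographic hash.
pub fn fnv1a64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325;
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3);
    }
    hash
}

impl SketchContentId {
    /// Derive the ID from the exact bytes; filename and metadata play no part.
    pub fn of(bytes: &[u8]) -> Self {
        Self { size: bytes.len() as u64, digest: fnv1a64(bytes) }
    }

    /// Text form: 16 hex digits of size followed by 16 hex digits of digest.
    pub fn encode(&self) -> String {
        format!("{:016x}{:016x}", self.size, self.digest)
    }

    /// Parse the text form back into its fields, rejecting malformed input.
    pub fn decode(text: &str) -> Option<Self> {
        if text.len() != 32 || !text.bytes().all(|b| b.is_ascii_hexdigit()) {
            return None;
        }
        let size = u64::from_str_radix(&text[..16], 16).ok()?;
        let digest = u64::from_str_radix(&text[16..], 16).ok()?;
        Some(Self { size, digest })
    }

    /// Decide from the ID alone, before any transfer, whether a fetch fits a budget.
    pub fn fits_within(&self, max_bytes: u64) -> bool {
        self.size <= max_bytes
    }
}
```

Because the size travels inside the ID, a node in this sketch can call `fits_within` on a decoded ID and refuse or preallocate before requesting any bytes.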

Publishing Content

Publish content to the exchange and receive a content ID:

# Publish a file
taho publish ./model.onnx
# Output: 000006VX0Q0HARDX7YPGSCFMY9J7NP1Z6AFNNFDAGKJD1X3A6H6S9E0240

# Publish any file type
taho publish ./dataset.tar.gz
taho publish ./document.pdf
taho publish ./image.png

What Happens When You Publish

Hash calculation - TAHO streams the file and calculates its hash
Content ID generation - The hash becomes the content ID
Local storage - Content is stored locally in the content exchange
Network announcement - Your node announces content availability to The Mesh via gossip protocol
ID output - The content ID is printed for later retrieval

Publishing uses streaming I/O, so large files (even multi-gigabyte models) are handled efficiently without loading everything into memory.
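The streaming approach can be sketched as follows. This is an illustration under stated assumptions, not TAHO's implementation: `StreamingHasher` and `hash_stream` are hypothetical names, and FNV-1a again stands in for the real cryptographic hash. The point is that only one fixed-size buffer is held in memory, regardless of file size.

```rust
use std::io::{self, Read};

const FNV_OFFSET: u64 = 0xcbf2_9ce4_8422_2325;
const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;

/// Incremental FNV-1a hasher (a stand-in for a cryptographic hash).
pub struct StreamingHasher {
    state: u64,
    bytes_seen: u64,
}

impl StreamingHasher {
    pub fn new() -> Self {
        Self { state: FNV_OFFSET, bytes_seen: 0 }
    }

    /// Fold one chunk into the running hash.
    pub fn update(&mut self, chunk: &[u8]) {
        for &b in chunk {
            self.state ^= u64::from(b);
            self.state = self.state.wrapping_mul(FNV_PRIME);
        }
        self.bytes_seen += chunk.len() as u64;
    }

    /// Returns (total bytes seen, digest).
    pub fn finish(&self) -> (u64, u64) {
        (self.bytes_seen, self.state)
    }
}

/// Hash any reader (a file, a socket, a byte slice) in fixed-size chunks.
pub fn hash_stream<R: Read>(mut reader: R, chunk_size: usize) -> io::Result<(u64, u64)> {
    let mut hasher = StreamingHasher::new();
    let mut buf = vec![0u8; chunk_size.max(1)];
    loop {
        let n = match reader.read(&mut buf) {
            Ok(0) => break, // end of stream
            Ok(n) => n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        hasher.update(&buf[..n]);
    }
    Ok(hasher.finish())
}
```

Passing `std::fs::File::open(path)?` as the reader would hash a file on disk; the digest does not depend on the chunk size chosen.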

Content Storage Locations

TAHO stores published content locally using a multi-tier storage system:

Memory cache - Frequently accessed content kept in memory (100MB default limit)
File storage - Content persisted to disk in node-specific directories
Hot promotion - Frequently accessed content automatically promoted to memory cache

Content files are stored by content ID in TAHO’s data directory, typically ~/.taho/data/content/<node-id>/.

Retrieving Content

When you need content, the Content Exchange retrieves it from local storage or automatically fetches it from remote peers.

Local Retrieval

If content exists in local storage, it’s returned immediately:

// Content available locally - instant retrieval
let content = client.get(&content_id).await?; // Returns immediately

Remote Retrieval

If content is not available locally, TAHO automatically discovers and fetches it from The Mesh:

Discovery request - Your node broadcasts a content discovery request via gossip protocol
Holder announcements - Nodes holding the content respond with announcements
Peer selection - TAHO selects an optimal peer based on availability
Content fetch - Content is fetched directly from the holder via request-response protocol
Verification - Downloaded content is verified against the content ID hash
Local caching - Content is stored locally for future requests
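The steps above can be sketched as a local-first lookup with a verified remote fallback. All names here (`Holder`, `FakeHolder`, `LocalNode`, `GetError`) are hypothetical and are not TAHO's API. Discovery is represented simply by the `holders` slice passed in, holders are tried in order rather than by any optimal-peer policy, and FNV-1a stands in for the real cryptographic hash.

```rust
use std::collections::HashMap;

/// Stand-in content ID: a 64-bit FNV-1a digest (NOT cryptographic).
pub type Id = u64;

pub fn id_of(bytes: &[u8]) -> Id {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325;
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3);
    }
    hash
}

/// Hypothetical view of a remote peer that answered the discovery request.
pub trait Holder {
    fn fetch(&self, id: Id) -> Option<Vec<u8>>;
}

/// In-memory holder used only to exercise the flow.
pub struct FakeHolder(pub HashMap<Id, Vec<u8>>);

impl Holder for FakeHolder {
    fn fetch(&self, id: Id) -> Option<Vec<u8>> {
        self.0.get(&id).cloned()
    }
}

#[derive(Debug, PartialEq)]
pub enum GetError {
    NotFound,
    VerificationFailed,
}

#[derive(Default)]
pub struct LocalNode {
    store: HashMap<Id, Vec<u8>>,
}

impl LocalNode {
    pub fn has(&self, id: Id) -> bool {
        self.store.contains_key(&id)
    }

    /// Local hit first; otherwise fetch, verify against the ID, then cache.
    pub fn get(&mut self, id: Id, holders: &[&dyn Holder]) -> Result<Vec<u8>, GetError> {
        if let Some(bytes) = self.store.get(&id) {
            return Ok(bytes.clone());
        }
        let mut saw_mismatch = false;
        for holder in holders {
            match holder.fetch(id) {
                Some(bytes) if id_of(&bytes) == id => {
                    self.store.insert(id, bytes.clone());
                    return Ok(bytes);
                }
                Some(_) => saw_mismatch = true, // rejected and never stored
                None => {}
            }
        }
        Err(if saw_mismatch { GetError::VerificationFailed } else { GetError::NotFound })
    }
}
```

In this sketch a holder that returns bytes not matching the requested ID is skipped, and the next holder is tried; only verified bytes ever reach the local store.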

Content Discovery

The Content Exchange uses The Mesh’s gossip protocol for content discovery.

Content Announcements

When a node publishes content or comes online with existing content, it announces availability to The Mesh. These announcements are gossiped across the network, so all nodes maintain an index of content locations.

# Node A publishes content
taho publish ./model.onnx
# Output: 000006VX0Q0HARDX7YPGSCFMY9J7NP1Z6AFNNFDAGKJD1X3A6H6S9E0240
# Announcement gossiped: "Node A has 000006VX0Q0HARDX7YPGSCFMY9J7NP1Z6AFNNFDAGKJD1X3A6H6S9E0240"

# Node B receives the gossip
# Node B now knows Node A holds this content

Discovery Process

When a node needs content, it broadcasts a discovery request. Holders respond with announcements containing their peer information. The requesting node then initiates a direct fetch from a selected holder.

Gossip Protocol Integration

The Content Exchange leverages The Mesh’s gossipsub protocol for efficient content discovery:

Topic subscription - Nodes subscribe to the “content-exchange” gossip topic
Event broadcasting - Content events are serialized and published to the topic
Network-wide propagation - Events spread across The Mesh, reaching all nodes
Holder index maintenance - Each node maintains a local index of content holders
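The holder index described in the last item might look roughly like the sketch below. The `Announcement` and `HolderIndex` types and their fields are assumptions made for illustration; the actual event format on the gossip topic is not specified here. Using a set per content ID makes repeated delivery of the same announcement harmless, which matters because gossip can deliver an event more than once.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical announcement event: "node_id holds content_id".
#[derive(Debug, Clone)]
pub struct Announcement {
    pub content_id: String,
    pub node_id: String,
}

/// Local index from content ID to the set of nodes known to hold it.
#[derive(Default)]
pub struct HolderIndex {
    holders: HashMap<String, HashSet<String>>,
}

impl HolderIndex {
    /// Record an announcement; duplicates are absorbed by the set.
    pub fn apply(&mut self, event: Announcement) {
        self.holders
            .entry(event.content_id)
            .or_default()
            .insert(event.node_id);
    }

    /// Known holders of a content ID, sorted for deterministic output.
    pub fn holders_of(&self, content_id: &str) -> Vec<String> {
        let mut nodes: Vec<String> = self
            .holders
            .get(content_id)
            .map(|set| set.iter().cloned().collect())
            .unwrap_or_default();
        nodes.sort();
        nodes
    }
}
```

A node would call `apply` for each event received on the topic and consult `holders_of` when it needs to pick a peer to fetch from.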

Use Cases

Large File Distribution

Distribute large ML models, datasets, or media files without centralized storage:

# Publisher shares a 3.4GB model
taho publish ./stable-diffusion-unet.onnx
# Output: 000006VX0Q0HARDX7YPGSCFMY9J7NP1Z6AFNNFDAGKJD1X3A6H6S9E0240

# Consumers automatically fetch from any holder
# Multiple consumers can fetch from different peers simultaneously

Distributed AI/ML Models

TAHO’s inference system uses the Content Exchange for storing and distributing ML models:

Model partitioning - Large models (>2GB) are partitioned into smaller subgraphs
Content-addressed partitions - Each partition gets its own content ID
Automatic deduplication - Common subgraphs across models are stored once
Distributed loading - Nodes fetch model partitions from The Mesh as needed
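Deduplication follows directly from content addressing: two models that share a partition produce the same ID for it, so the store keeps one copy. The sketch below illustrates this with hypothetical names (`PartitionStore`, `publish_model`) and a non-cryptographic FNV-1a stand-in for the real hash; how TAHO actually partitions models is not shown.

```rust
use std::collections::HashMap;

/// FNV-1a 64-bit: a non-cryptographic stand-in for the real content hash.
pub fn fnv1a64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325;
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3);
    }
    hash
}

/// Hypothetical partition store keyed by content ID.
#[derive(Default)]
pub struct PartitionStore {
    blobs: HashMap<u64, Vec<u8>>,
}

impl PartitionStore {
    /// Store a partition and return its ID; identical bytes reuse the existing entry.
    pub fn put(&mut self, bytes: &[u8]) -> u64 {
        let id = fnv1a64(bytes);
        self.blobs.entry(id).or_insert_with(|| bytes.to_vec());
        id
    }

    pub fn stored_partitions(&self) -> usize {
        self.blobs.len()
    }

    pub fn stored_bytes(&self) -> usize {
        self.blobs.values().map(Vec::len).sum()
    }
}

/// A model is represented here as the ordered list of its partition IDs.
pub fn publish_model(store: &mut PartitionStore, partitions: &[&[u8]]) -> Vec<u64> {
    partitions.iter().map(|p| store.put(p)).collect()
}
```

Publishing two models that share one partition and differ in another leaves three stored partitions rather than four, and both manifests reference the same ID for the shared piece.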

See AI/ML Inference for more details.

Development Asset Sharing

Share development assets across team members or build machines:

# Share compiled artifacts
taho publish ./target/release/binary

# Share test fixtures
taho publish ./fixtures/test-data.json

# Share configuration snapshots
taho publish ./config/production.toml

Storage Backends

The Content Exchange supports multiple storage backends that can be composed:

Memory Store

In-memory cache with LRU eviction:

Fast access for hot content
Configurable size limit (100MB default)
No persistence - cleared on restart

File Store

Persistent file-based storage:

Content stored as individual files named by content ID
Per-node isolation via subdirectories
No automatic eviction

Composite Store

Combines memory and file storage with hot content promotion:

Frequently accessed content promoted to memory cache
Memory evictions remain in file storage
Provides both speed and persistence

The default configuration uses the composite store, which pairs the fast access of the memory cache with the persistence of file storage.
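A composite store along these lines can be sketched as follows. This is an illustration, not TAHO's implementation: `CompositeStore`, the `promote_after` threshold, and the use of a `HashMap` as a stand-in for the file store are all assumptions. The memory tier has a byte budget and evicts the least recently used entry; evicted content stays available from the persistent tier.

```rust
use std::collections::{HashMap, VecDeque};

pub struct CompositeStore {
    memory: HashMap<String, Vec<u8>>,
    lru: VecDeque<String>, // front = least recently used
    memory_used: usize,
    memory_limit: usize,
    disk: HashMap<String, Vec<u8>>, // stand-in for the file store
    hits: HashMap<String, u32>,     // reads served from the persistent tier
    promote_after: u32,             // hypothetical promotion threshold
}

impl CompositeStore {
    pub fn new(memory_limit: usize, promote_after: u32) -> Self {
        Self {
            memory: HashMap::new(),
            lru: VecDeque::new(),
            memory_used: 0,
            memory_limit,
            disk: HashMap::new(),
            hits: HashMap::new(),
            promote_after,
        }
    }

    /// New content always lands in the persistent tier.
    pub fn put(&mut self, id: &str, bytes: Vec<u8>) {
        self.disk.insert(id.to_string(), bytes);
    }

    pub fn in_memory(&self, id: &str) -> bool {
        self.memory.contains_key(id)
    }

    /// Memory first, then the persistent tier; hot content is promoted.
    pub fn get(&mut self, id: &str) -> Option<Vec<u8>> {
        if let Some(bytes) = self.memory.get(id).cloned() {
            self.lru.retain(|k| k.as_str() != id);
            self.lru.push_back(id.to_string()); // mark as most recently used
            return Some(bytes);
        }
        let bytes = self.disk.get(id)?.clone();
        let count = {
            let h = self.hits.entry(id.to_string()).or_insert(0);
            *h += 1;
            *h
        };
        if count >= self.promote_after {
            self.promote(id, bytes.clone());
        }
        Some(bytes)
    }

    fn promote(&mut self, id: &str, bytes: Vec<u8>) {
        if bytes.len() > self.memory_limit {
            return; // too large to ever fit in the memory tier
        }
        while self.memory_used + bytes.len() > self.memory_limit {
            let Some(victim) = self.lru.pop_front() else { break };
            if let Some(old) = self.memory.remove(&victim) {
                self.memory_used -= old.len();
            }
            self.hits.remove(&victim); // evicted content must become hot again
        }
        self.memory_used += bytes.len();
        self.memory.insert(id.to_string(), bytes);
        self.lru.push_back(id.to_string());
    }
}
```

With a 10-byte memory limit and two 6-byte items, promoting the second item evicts the first from memory, yet a later `get` for the first still succeeds from the persistent tier.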

Programmatic Access

The Content Exchange can be accessed programmatically in Rust applications:

use taho_common::{NodeContext, content::Content};
use taho_content_exchange::ContentSystemService;

// Get client for the current node
let context = NodeContext::default();
let client = ContentSystemService::client(context.clone());

// Store content
let content = Content::from_slice(b"data");
let content_id = content.content_id();
client.put(content).await?;

// Retrieve content (local or remote)
let result = client.get(&content_id).await?;

// Check for local presence
let has_it = client.has(&content_id).await?;

Security and Verification

Content Integrity

Content IDs provide cryptographic verification:

Hash verification - After fetching content, TAHO recalculates the hash
Comparison - The calculated hash must match the content ID
Rejection - Mismatched content is rejected and not stored

This ensures you always receive authentic, unmodified content.

Immutability Guarantees

Content is immutable once published. Any attempt to modify content results in a new content ID. This provides:

Tamper evidence - Modified content has a different ID
Version clarity - Each version has a unique, permanent identifier
Reproducibility - Same content ID always means same content

Next Steps

The Mesh - Learn about TAHO’s P2P network that powers content discovery
Content Commands - Detailed CLI reference for content operations
AI/ML Inference - How inference leverages the Content Exchange
