What is the Content Exchange?
The Content Exchange is TAHO’s content-addressed storage system that identifies content by cryptographic hash rather than by location or filename. When you publish a file, TAHO generates a unique content ID that represents the exact bytes of that file. The same content always produces the same ID, regardless of filename or metadata. This approach provides:
- Integrity verification - Content ID proves content hasn’t been corrupted
- Location independence - Content can be retrieved from any node holding it
- Efficient caching - Content can be safely cached and shared across nodes
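The idea of identifying content by its bytes rather than its name can be illustrated with a minimal sketch. It assumes SHA-256 as a stand-in hash, since this page does not name the hash TAHO uses, and `content_id` is a hypothetical helper, not a TAHO function.

```python
import hashlib


def content_id(data: bytes) -> str:
    """Derive an identifier from the bytes alone (SHA-256 is an assumed stand-in)."""
    return hashlib.sha256(data).hexdigest()


# Two "files" with different names but identical bytes share one ID.
files = {
    "model-v1.bin": b"weights",
    "copy-of-model.bin": b"weights",
    "model-v2.bin": b"weights!",
}
ids = {name: content_id(data) for name, data in files.items()}

print(ids["model-v1.bin"] == ids["copy-of-model.bin"])  # True
print(ids["model-v1.bin"] == ids["model-v2.bin"])       # False
```

Because the filename never enters the hash, renaming or copying a file leaves its ID unchanged, while changing a single byte produces a different ID.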
Content IDs
Content IDs are cryptographic hashes that uniquely identify content. TAHO content IDs are self-describing identifiers with size and validation data woven into the hash. Nodes can inspect content properties, preallocate memory, and filter requests - all before a single byte is transferred. Once content is published and a content ID is generated, the content is immutable. The same bytes will always produce the same content ID, and any modification creates a new content ID. This guarantees that when you fetch content by ID, you receive exactly what was published.
Publishing Content
Publishing content to the exchange returns a content ID that you can use to retrieve it later.
What Happens When You Publish
- Hash calculation - TAHO streams the file and calculates its hash
- Content ID generation - The hash becomes the content ID
- Local storage - Content is stored locally in the content exchange
- Network announcement - Your node announces content availability to The Mesh via gossip protocol
- ID output - The content ID is printed for later retrieval
Publishing uses streaming I/O, so large files (even multi-gigabyte models) are
handled efficiently without loading everything into memory.
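The streaming hash step described above might look roughly like the following sketch. The chunk size, the `streaming_content_id` name, and the use of SHA-256 are all assumptions made for illustration. The point is only that the hash is updated one chunk at a time, so memory use is bounded by the chunk size rather than the file size.

```python
import hashlib
import io
from typing import BinaryIO

CHUNK_SIZE = 1024 * 1024  # hypothetical 1 MiB read size


def streaming_content_id(stream: BinaryIO, chunk_size: int = CHUNK_SIZE) -> str:
    """Hash a stream chunk by chunk so memory use stays bounded by chunk_size."""
    hasher = hashlib.sha256()  # assumed stand-in for TAHO's hash
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        hasher.update(chunk)
    return hasher.hexdigest()


# The streamed digest matches hashing the whole payload at once.
payload = b"x" * (3 * CHUNK_SIZE + 17)
streamed = streaming_content_id(io.BytesIO(payload))
print(streamed == hashlib.sha256(payload).hexdigest())  # True
```

In practice the stream would be a file opened in binary mode; `io.BytesIO` is used here only to keep the example self-contained.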
Content Storage Locations
TAHO stores published content locally using a multi-tier storage system:
- Memory cache - Frequently accessed content kept in memory (100MB default limit)
- File storage - Content persisted to disk in node-specific directories
- Hot promotion - Frequently accessed content automatically promoted to memory cache
On disk, content is stored under ~/.taho/data/content/<node-id>/.
Retrieving Content
When you need content, the Content Exchange retrieves it from local storage or automatically fetches it from remote peers.
Local Retrieval
If content exists in local storage, it’s returned immediately.
Remote Retrieval
If content is not available locally, TAHO automatically discovers and fetches it from The Mesh:
- Discovery request - Your node broadcasts a content discovery request via gossip protocol
- Holder announcements - Nodes holding the content respond with announcements
- Peer selection - TAHO selects an optimal peer based on availability
- Content fetch - Content is fetched directly from the holder via request-response protocol
- Verification - Downloaded content is verified against the content ID hash
- Local caching - Content is stored locally for future requests
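The retrieval steps above can be sketched as a single function. This is a simplified model under stated assumptions: peers are plain dictionaries rather than network nodes, peer selection is reduced to "first holder whose copy verifies", and SHA-256 stands in for TAHO's hash. The names `retrieve` and `content_id` are hypothetical.

```python
import hashlib
from typing import Dict, Optional


def content_id(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()  # assumed stand-in hash


def retrieve(
    cid: str,
    local: Dict[str, bytes],
    peers: Dict[str, Dict[str, bytes]],
) -> Optional[bytes]:
    """Return content for cid from local storage, else from a peer with a valid copy."""
    if cid in local:  # local retrieval
        return local[cid]
    # Discovery: which peers claim to hold cid?
    holders = [name for name, store in peers.items() if cid in store]
    for name in holders:  # peer selection, simplified
        data = peers[name][cid]  # content fetch
        if content_id(data) == cid:  # verification against the ID
            local[cid] = data  # local caching for future requests
            return data
    return None


original = b"dataset shard"
cid = content_id(original)
peers = {"bad-peer": {cid: b"corrupted"}, "good-peer": {cid: original}}
local: Dict[str, bytes] = {}

print(retrieve(cid, local, peers) == original)  # True: the corrupted copy was skipped
print(cid in local)                             # True: now cached locally
```

The corrupted copy from `bad-peer` fails verification and is never written to local storage, which is the property the verification step is meant to provide.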
Content Discovery
The Content Exchange uses The Mesh’s gossip protocol for content discovery.
Content Announcements
When a node publishes content or comes online with existing content, it announces availability to The Mesh. These announcements are gossiped across the network, so all nodes maintain an index of content locations.
Discovery Process
When a node needs content, it broadcasts a discovery request. Holders respond with announcements containing their peer information. The requesting node then initiates a direct fetch from a selected holder.
Gossip Protocol Integration
The Content Exchange leverages The Mesh’s gossipsub protocol for efficient content discovery:
- Topic subscription - Nodes subscribe to the “content-exchange” gossip topic
- Event broadcasting - Content events are serialized and published to the topic
- Network-wide propagation - Events spread across The Mesh, reaching all nodes
- Holder index maintenance - Each node maintains a local index of content holders
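As an illustration of holder index maintenance, the sketch below shows one way a node could fold incoming announcements into a local index. The JSON event shape, the field names, and the `HolderIndex` class are hypothetical; this page does not describe TAHO's actual wire format, and no real gossip transport is involved.

```python
import json
from collections import defaultdict
from typing import DefaultDict, Set


def encode_announcement(content_id: str, peer_id: str) -> bytes:
    """Serialize a hypothetical 'holder announced' event for the gossip topic."""
    event = {"type": "announce", "content_id": content_id, "peer": peer_id}
    return json.dumps(event).encode()


class HolderIndex:
    """Local map from content ID to the set of peers known to hold it."""

    def __init__(self) -> None:
        self.holders: DefaultDict[str, Set[str]] = defaultdict(set)

    def on_message(self, raw: bytes) -> None:
        """Handle one message received from the topic."""
        event = json.loads(raw)
        if event.get("type") == "announce":
            self.holders[event["content_id"]].add(event["peer"])

    def lookup(self, content_id: str) -> Set[str]:
        """Return a copy of the known holders without creating an empty entry."""
        return set(self.holders.get(content_id, set()))


index = HolderIndex()
for peer in ("node-a", "node-b", "node-a"):  # duplicate gossip is harmless
    index.on_message(encode_announcement("cid-123", peer))

print(sorted(index.lookup("cid-123")))  # ['node-a', 'node-b']
print(index.lookup("cid-unknown"))      # set()
```

Storing holders in a set makes the update idempotent, which matters because gossip protocols can deliver the same announcement more than once.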
Use Cases
Large File Distribution
Distribute large ML models, datasets, or media files without centralized storage.
Distributed AI/ML Models
TAHO’s inference system uses the Content Exchange for storing and distributing ML models:
- Model partitioning - Large models (>2GB) are partitioned into smaller subgraphs
- Content-addressed partitions - Each partition gets its own content ID
- Automatic deduplication - Common subgraphs across models are stored once
- Distributed loading - Nodes fetch model partitions from The Mesh as needed
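Deduplication follows directly from content addressing, as the sketch below tries to show. Byte strings stand in for model subgraphs, a dictionary stands in for the exchange, and SHA-256 is an assumed hash; `publish_partitions` is a hypothetical helper, and TAHO's real partitioning logic is not described on this page.

```python
import hashlib
from typing import Dict, List


def content_id(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()  # assumed stand-in hash


def publish_partitions(partitions: List[bytes], store: Dict[str, bytes]) -> List[str]:
    """Store each partition under its own ID; identical partitions share one entry."""
    manifest = []
    for part in partitions:
        cid = content_id(part)
        store.setdefault(cid, part)  # no-op if these bytes are already stored
        manifest.append(cid)
    return manifest


store: Dict[str, bytes] = {}
shared = b"tokenizer+embedding subgraph"
model_a = publish_partitions([shared, b"decoder A"], store)
model_b = publish_partitions([shared, b"decoder B"], store)

print(len(store))                # 3 entries for 4 partitions
print(model_a[0] == model_b[0])  # True: the shared partition has one ID
```

Each model is then just an ordered list of content IDs, and a node that already holds the shared partition only needs to fetch the part that differs.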
Development Asset Sharing
Share development assets across team members or build machines.
Storage Backends
The Content Exchange supports multiple storage backends that can be composed.
Memory Store
In-memory cache with LRU eviction:
- Fast access for hot content
- Configurable size limit (100MB default)
- No persistence - cleared on restart
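A size-limited LRU cache of this kind can be sketched with an ordered dictionary. The `MemoryStore` class below is a hypothetical illustration, not TAHO's implementation; only the 100MB default comes from the text above.

```python
from collections import OrderedDict
from typing import Optional


class MemoryStore:
    """Byte-budgeted LRU cache keyed by content ID."""

    def __init__(self, max_bytes: int = 100 * 1024 * 1024) -> None:
        self.max_bytes = max_bytes
        self.used = 0
        self.items: "OrderedDict[str, bytes]" = OrderedDict()

    def get(self, cid: str) -> Optional[bytes]:
        data = self.items.get(cid)
        if data is not None:
            self.items.move_to_end(cid)  # mark as most recently used
        return data

    def put(self, cid: str, data: bytes) -> None:
        if len(data) > self.max_bytes:
            return  # too large to cache at all
        if cid in self.items:
            self.used -= len(self.items.pop(cid))
        self.items[cid] = data
        self.used += len(data)
        while self.used > self.max_bytes:
            _, evicted = self.items.popitem(last=False)  # drop least recently used
            self.used -= len(evicted)


cache = MemoryStore(max_bytes=10)
cache.put("a", b"12345")
cache.put("b", b"12345")
cache.get("a")            # "a" is now more recent than "b"
cache.put("c", b"12345")  # over budget, so "b" is evicted
print(list(cache.items))  # ['a', 'c']
```

The budget is counted in bytes rather than entries, which matches a limit expressed as a size.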
File Store
Persistent file-based storage:
- Content stored as individual files named by content ID
- Per-node isolation via subdirectories
- No automatic eviction
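The file layout described above, one file per content ID inside a per-node subdirectory, could be modelled as follows. The `FileStore` class, the temporary root directory, and SHA-256 are assumptions for illustration only.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Optional


class FileStore:
    """Persist each blob as <root>/<node_id>/<content_id>."""

    def __init__(self, root: Path, node_id: str) -> None:
        self.dir = root / node_id  # per-node isolation
        self.dir.mkdir(parents=True, exist_ok=True)

    def put(self, data: bytes) -> str:
        cid = hashlib.sha256(data).hexdigest()  # assumed stand-in hash
        path = self.dir / cid
        if not path.exists():  # same bytes, same file: nothing to rewrite
            path.write_bytes(data)
        return cid

    def get(self, cid: str) -> Optional[bytes]:
        path = self.dir / cid
        return path.read_bytes() if path.exists() else None


root = Path(tempfile.mkdtemp())  # throwaway directory standing in for the data root
node_a, node_b = FileStore(root, "node-a"), FileStore(root, "node-b")
cid = node_a.put(b"build artifact")

print(node_a.get(cid) == b"build artifact")  # True
print(node_b.get(cid))                       # None: node-b has its own directory
```

Because files are named by content ID, writing the same bytes twice is naturally idempotent.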
Composite Store
Combines memory and file storage with hot content promotion:
- Frequently accessed content promoted to memory cache
- Memory evictions remain in file storage
- Provides both speed and persistence
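One possible shape for hot promotion is sketched below. The promotion policy (promote after a fixed number of reads) and the `CompositeStore` class are hypothetical, since this page does not say how TAHO decides that content is hot. A dictionary stands in for file storage to keep the example short.

```python
from collections import Counter
from typing import Dict, Optional


class CompositeStore:
    """Read-through store: memory first, then disk, promoting hot entries to memory."""

    def __init__(self, promote_after: int = 3) -> None:
        self.memory: Dict[str, bytes] = {}
        self.disk: Dict[str, bytes] = {}  # a dict stands in for file storage
        self.hits: Counter = Counter()
        self.promote_after = promote_after  # hypothetical hotness threshold

    def put(self, cid: str, data: bytes) -> None:
        self.disk[cid] = data  # every write is persisted

    def get(self, cid: str) -> Optional[bytes]:
        if cid in self.memory:
            return self.memory[cid]
        data = self.disk.get(cid)
        if data is not None:
            self.hits[cid] += 1
            if self.hits[cid] >= self.promote_after:
                self.memory[cid] = data  # hot promotion
        return data

    def evict_from_memory(self, cid: str) -> None:
        self.memory.pop(cid, None)  # the disk copy is untouched


store = CompositeStore(promote_after=2)
store.put("cid-1", b"hot blob")
store.get("cid-1")
print("cid-1" in store.memory)  # False after one read
store.get("cid-1")
print("cid-1" in store.memory)  # True after the second read
store.evict_from_memory("cid-1")
print(store.get("cid-1"))       # b'hot blob', still served from disk
```

Eviction only removes the in-memory copy, so a memory eviction never loses data.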
Programmatic Access
The Content Exchange can be accessed programmatically in Rust applications.
Security and Verification
Content Integrity
Content IDs provide cryptographic verification:
- Hash verification - After fetching content, TAHO recalculates the hash
- Comparison - The calculated hash must match the content ID
- Rejection - Mismatched content is rejected and not stored
Immutability Guarantees
Content is immutable once published. Any attempt to modify content results in a new content ID. This provides:
- Tamper evidence - Modified content has a different ID
- Version clarity - Each version has a unique, permanent identifier
- Reproducibility - Same content ID always means same content
Next Steps
- The Mesh - Learn about TAHO’s P2P network that powers content discovery
- Content Commands - Detailed CLI reference for content operations
- AI/ML Inference - How inference leverages the Content Exchange