# Content Exchange

> Content-addressed storage with automatic peer discovery

## What is the Content Exchange?

The Content Exchange is TAHO's content-addressed storage system that identifies content by cryptographic hash rather than by location or filename. When you publish a file, TAHO generates a unique content ID that represents the exact bytes of that file. The same content always produces the same ID, regardless of filename or metadata.

This approach provides:

* **Integrity verification** - Recomputing the hash and comparing it with the content ID shows whether content has been corrupted or altered
* **Location independence** - Content can be retrieved from any node holding it
* **Efficient caching** - Content can be safely cached and shared across nodes
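The core property, that the ID depends on the bytes and nothing else, can be sketched as follows. This is an illustration only: `stand_in_digest`, `NamedFile`, `content_id`, and `describe` are hypothetical names, and the 64-bit FNV-1a digest is a non-cryptographic stand-in chosen to avoid external crates. It is not TAHO's actual hash or ID format.

```rust
/// Stand-in digest (64-bit FNV-1a). It is NOT cryptographic; it only keeps
/// this sketch free of external crates. A real content ID needs a
/// cryptographic hash.
fn stand_in_digest(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325; // FNV offset basis
    for &byte in bytes {
        hash ^= u64::from(byte);
        hash = hash.wrapping_mul(0x100000001b3); // FNV prime
    }
    hash
}

/// Hypothetical file: `name` is metadata, `bytes` is the content.
struct NamedFile<'a> {
    name: &'a str,
    bytes: &'a [u8],
}

/// The ID is computed from the bytes only, so renaming never changes it.
fn content_id(file: &NamedFile) -> String {
    format!("{:016x}", stand_in_digest(file.bytes))
}

/// Human-readable line pairing a file name with its ID.
fn describe(file: &NamedFile) -> String {
    format!("{} -> {}", file.name, content_id(file))
}
```

Two `NamedFile` values with different names but identical bytes yield the same `content_id`, while changing a single byte yields a different one.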

## Content IDs

Content IDs are cryptographic hashes that uniquely identify content. TAHO content IDs are **self-describing identifiers**: size and validation data are embedded in the ID itself, so nodes can inspect content properties, preallocate memory, and filter requests before a single byte is transferred.

Once content is published and a content ID is generated, the content is immutable. The same bytes will always produce the same content ID, and any modification creates a new content ID. This guarantees that when you fetch content by ID, you receive exactly what was published.
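To make the "inspect before transfer" idea concrete, here is a minimal sketch of a self-describing identifier. The layout (16 hex digits of size followed by 16 hex digits of digest) and the names `ContentId`, `encode`, `decode`, and `plan_fetch` are assumptions made for illustration; TAHO's real ID encoding is not described here and will differ.

```rust
/// Hypothetical decoded view of a self-describing content ID.
#[derive(Debug, Clone, PartialEq, Eq)]
struct ContentId {
    size: u64,   // content length in bytes
    digest: u64, // stand-in for the hash portion
}

impl ContentId {
    /// Assumed text layout: 16 hex digits of size, then 16 of digest.
    fn encode(&self) -> String {
        format!("{:016x}{:016x}", self.size, self.digest)
    }

    /// Parse the assumed layout; returns None for malformed input.
    fn decode(text: &str) -> Option<ContentId> {
        if text.len() != 32 || !text.is_ascii() {
            return None;
        }
        let size = u64::from_str_radix(&text[..16], 16).ok()?;
        let digest = u64::from_str_radix(&text[16..], 16).ok()?;
        Some(ContentId { size, digest })
    }
}

/// Decide whether to request content, and preallocate a buffer for it,
/// using only the ID. No content bytes are needed at this point.
fn plan_fetch(id: &str, max_size: u64) -> Option<Vec<u8>> {
    let parsed = ContentId::decode(id)?;
    if parsed.size > max_size {
        return None; // filtered out before any transfer starts
    }
    Some(Vec::with_capacity(parsed.size as usize))
}
```

A node with a 1 KiB limit would accept an ID that declares 100 bytes and reject one that declares 1 GiB, in both cases without contacting a peer.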

## Publishing Content

Publish content to the exchange and receive a content ID:

```bash theme={"theme":{"light":"github-light","dark":"github-dark"}}
# Publish a file
taho publish ./model.onnx
# Output: 000006VX0Q0HARDX7YPGSCFMY9J7NP1Z6AFNNFDAGKJD1X3A6H6S9E0240

# Publish any file type
taho publish ./dataset.tar.gz
taho publish ./document.pdf
taho publish ./image.png
```

### What Happens When You Publish

1. **Hash calculation** - TAHO streams the file and calculates its hash
2. **Content ID generation** - The content ID is derived from the hash
3. **Local storage** - Content is stored locally in the content exchange
4. **Network announcement** - Your node announces content availability to The Mesh via gossip protocol
5. **ID output** - The content ID is printed for later retrieval

<Note>
  Publishing uses streaming I/O, so large files (even multi-gigabyte models) are
  handled efficiently without loading everything into memory.
</Note>
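Streaming hash calculation can be sketched like this: the file is read in fixed-size chunks and each chunk is fed into an incremental digest, so memory use stays bounded by the chunk size. `StreamingDigest` and `digest_reader` are hypothetical names, and FNV-1a is again a non-cryptographic stand-in for whatever hash TAHO actually uses.

```rust
use std::io::{self, Read};

/// Incremental stand-in digest (64-bit FNV-1a, NOT cryptographic).
struct StreamingDigest(u64);

impl StreamingDigest {
    fn new() -> Self {
        StreamingDigest(0xcbf29ce484222325)
    }

    /// Fold one chunk into the running state.
    fn update(&mut self, chunk: &[u8]) {
        for &byte in chunk {
            self.0 ^= u64::from(byte);
            self.0 = self.0.wrapping_mul(0x100000001b3);
        }
    }

    fn finish(&self) -> u64 {
        self.0
    }
}

/// Hash everything a reader yields while holding at most `chunk_size`
/// bytes in memory at a time.
fn digest_reader<R: Read>(mut reader: R, chunk_size: usize) -> io::Result<u64> {
    let mut digest = StreamingDigest::new();
    let mut buf = vec![0u8; chunk_size.max(1)];
    loop {
        match reader.read(&mut buf) {
            Ok(0) => break, // end of input
            Ok(n) => digest.update(&buf[..n]),
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(digest.finish())
}
```

Passing a `std::fs::File` as the reader hashes a file of any size; the result does not depend on the chunk size.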

### Content Storage Locations

TAHO stores published content locally using a multi-tier storage system:

* **Memory cache** - Frequently accessed content kept in memory (100MB default limit)
* **File storage** - Content persisted to disk in node-specific directories
* **Hot promotion** - Frequently accessed content automatically promoted to memory cache

Content files are stored by content ID in TAHO's data directory, typically `~/.taho/data/content/<node-id>/`.
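As a small illustration of that layout, the path of a stored file could be derived as below. `content_path` is a hypothetical helper, and the exact directory structure is an assumption based on the typical path above; the caller supplies the already-expanded data directory, since Rust does not expand `~`.

```rust
use std::path::{Path, PathBuf};

/// Build `<data_dir>/content/<node-id>/<content-id>`.
/// Assumed layout, inferred from the typical location named in the text.
fn content_path(data_dir: &Path, node_id: &str, content_id: &str) -> PathBuf {
    data_dir.join("content").join(node_id).join(content_id)
}
```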

## Retrieving Content

When you need content, the Content Exchange retrieves it from local storage or automatically fetches it from remote peers.

### Local Retrieval

If content exists in local storage, it's returned immediately:

```rust theme={"theme":{"light":"github-light","dark":"github-dark"}}
// Content available locally: returned without contacting the network
let content = client.get(&content_id).await?;
```

### Remote Retrieval

If content is not available locally, TAHO automatically discovers and fetches it from The Mesh:

1. **Discovery request** - Your node broadcasts a content discovery request via gossip protocol
2. **Holder announcements** - Nodes holding the content respond with announcements
3. **Peer selection** - TAHO selects a peer from the responding holders based on availability
4. **Content fetch** - Content is fetched directly from the holder via request-response protocol
5. **Verification** - Downloaded content is verified against the content ID hash
6. **Local caching** - Content is stored locally for future requests
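The local-first flow with remote fallback, verification, and caching can be sketched in a few lines. Everything here is a simplified model: `Node`, `Peer`, `Source`, and `stand_in_id` are hypothetical, peers are plain in-memory maps rather than network connections, discovery is reduced to a list of holders passed in by the caller, and the digest is a non-cryptographic stand-in.

```rust
use std::collections::HashMap;

/// Stand-in content ID (64-bit FNV-1a as hex, NOT cryptographic).
fn stand_in_id(bytes: &[u8]) -> String {
    let mut hash: u64 = 0xcbf29ce484222325;
    for &byte in bytes {
        hash ^= u64::from(byte);
        hash = hash.wrapping_mul(0x100000001b3);
    }
    format!("{:016x}", hash)
}

/// Hypothetical remote holder. A faulty one may serve the wrong bytes.
struct Peer {
    store: HashMap<String, Vec<u8>>,
}

#[derive(Debug, PartialEq)]
enum Source {
    Local,
    Remote,
}

struct Node {
    local: HashMap<String, Vec<u8>>,
}

impl Node {
    /// Return content from local storage, or fetch it from the first
    /// holder whose bytes verify against the ID, then cache it.
    fn get(&mut self, id: &str, holders: &[&Peer]) -> Option<(Vec<u8>, Source)> {
        if let Some(bytes) = self.local.get(id) {
            return Some((bytes.clone(), Source::Local));
        }
        for peer in holders {
            if let Some(bytes) = peer.store.get(id) {
                // Verification: recompute the ID from the received bytes.
                if stand_in_id(bytes) == id {
                    self.local.insert(id.to_string(), bytes.clone());
                    return Some((bytes.clone(), Source::Remote));
                }
            }
        }
        None
    }
}
```

In this model, a first `get` for content held only by a peer reports `Source::Remote`, and a second `get` for the same ID reports `Source::Local` because the verified bytes were cached.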

## Content Discovery

The Content Exchange uses The Mesh's gossip protocol for content discovery.

### Content Announcements

When a node publishes content or comes online with existing content, it announces availability to The Mesh. These announcements are gossiped across the network, so all nodes maintain an index of content locations.

```bash theme={"theme":{"light":"github-light","dark":"github-dark"}}
# Node A publishes content
taho publish ./model.onnx
# Output: 000006VX0Q0HARDX7YPGSCFMY9J7NP1Z6AFNNFDAGKJD1X3A6H6S9E0240
# Announcement gossiped: "Node A has 000006VX0Q0HARDX7YPGSCFMY9J7NP1Z6AFNNFDAGKJD1X3A6H6S9E0240"

# Node B receives the gossip
# Node B now knows Node A holds this content
```

### Discovery Process

When a node needs content, it broadcasts a discovery request. Holders respond with announcements containing their peer information. The requesting node then initiates a direct fetch from a selected holder.

```mermaid theme={"theme":{"light":"github-light","dark":"github-dark"}}
flowchart TB
    NodeA[Node A<br/>needs content] -->|Discovery Request| Mesh[The Mesh<br/>gossip]
    Mesh -->|Propagate| NodeB[Node B<br/>holder]
    Mesh -->|Propagate| NodeC[Node C<br/>holder]
    NodeB -->|Announcement| NodeA
    NodeC -->|Announcement| NodeA
    NodeA -->|Fetch| NodeB
    NodeB -->|Data| NodeA
```
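The request and announcement exchange in the diagram can be modeled as a simple in-process simulation. `MeshNode`, `Announcement`, `mesh_node`, `discover`, and `select_holder` are hypothetical names; gossip propagation is reduced to iterating over a slice, and the selection rule (lowest node ID) is a placeholder, since the actual availability-based policy is not specified here.

```rust
use std::collections::HashSet;

/// Hypothetical node on the mesh with the set of content IDs it holds.
struct MeshNode {
    id: String,
    address: String,
    held: HashSet<String>,
}

/// What a holder sends back: enough peer information to fetch directly.
#[derive(Debug, Clone, PartialEq)]
struct Announcement {
    node_id: String,
    address: String,
}

/// Convenience constructor for building a small mesh.
fn mesh_node(id: &str, address: &str, held: &[&str]) -> MeshNode {
    MeshNode {
        id: id.to_string(),
        address: address.to_string(),
        held: held.iter().map(|c| c.to_string()).collect(),
    }
}

/// Propagate a discovery request to every node; holders answer.
fn discover(mesh: &[MeshNode], content_id: &str) -> Vec<Announcement> {
    mesh.iter()
        .filter(|node| node.held.contains(content_id))
        .map(|node| Announcement {
            node_id: node.id.clone(),
            address: node.address.clone(),
        })
        .collect()
}

/// Placeholder policy: pick the holder with the lowest node ID.
fn select_holder(announcements: &[Announcement]) -> Option<&Announcement> {
    announcements.iter().min_by(|a, b| a.node_id.cmp(&b.node_id))
}
```

The requesting node would then open a direct connection to the `address` of the selected announcement and fetch the content from that holder.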

### Gossip Protocol Integration

The Content Exchange leverages The Mesh's gossipsub protocol for efficient content discovery:

* **Topic subscription** - Nodes subscribe to the "content-exchange" gossip topic
* **Event broadcasting** - Content events are serialized and published to the topic
* **Network-wide propagation** - Events spread across The Mesh, reaching all nodes
* **Holder index maintenance** - Each node maintains a local index of content holders
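A minimal sketch of event serialization and holder index maintenance follows. The `ContentEvent` variants, the space-separated text format, and `HolderIndex` are all assumptions made for illustration; the real wire format and event types on the "content-exchange" topic are not documented here.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical events carried on the gossip topic.
#[derive(Debug, Clone, PartialEq)]
enum ContentEvent {
    DiscoveryRequest { content_id: String, requester: String },
    Announcement { content_id: String, holder: String },
}

impl ContentEvent {
    /// Assumed text format: "<kind> <content-id> <node-id>".
    fn encode(&self) -> String {
        match self {
            ContentEvent::DiscoveryRequest { content_id, requester } => {
                format!("request {} {}", content_id, requester)
            }
            ContentEvent::Announcement { content_id, holder } => {
                format!("announce {} {}", content_id, holder)
            }
        }
    }

    fn decode(line: &str) -> Option<ContentEvent> {
        let mut parts = line.split_whitespace();
        let kind = parts.next()?;
        let content_id = parts.next()?.to_string();
        let node = parts.next()?.to_string();
        if parts.next().is_some() {
            return None; // trailing fields: not a valid event
        }
        match kind {
            "request" => Some(ContentEvent::DiscoveryRequest { content_id, requester: node }),
            "announce" => Some(ContentEvent::Announcement { content_id, holder: node }),
            _ => None,
        }
    }
}

/// Local index of which nodes hold which content.
#[derive(Default)]
struct HolderIndex {
    holders: HashMap<String, HashSet<String>>,
}

impl HolderIndex {
    /// Only announcements change the index; requests are ignored here.
    fn apply(&mut self, event: &ContentEvent) {
        if let ContentEvent::Announcement { content_id, holder } = event {
            self.holders
                .entry(content_id.clone())
                .or_default()
                .insert(holder.clone());
        }
    }

    /// Known holders of a content ID, sorted for stable output.
    fn holders_of(&self, content_id: &str) -> Vec<String> {
        let mut holders: Vec<String> = self
            .holders
            .get(content_id)
            .map(|set| set.iter().cloned().collect())
            .unwrap_or_default();
        holders.sort();
        holders
    }
}
```

Each node would call `apply` for every event it receives on the topic, so that `holders_of` can answer later lookups without another network round trip.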

## Use Cases

### Large File Distribution

Distribute large ML models, datasets, or media files without centralized storage:

```bash theme={"theme":{"light":"github-light","dark":"github-dark"}}
# Publisher shares a 3.4GB model
taho publish ./stable-diffusion-unet.onnx
# Output: 000006VX0Q0HARDX7YPGSCFMY9J7NP1Z6AFNNFDAGKJD1X3A6H6S9E0240

# Consumers automatically fetch from any holder
# Multiple consumers can fetch from different peers simultaneously
```

### Distributed AI/ML Models

TAHO's inference system uses the Content Exchange for storing and distributing ML models:

* **Model partitioning** - Large models (>2GB) are partitioned into smaller subgraphs
* **Content-addressed partitions** - Each partition gets its own content ID
* **Automatic deduplication** - Common subgraphs across models are stored once
* **Distributed loading** - Nodes fetch model partitions from The Mesh as needed

See [AI/ML Inference](/core-concepts/ai-ml-inference) for more details.
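Deduplication falls out of content addressing: if two models share a partition, both manifests point at the same ID and the bytes are stored once. The sketch below illustrates this with a hypothetical `PartitionStore` and a non-cryptographic stand-in digest; how TAHO actually partitions models and records manifests is not covered here.

```rust
use std::collections::HashMap;

/// Stand-in content ID (64-bit FNV-1a as hex, NOT cryptographic).
fn stand_in_id(bytes: &[u8]) -> String {
    let mut hash: u64 = 0xcbf29ce484222325;
    for &byte in bytes {
        hash ^= u64::from(byte);
        hash = hash.wrapping_mul(0x100000001b3);
    }
    format!("{:016x}", hash)
}

/// Hypothetical store keyed by content ID.
#[derive(Default)]
struct PartitionStore {
    blobs: HashMap<String, Vec<u8>>,
}

impl PartitionStore {
    /// Store each partition under its own ID and return the model's
    /// manifest (the ordered list of partition IDs).
    fn publish_model(&mut self, partitions: &[&[u8]]) -> Vec<String> {
        let mut manifest = Vec::with_capacity(partitions.len());
        for &partition in partitions {
            let id = stand_in_id(partition);
            // An identical partition is already present: nothing to store.
            self.blobs
                .entry(id.clone())
                .or_insert_with(|| partition.to_vec());
            manifest.push(id);
        }
        manifest
    }

    fn stored_bytes(&self) -> usize {
        self.blobs.values().map(Vec::len).sum()
    }
}
```

Publishing two models that share one partition leaves three blobs in the store rather than four, and the shared partition appears under the same ID in both manifests.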

### Development Asset Sharing

Share development assets across team members or build machines:

```bash theme={"theme":{"light":"github-light","dark":"github-dark"}}
# Share compiled artifacts
taho publish ./target/release/binary

# Share test fixtures
taho publish ./fixtures/test-data.json

# Share configuration snapshots
taho publish ./config/production.toml
```

## Storage Backends

The Content Exchange supports multiple storage backends that can be composed:

### Memory Store

In-memory cache with LRU eviction:

* Fast access for hot content
* Configurable size limit (100MB default)
* No persistence - cleared on restart

### File Store

Persistent file-based storage:

* Content stored as individual files named by content ID
* Per-node isolation via subdirectories
* No automatic eviction

### Composite Store

Combines memory and file storage with hot content promotion:

* Frequently accessed content promoted to memory cache
* Memory evictions remain in file storage
* Provides both speed and persistence

The default configuration uses the composite store for optimal performance and persistence.
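How the tiers compose can be sketched as below. This is a toy model under stated assumptions: `MemoryStore` and `CompositeStore` are hypothetical types, the file tier is represented by an in-memory map rather than real files, and "frequently accessed" is approximated by a read-count threshold (`promote_after`), which may not match TAHO's actual promotion policy.

```rust
use std::collections::{HashMap, VecDeque};

/// In-memory tier with a byte budget and least-recently-used eviction.
struct MemoryStore {
    limit: usize,
    used: usize,
    entries: HashMap<String, Vec<u8>>,
    order: VecDeque<String>, // front = least recently used
}

impl MemoryStore {
    fn new(limit: usize) -> Self {
        MemoryStore { limit, used: 0, entries: HashMap::new(), order: VecDeque::new() }
    }

    /// Mark an ID as most recently used.
    fn touch(&mut self, id: &str) {
        self.order.retain(|key| key != id);
        self.order.push_back(id.to_string());
    }

    fn get(&mut self, id: &str) -> Option<Vec<u8>> {
        let bytes = self.entries.get(id)?.clone();
        self.touch(id);
        Some(bytes)
    }

    fn put(&mut self, id: &str, bytes: Vec<u8>) {
        if bytes.len() > self.limit || self.entries.contains_key(id) {
            return;
        }
        // Evict least recently used entries until the new one fits.
        while self.used + bytes.len() > self.limit {
            match self.order.pop_front() {
                Some(oldest) => {
                    if let Some(old) = self.entries.remove(&oldest) {
                        self.used -= old.len();
                    }
                }
                None => break,
            }
        }
        self.used += bytes.len();
        self.entries.insert(id.to_string(), bytes);
        self.touch(id);
    }
}

/// Memory tier in front of a persistent tier (a map stands in for files).
struct CompositeStore {
    memory: MemoryStore,
    files: HashMap<String, Vec<u8>>,
    reads: HashMap<String, u32>,
    promote_after: u32, // assumed promotion threshold
}

impl CompositeStore {
    fn new(memory_limit: usize, promote_after: u32) -> Self {
        CompositeStore {
            memory: MemoryStore::new(memory_limit),
            files: HashMap::new(),
            reads: HashMap::new(),
            promote_after,
        }
    }

    /// New content always lands in the persistent tier.
    fn put(&mut self, id: &str, bytes: Vec<u8>) {
        self.files.insert(id.to_string(), bytes);
    }

    fn get(&mut self, id: &str) -> Option<Vec<u8>> {
        if let Some(bytes) = self.memory.get(id) {
            return Some(bytes);
        }
        let bytes = self.files.get(id)?.clone();
        let count = self.reads.entry(id.to_string()).or_insert(0);
        *count += 1;
        if *count >= self.promote_after {
            self.memory.put(id, bytes.clone()); // hot promotion
        }
        Some(bytes)
    }

    fn in_memory(&self, id: &str) -> bool {
        self.memory.entries.contains_key(id)
    }
}
```

Because `put` writes to the persistent tier first, an entry evicted from memory is never lost; it is simply served from the slower tier until it becomes hot again.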

## Programmatic Access

The Content Exchange can be accessed programmatically in Rust applications:

```rust theme={"theme":{"light":"github-light","dark":"github-dark"}}
use taho_common::{NodeContext, content::Content};
use taho_content_exchange::ContentSystemService;

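// Note: this snippet must run inside an async function that returns a
// Result, because it uses `.await?`.

// Get client for the current node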
let context = NodeContext::default();
let client = ContentSystemService::client(context.clone());

// Store content
let content = Content::from_slice(b"data");
let content_id = content.content_id();
client.put(content).await?;

// Retrieve content (local or remote)
let result = client.get(&content_id).await?;

// Check for local presence
let has_it = client.has(&content_id).await?;
```

## Security and Verification

### Content Integrity

Content IDs provide cryptographic verification:

1. **Hash verification** - After fetching content, TAHO recalculates the hash
2. **Comparison** - The calculated hash must match the content ID
3. **Rejection** - Mismatched content is rejected and not stored

This ensures that any content you receive matches, byte for byte, the content ID you requested.
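The three steps map directly onto a small function that reports both hashes on failure. `verify`, `store_if_valid`, and `HashMismatch` are hypothetical names, and the FNV-1a digest is a non-cryptographic stand-in used only to keep the sketch dependency-free; with a real cryptographic hash the structure would stay the same.

```rust
use std::collections::HashMap;

/// Stand-in content ID (64-bit FNV-1a as hex, NOT cryptographic).
fn stand_in_id(bytes: &[u8]) -> String {
    let mut hash: u64 = 0xcbf29ce484222325;
    for &byte in bytes {
        hash ^= u64::from(byte);
        hash = hash.wrapping_mul(0x100000001b3);
    }
    format!("{:016x}", hash)
}

/// Returned when fetched bytes do not match the requested ID.
#[derive(Debug, PartialEq)]
struct HashMismatch {
    expected: String,
    actual: String,
}

/// Steps 1 and 2: recalculate the hash and compare it with the ID.
fn verify(expected_id: &str, bytes: &[u8]) -> Result<(), HashMismatch> {
    let actual = stand_in_id(bytes);
    if actual == expected_id {
        Ok(())
    } else {
        Err(HashMismatch { expected: expected_id.to_string(), actual })
    }
}

/// Step 3: mismatched content is rejected before it reaches storage.
fn store_if_valid(
    store: &mut HashMap<String, Vec<u8>>,
    expected_id: &str,
    bytes: Vec<u8>,
) -> Result<(), HashMismatch> {
    verify(expected_id, &bytes)?;
    store.insert(expected_id.to_string(), bytes);
    Ok(())
}
```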

### Immutability Guarantees

Content is immutable once published. Any attempt to modify content results in a new content ID. This provides:

* **Tamper evidence** - Modified content has a different ID
* **Version clarity** - Each version has a unique, permanent identifier
* **Reproducibility** - Same content ID always means same content

## Next Steps

* **[The Mesh](/core-concepts/the-mesh)** - Learn about TAHO's P2P network that powers content discovery
* **[Content Commands](/cli-reference/content-commands)** - Detailed CLI reference for content operations
* **[AI/ML Inference](/core-concepts/ai-ml-inference)** - How inference leverages the Content Exchange
