Sharding
TensorDB distributes data across multiple shards for write parallelism.
Shard Routing
Keys are routed deterministically:

    shard_id = hash(key) % shard_count

Each shard is an independent actor with its own:
- WAL file
- Memtable
- SSTable files and manifest
- Commit counter (monotonically increasing)
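
The routing rule above can be sketched in a few lines of Python. This is only an illustration: the source does not say which hash function TensorDB uses, so CRC32 stands in for it here, and `shard_for` is a hypothetical name. A stable hash matters because Python's built-in `hash()` is randomized per process for strings and bytes, which would send the same key to different shards after a restart.

```python
import zlib


def shard_for(key: bytes, shard_count: int = 4) -> int:
    """Map a key to a shard index in the range [0, shard_count)."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    # Assumption: CRC32 is a stand-in for TensorDB's actual hash function.
    return zlib.crc32(key) % shard_count


# The same key always lands on the same shard.
keys = [b"user:1", b"user:2", b"order:17"]
placement = {key: shard_for(key) for key in keys}
```

Because the result depends on `shard_count`, changing the shard count generally changes where existing keys map under this scheme.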
Single-Writer Actors
Each shard uses a single-writer actor model:
- One writer thread per shard processes writes sequentially
- No fine-grained locks needed — the actor serializes all writes
- Multiple readers via ShardReadHandle (lock-free)
This design eliminates lock contention on the write path within each shard while still allowing concurrent reads.
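
As a rough illustration of the single-writer idea, the sketch below gives a shard one thread that drains a queue and applies writes in order. The `ShardActor` class and its methods are hypothetical and not TensorDB's API, and the sketch models only the write path. It does not attempt to reproduce the lock-free read handle.

```python
import queue
import threading


class ShardActor:
    """Hypothetical single-writer shard: one thread applies all writes in order."""

    def __init__(self, shard_id: int) -> None:
        self.shard_id = shard_id
        self.commit_counter = 0  # only ever incremented by the writer thread
        self.memtable = {}
        self._inbox = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self) -> None:
        # Only this thread mutates shard state, so the write path needs no locks.
        while True:
            item = self._inbox.get()
            if item is None:  # shutdown sentinel
                break
            key, value = item
            self.memtable[key] = value
            self.commit_counter += 1

    def put(self, key: bytes, value: bytes) -> None:
        # Callers on any thread enqueue; the actor serializes the writes.
        self._inbox.put((key, value))

    def close(self) -> None:
        # Drain the remaining writes, then stop the writer thread.
        self._inbox.put(None)
        self._thread.join()


actor = ShardActor(shard_id=0)
for i in range(100):
    actor.put(f"k{i}".encode(), b"v")
actor.close()
```

The queue is the only point where threads synchronize. Everything behind it runs sequentially, which is what makes the commit counter trivially monotonic.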
Parallelism
With shard_count = 4 (default):
- 4 independent write pipelines
- Writes to different shards execute concurrently
- Cross-shard queries (e.g., full table scans) merge results from all shards
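
A cross-shard scan can be pictured as a k-way merge. The sketch below assumes each shard returns its rows already sorted by key, which the source does not state, and `scan_all` is a hypothetical helper rather than a TensorDB function.

```python
import heapq


def scan_all(shards):
    """Merge per-shard lists of (key, value) rows, each already sorted by key."""
    return list(heapq.merge(*shards, key=lambda row: row[0]))


# One result list per shard (shard_count = 4); a shard may be empty.
shard_results = [
    [(b"a", 1), (b"d", 4)],
    [(b"b", 2)],
    [],
    [(b"c", 3), (b"e", 5)],
]
rows = scan_all(shard_results)
```

Every shard contributes to the result, so the cost of a full scan grows with the shard count even though writes get faster.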
Configuration
| Parameter | Default | Description |
|---|---|---|
| shard_count | 4 | Number of independent shards |
Choosing Shard Count
| Workload | Recommendation |
|---|---|
| Single-threaded | 1–2 shards |
| General purpose | 4 shards (default) |
| Write-heavy | 8–16 shards |
| Many cores | Match to core count |
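
The recommendations above could be encoded as a small starting-point helper. `suggest_shard_count` and its workload labels are hypothetical and not part of TensorDB. Where the table gives a range, the sketch returns the lower bound and leaves the upper bound to benchmarking.

```python
import os


def suggest_shard_count(workload: str) -> int:
    """Hypothetical helper mirroring the recommendation table."""
    if workload == "single-threaded":
        return 1  # table suggests 1-2 shards
    if workload == "write-heavy":
        return 8  # table suggests 8-16 shards
    if workload == "many-cores":
        # os.cpu_count() may return None; fall back to the default of 4.
        return os.cpu_count() or 4
    return 4  # general purpose (default)
```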