Message Passing: Design Comparison
Purpose: Compare our design decisions against Beads and Tissue to validate our approach and identify gaps.
Summary Comparison
| Decision | Our Design (v2) | Beads | Tissue |
|---|---|---|---|
| Primary storage | SQLite (WAL mode) | SQLite + JSONL export | JSONL (append-only) |
| Cache/index | N/A (SQLite is primary) | SQLite is primary | SQLite (derived) |
| Write locking | SQLite BEGIN IMMEDIATE | SQLite BEGIN IMMEDIATE | None (git merge) |
| Concurrency model | SQLite transactions | Optimistic (hash IDs) + SQLite txn | Optimistic (git merge) |
| Crash safety | SQLite atomic commit | SQLite transactions | Git (implicit) |
| Heartbeats | Yes (10s interval) | No (daemon only) | No |
| Liveness detection | SQL query on heartbeat timestamps | Not documented | Not documented |
| Large payloads | Blob storage (>4KB) | Compaction/summarization | Not addressed |
| Coordination | Polling + claim-check | `bd ready` queries | `tissue ready` queries |
| Message schema | Explicit (id, ts, from, type, payload) | Implicit (issue events) | Implicit (issue events) |
| Human debugging | JSONL export (read-only) | JSONL in git | JSONL primary |
Decision (2026-01-10): After orch consensus with 3 models, we aligned with Beads' approach (SQLite primary) over Tissue's (JSONL primary). Key factors:
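- Payloads of 1-50KB exceed the ~4KB size usually assumed safe for atomic writes (POSIX only guarantees atomicity up to PIPE_BUF, 4096 bytes on Linux, and only for pipes and FIFOs)
- A crash mid-write still corrupts the log, even with flock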
- SQLite transactions provide true atomicity
- JSONL export preserves human debugging (`tail -f`)
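A minimal sketch of what "SQLite transactions provide true atomicity" looks like in practice, written in Python for illustration rather than in the project's own stack. The table layout, the column names, and the `open_bus`/`send` helpers are assumptions made for this example, not the actual schema.

```python
import json
import sqlite3
import time
import uuid


def open_bus(path: str) -> sqlite3.Connection:
    # isolation_level=None puts the connection in autocommit mode, so
    # BEGIN IMMEDIATE / COMMIT can be issued explicitly below.
    conn = sqlite3.connect(path, isolation_level=None, timeout=5.0)
    conn.execute("PRAGMA journal_mode=WAL")
    # Hypothetical schema: "sender" stands in for the "from" field.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS messages ("
        " id TEXT PRIMARY KEY, ts REAL NOT NULL, sender TEXT NOT NULL,"
        " type TEXT NOT NULL, payload TEXT NOT NULL)"
    )
    return conn


def send(conn: sqlite3.Connection, sender: str, msg_type: str, payload) -> str:
    """Insert one message atomically; a crash leaves either all of it or none."""
    msg_id = uuid.uuid4().hex
    conn.execute("BEGIN IMMEDIATE")  # take the write lock up front
    try:
        conn.execute(
            "INSERT INTO messages (id, ts, sender, type, payload)"
            " VALUES (?, ?, ?, ?, ?)",
            (msg_id, time.time(), sender, msg_type, json.dumps(payload)),
        )
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
    return msg_id
```

Because the commit is all-or-nothing, a 50KB payload is no more exposed to partial writes than a 50-byte one.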
Detailed Analysis
Where We Align
1. JSONL as Source of Truth
All three systems use append-only JSONL as the authoritative store. This is the right call:
- Git-friendly (merges cleanly)
- Human-readable (debuggable with `cat | jq`)
- Simple to implement
2. SQLite as Derived Cache
All three use SQLite for queries, not as primary storage:
- Beads: Always-on cache with dirty tracking
- Tissue: Derived index, gitignored
- Ours: Phase 2 optimization
3. Pull-Based Coordination
All use polling/queries rather than push events:
- `bd ready` / `tissue ready` / our `poll()` function
- Simpler than event-driven, works across process boundaries
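Pull-based coordination can be sketched as a polling loop around an atomic claim. This is an illustrative Python sketch, not the real `poll()`: the `tasks` table, the `claimed_by` column, and the `wait_for_task` helper are hypothetical, and "claim" is assumed to mean marking a task as taken inside one transaction.

```python
import sqlite3
import time


def poll(conn: sqlite3.Connection, agent_id: str):
    """Claim one unclaimed task and return its id, or None if nothing is ready.

    Assumes conn was opened with isolation_level=None so that the explicit
    BEGIN IMMEDIATE / COMMIT statements control the transaction.
    """
    conn.execute("BEGIN IMMEDIATE")  # only one writer can be in here at a time
    try:
        row = conn.execute(
            "SELECT id FROM tasks WHERE claimed_by IS NULL ORDER BY id LIMIT 1"
        ).fetchone()
        if row is not None:
            conn.execute(
                "UPDATE tasks SET claimed_by = ? WHERE id = ?", (agent_id, row[0])
            )
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
    return None if row is None else row[0]


def wait_for_task(conn, agent_id, interval=1.0, max_polls=10):
    """Poll at a fixed interval until a task is claimed or max_polls is reached."""
    for _ in range(max_polls):
        task_id = poll(conn, agent_id)
        if task_id is not None:
            return task_id
        time.sleep(interval)
    return None
```

Each agent runs the same loop in its own process; no agent needs to know about the others, which is what makes polling work across process boundaries.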
Where We Diverge
1. Write Locking Strategy
| System | Approach | Trade-off |
|---|---|---|
| Ours | flock on JSONL file | Simple, prevents interleaving, works locally |
| Beads | SQLite BEGIN IMMEDIATE | Stronger guarantees, more complex |
| Tissue | None (trust git merge) | Simplest, but can corrupt JSONL mid-write |
Our rationale: flock is simpler than SQLite transactions and safer than trusting git merge for mid-write crashes. Tissue's approach assumes writes complete atomically, which isn't guaranteed for large JSON lines.
2. Crash Safety
| System | Approach |
|---|---|
| Ours | Write to staging → validate → append under lock → delete staging |
| Beads | SQLite transactions (rollback on failure) |
| Tissue | Git recovery (implicit) |
Our rationale: A staging directory adds explicit crash recovery without SQLite complexity. If an agent dies mid-write, the staged file is recovered on restart.
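The staging sequence (write to staging, validate, append under lock, delete staging) could look roughly like the following Python sketch. The function names, file layout, and the choice to discard unparseable staged files are assumptions made for illustration; `fcntl.flock` is Unix-only.

```python
import fcntl
import json
import os
import uuid


def append_message(log_path: str, staging_dir: str, message: dict) -> None:
    os.makedirs(staging_dir, exist_ok=True)
    staged = os.path.join(staging_dir, uuid.uuid4().hex + ".json")
    line = json.dumps(message, separators=(",", ":"))

    # 1. Write to staging and force it to disk.
    with open(staged, "w", encoding="utf-8") as f:
        f.write(line)
        f.flush()
        os.fsync(f.fileno())

    # 2. Validate: the staged file must parse as JSON.
    with open(staged, encoding="utf-8") as f:
        json.loads(f.read())

    # 3. Append under an exclusive lock so concurrent lines cannot interleave.
    with open(log_path, "a", encoding="utf-8") as log:
        fcntl.flock(log, fcntl.LOCK_EX)
        try:
            log.write(line + "\n")
            log.flush()
            os.fsync(log.fileno())
        finally:
            fcntl.flock(log, fcntl.LOCK_UN)

    # 4. Delete the staged copy once the append is durable.
    os.remove(staged)


def recover_staged(log_path: str, staging_dir: str) -> int:
    """On restart, replay staged files left behind by a crash."""
    if not os.path.isdir(staging_dir):
        return 0
    recovered = 0
    for name in sorted(os.listdir(staging_dir)):
        path = os.path.join(staging_dir, name)
        try:
            with open(path, encoding="utf-8") as f:
                message = json.loads(f.read())
        except ValueError:
            os.remove(path)  # assumption: a partial staged write is discarded
            continue
        append_message(log_path, staging_dir, message)
        os.remove(path)
        recovered += 1
    return recovered
```

One caveat of this sketch: a crash between steps 3 and 4 causes the message to be appended again on recovery, so readers would need to deduplicate by message id.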
3. Heartbeats / Liveness
| System | Approach |
|---|---|
| Ours | Mandatory heartbeats every 10s, timeout detection |
| Beads | Background daemon (no explicit heartbeats) |
| Tissue | None |
Our rationale: LLM API calls can hang indefinitely. Without heartbeats, a stuck agent blocks tasks forever. Beads/Tissue are issue trackers, not real-time coordination systems.
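Timeout detection over heartbeat timestamps reduces to a single query. The Python sketch below is illustrative only: the `heartbeats` table, the helper names, and the 30-second staleness threshold (three missed 10s beats) are assumptions, since the source only fixes the 10s interval.

```python
import sqlite3
import time

HEARTBEAT_INTERVAL_S = 10  # from the design: one beat every 10s
STALE_AFTER_S = 30         # assumption: three missed beats means "stuck"


def init_heartbeats(conn: sqlite3.Connection) -> None:
    conn.execute(
        "CREATE TABLE IF NOT EXISTS heartbeats ("
        " agent_id TEXT PRIMARY KEY, last_seen REAL NOT NULL)"
    )


def beat(conn: sqlite3.Connection, agent_id: str, now=None) -> None:
    """Record that agent_id is alive; one row per agent, overwritten each beat."""
    now = time.time() if now is None else now
    conn.execute(
        "INSERT OR REPLACE INTO heartbeats (agent_id, last_seen) VALUES (?, ?)",
        (agent_id, now),
    )
    conn.commit()


def stale_agents(conn: sqlite3.Connection, now=None) -> list:
    """Return agents whose last heartbeat is older than STALE_AFTER_S."""
    now = time.time() if now is None else now
    rows = conn.execute(
        "SELECT agent_id FROM heartbeats WHERE last_seen < ? ORDER BY agent_id",
        (now - STALE_AFTER_S,),
    )
    return [r[0] for r in rows]
```

A supervisor can call `stale_agents` on its own polling cycle and release or fail the tasks held by any agent it returns.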
4. Large Payload Handling
| System | Approach |
|---|---|
| Ours | Blob storage with content-addressable hashing |
| Beads | Compaction (summarize old tasks) |
| Tissue | Not addressed |
Our rationale: Code diffs and agent outputs can be large. Blob storage keeps the log scannable. Beads' compaction is for context windows, not payload size.
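As an illustration of content-addressable blob storage, the following Python sketch keeps small payloads inline and moves anything over 4KB into a file named by its hash. SHA-256, the reference format (`{"inline": ...}` / `{"blob": ...}`), and the helper names are assumptions; payloads are assumed to be UTF-8 text.

```python
import hashlib
import os

BLOB_THRESHOLD = 4 * 1024  # payloads larger than 4KB go to blob storage


def store_payload(blob_dir: str, payload: bytes) -> dict:
    """Return a reference that is small enough to embed in a message."""
    if len(payload) <= BLOB_THRESHOLD:
        return {"inline": payload.decode("utf-8")}
    digest = hashlib.sha256(payload).hexdigest()
    os.makedirs(blob_dir, exist_ok=True)
    path = os.path.join(blob_dir, digest)
    if not os.path.exists(path):  # identical content is stored only once
        tmp = path + ".tmp"
        with open(tmp, "wb") as f:
            f.write(payload)
        os.replace(tmp, path)  # atomic rename: readers never see a partial blob
    return {"blob": digest}


def load_payload(blob_dir: str, ref: dict) -> bytes:
    if "inline" in ref:
        return ref["inline"].encode("utf-8")
    with open(os.path.join(blob_dir, ref["blob"]), "rb") as f:
        data = f.read()
    if hashlib.sha256(data).hexdigest() != ref["blob"]:
        raise ValueError("blob content does not match its hash")
    return data
```

Since the file name is the hash, the same diff sent twice costs one blob, and a corrupted blob is detectable on read.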
5. Message Schema
| System | Schema Type |
|---|---|
| Ours | Explicit message schema (id, ts, from, to, type, payload) |
| Beads | Issue-centric (tasks with dependencies, audit trail) |
| Tissue | Issue-centric (similar to Beads) |
Our rationale: We need general message passing (state changes, heartbeats, claims), not just issue tracking. Beads/Tissue are issue trackers first; we're building coordination primitives.
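The explicit schema (id, ts, from, to, type, payload) might be modelled as below. This Python sketch is an assumption about shape only: the field types, the defaults, and the use of `None` in `to` for a broadcast are not specified by the source. The `from` field is named `sender` in code because `from` is a Python keyword.

```python
import json
import time
import uuid
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass(frozen=True)
class Message:
    sender: str                 # serialized as "from"
    type: str                   # e.g. "state_change", "heartbeat", "claim"
    payload: Any = None
    to: Optional[str] = None    # assumption: None means broadcast
    id: str = field(default_factory=lambda: uuid.uuid4().hex)
    ts: float = field(default_factory=time.time)

    def to_json(self) -> str:
        return json.dumps({
            "id": self.id,
            "ts": self.ts,
            "from": self.sender,
            "to": self.to,
            "type": self.type,
            "payload": self.payload,
        })

    @classmethod
    def from_json(cls, line: str) -> "Message":
        d = json.loads(line)
        return cls(
            sender=d["from"],
            to=d.get("to"),
            type=d["type"],
            payload=d.get("payload"),
            id=d["id"],
            ts=d["ts"],
        )
```

A heartbeat and a state change then differ only in `type` and `payload`, which is the point of having one general envelope instead of issue-shaped records.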
Gaps in Our Design (Learned from Beads)
1. Hash-Based IDs for Merge Safety
Beads uses hash-based IDs (e.g., bd-a1b2) to prevent merge collisions. We should consider this for message IDs if multiple agents might create messages offline and merge later.
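One way to derive such IDs is to hash a canonical form of the message, as in this Python sketch. It is not Beads' actual algorithm; the `msg-` prefix, the hashed fields, and the 8-character length are assumptions chosen for illustration.

```python
import hashlib
import json


def message_id(sender: str, ts: float, payload, length: int = 8) -> str:
    """Derive a short ID from message content instead of a shared counter."""
    # sort_keys makes the encoding canonical, so key order cannot change the ID.
    canonical = json.dumps(
        {"from": sender, "ts": ts, "payload": payload},
        sort_keys=True,
        separators=(",", ":"),
    )
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return "msg-" + digest[:length]
```

Two agents working offline cannot collide on a counter value because there is no counter. The trade-off is that a short prefix has a small but non-zero collision probability, so `length` should grow with the expected number of messages.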
2. Dirty Tracking for Incremental Export
Beads tracks "dirty" issues for efficient JSONL export. When we add SQLite cache, we should track which messages need re-export rather than full rescans.
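A simple form of dirty tracking is a per-row flag that the exporter clears. The Python sketch below shows the idea under assumed names (`messages` table, `dirty` column, `export_dirty`); it does not reproduce how Beads implements it.

```python
import json
import sqlite3


def init_messages(conn: sqlite3.Connection) -> None:
    # New rows start dirty (dirty = 1) until they have been exported.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS messages ("
        " id TEXT PRIMARY KEY, body TEXT NOT NULL,"
        " dirty INTEGER NOT NULL DEFAULT 1)"
    )


def export_dirty(conn: sqlite3.Connection, jsonl_path: str) -> int:
    """Append only not-yet-exported rows to the JSONL file; return the count."""
    rows = conn.execute(
        "SELECT id, body FROM messages WHERE dirty = 1 ORDER BY rowid"
    ).fetchall()
    if not rows:
        return 0
    with open(jsonl_path, "a", encoding="utf-8") as out:
        for msg_id, body in rows:
            out.write(json.dumps({"id": msg_id, "body": body}) + "\n")
    # Clear the flag only after the lines have been written.
    conn.executemany(
        "UPDATE messages SET dirty = 0 WHERE id = ?", [(r[0],) for r in rows]
    )
    conn.commit()
    return len(rows)
```

Each export touches only rows written since the previous one, so its cost tracks the write rate rather than the size of the table.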
3. File Hash Validation
Beads stores a hash of the JSONL file to detect external modifications. We could add this to detect corruption or manual edits.
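A startup check of this kind can be sketched in a few lines of Python. SHA-256, the sidecar hash file, and the function names are assumptions for illustration, not a description of what Beads does internally.

```python
import hashlib
import os


def file_sha256(path: str, chunk_size: int = 65536) -> str:
    """Hash the file in chunks so large logs do not need to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def record_hash(log_path: str, hash_path: str) -> None:
    """Call after every write made through the tool itself."""
    with open(hash_path, "w", encoding="ascii") as f:
        f.write(file_sha256(log_path))


def log_unmodified(log_path: str, hash_path: str) -> bool:
    """Return False if the log changed since the last recorded hash."""
    if not os.path.exists(hash_path):
        return False
    with open(hash_path, encoding="ascii") as f:
        return f.read().strip() == file_sha256(log_path)
```

A mismatch does not say whether the cause was corruption or a manual edit, only that the file no longer matches what the tool last wrote.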
Gaps in Our Design (Learned from Tissue)
1. FTS5 Full-Text Search
Tissue's SQLite cache includes FTS5 for searching issue content. Useful for "find messages mentioning X" queries in Phase 2.
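A "find messages mentioning X" query with FTS5 could look like the Python sketch below. It assumes the SQLite build in use has FTS5 compiled in (common, but not guaranteed); the `messages_fts` table and helper names are hypothetical and unrelated to Tissue's actual schema.

```python
import sqlite3


def build_index(conn: sqlite3.Connection) -> None:
    # "id UNINDEXED" stores the id alongside the text without tokenizing it.
    conn.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS messages_fts"
        " USING fts5(id UNINDEXED, body)"
    )


def index_message(conn: sqlite3.Connection, msg_id: str, body: str) -> None:
    conn.execute(
        "INSERT INTO messages_fts (id, body) VALUES (?, ?)", (msg_id, body)
    )
    conn.commit()


def search(conn: sqlite3.Connection, query: str) -> list:
    """Return ids of messages matching an FTS5 query, best match first."""
    rows = conn.execute(
        "SELECT id FROM messages_fts WHERE messages_fts MATCH ? ORDER BY rank",
        (query,),
    )
    return [r[0] for r in rows]
```

The `query` argument is interpreted as FTS5 query syntax, so raw user input containing quotes or operators would need escaping before being passed in.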
2. Simpler Concurrency (Maybe)
Tissue trusts git merge without explicit locking. For single-machine scenarios with small writes, this might be sufficient. We could offer a "simple mode" without flock for low-contention cases.
Validation Verdict
Our design is more complex than Tissue but simpler than Beads, which matches our use case:
- Tissue: Issue tracker, optimizes for git collaboration
- Beads: Full workflow engine with daemon, RPC, recipes
- Ours: Coordination primitives for multi-agent coding
The key additions we make (heartbeats, blob storage, staging directory) are justified by our real-time coordination requirements that issue trackers don't have.
Recommended Updates to Design
- Add hash-based message IDs - Prevent merge collisions if agents work offline
- Add file hash validation - Detect log corruption on startup
- Document "simple mode" - No flock for single-agent or low-contention scenarios
- Plan for FTS5 - Add to Phase 2 SQLite cache design
References
- Beads source: https://github.com/steveyegge/beads
- Tissue source: https://github.com/evil-mind-evil-sword/tissue
- Our design: docs/design/message-passing-layer.md