# Message Passing: Design Comparison

**Purpose**: Compare our design decisions against Beads and Tissue to validate approach and identify gaps.

## Summary Comparison

| Decision | Our Design (v2) | Beads | Tissue |
|----------|-----------------|-------|--------|
| **Primary storage** | SQLite (WAL mode) | SQLite + JSONL export | JSONL (append-only) |
| **Cache/index** | N/A (SQLite is primary) | SQLite is primary | SQLite (derived) |
| **Write locking** | SQLite BEGIN IMMEDIATE | SQLite BEGIN IMMEDIATE | None (git merge) |
| **Concurrency model** | SQLite transactions | Optimistic (hash IDs) + SQLite txn | Optimistic (git merge) |
| **Crash safety** | SQLite atomic commit | SQLite transactions | Git (implicit) |
| **Heartbeats** | Yes (10s interval) | No (daemon only) | No |
| **Liveness detection** | SQL query on heartbeat timestamps | Not documented | Not documented |
| **Large payloads** | Blob storage (>4KB) | Compaction/summarization | Not addressed |
| **Coordination** | Polling + claim-check | `bd ready` queries | `tissue ready` queries |
| **Message schema** | Explicit (id, ts, from, type, payload) | Implicit (issue events) | Implicit (issue events) |
| **Human debugging** | JSONL export (read-only) | JSONL in git | JSONL primary |

**Decision (2026-01-10)**: After orch consensus with 3 models, we aligned with Beads' approach (SQLite primary) over Tissue's (JSONL primary). Key factors:

- Payloads 1-50KB exceed POSIX atomic write guarantees (~4KB)
- Crash mid-write with flock still corrupts log
- SQLite transactions provide true atomicity
- JSONL export preserves human debugging (`tail -f`)

## Detailed Analysis

### Where We Align

**1. JSONL as Source of Truth**

All three systems use append-only JSONL as the authoritative store. This is the right call:

- Git-friendly (merges cleanly)
- Human-readable (debuggable with `cat | jq`)
- Simple to implement

**2. 
SQLite as Derived Cache**

All three use SQLite for queries, not as primary storage:

- Beads: Always-on cache with dirty tracking
- Tissue: Derived index, gitignored
- Ours: Phase 2 optimization

**3. Pull-Based Coordination**

All use polling/queries rather than push events:

- `bd ready` / `tissue ready` / our `poll()` function
- Simpler than event-driven, works across process boundaries

### Where We Diverge

**1. Write Locking Strategy**

| System | Approach | Trade-off |
|--------|----------|-----------|
| **Ours** | flock on JSONL file | Simple, prevents interleaving, works locally |
| **Beads** | SQLite BEGIN IMMEDIATE | Stronger guarantees, more complex |
| **Tissue** | None (trust git merge) | Simplest, but can corrupt JSONL mid-write |

**Our rationale**: flock is simpler than SQLite transactions and safer than trusting git merge for mid-write crashes. Tissue's approach assumes writes complete atomically, which isn't guaranteed for large JSON lines.

**2. Crash Safety**

| System | Approach |
|--------|----------|
| **Ours** | Write to staging → validate → append under lock → delete staging |
| **Beads** | SQLite transactions (rollback on failure) |
| **Tissue** | Git recovery (implicit) |

**Our rationale**: Staging directory adds explicit crash recovery without SQLite complexity. If agent dies mid-write, staged file is recovered on restart.

**3. Heartbeats / Liveness**

| System | Approach |
|--------|----------|
| **Ours** | Mandatory heartbeats every 10s, timeout detection |
| **Beads** | Background daemon (no explicit heartbeats) |
| **Tissue** | None |

**Our rationale**: LLM API calls can hang indefinitely. Without heartbeats, a stuck agent blocks tasks forever. Beads/Tissue are issue trackers, not real-time coordination systems.

**4. 
Large Payload Handling**

| System | Approach |
|--------|----------|
| **Ours** | Blob storage with content-addressable hashing |
| **Beads** | Compaction (summarize old tasks) |
| **Tissue** | Not addressed |

**Our rationale**: Code diffs and agent outputs can be large. Blob storage keeps the log scannable. Beads' compaction is for context windows, not payload size.

**5. Message Schema**

| System | Schema Type |
|--------|-------------|
| **Ours** | Explicit message schema (id, ts, from, to, type, payload) |
| **Beads** | Issue-centric (tasks with dependencies, audit trail) |
| **Tissue** | Issue-centric (similar to Beads) |

**Our rationale**: We need general message passing (state changes, heartbeats, claims), not just issue tracking. Beads/Tissue are issue trackers first; we're building coordination primitives.

### Gaps in Our Design (Learned from Beads)

**1. Hash-Based IDs for Merge Safety**

Beads uses hash-based IDs (e.g., `bd-a1b2`) to prevent merge collisions. We should consider this for message IDs if multiple agents might create messages offline and merge later.

**2. Dirty Tracking for Incremental Export**

Beads tracks "dirty" issues for efficient JSONL export. When we add SQLite cache, we should track which messages need re-export rather than full rescans.

**3. File Hash Validation**

Beads stores JSONL file hash to detect external modifications. We could add this to detect corruption or manual edits.

### Gaps in Our Design (Learned from Tissue)

**1. FTS5 Full-Text Search**

Tissue's SQLite cache includes FTS5 for searching issue content. Useful for "find messages mentioning X" queries in Phase 2.

**2. Simpler Concurrency (Maybe)**

Tissue trusts git merge without explicit locking. For single-machine scenarios with small writes, this might be sufficient. We could offer a "simple mode" without flock for low-contention cases.
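
Two of the gaps listed above, hash-based message IDs and file hash validation, are small enough to sketch. The following is an illustrative Python sketch only, not the scheme Beads uses and not part of our current design: the function names (`message_id`, `file_sha256`, `export_changed`), the `msg-` prefix, and the 8-character ID length are all assumptions made for this example.

```python
import hashlib
import json
from pathlib import Path


def message_id(msg: dict, prefix: str = "msg", length: int = 8) -> str:
    """Derive a short ID from message content (hypothetical scheme).

    Identical content always yields the same ID, so two agents that
    create messages offline cannot collide unless the messages are
    byte-for-byte equivalent (in which case deduplication is harmless).
    """
    # Canonical JSON: sorted keys and fixed separators make the hash
    # independent of dict ordering and whitespace.
    canonical = json.dumps(msg, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"{prefix}-{digest[:length]}"


def file_sha256(path: Path) -> str:
    """Hash a file in chunks so large exports are not loaded at once."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def export_changed(path: Path, recorded_hash: str) -> bool:
    """Return True if the file no longer matches the hash we recorded."""
    return file_sha256(path) != recorded_hash
```

A caller would record `file_sha256(path)` after each export and call `export_changed(path, recorded)` on startup to detect corruption or manual edits. The ID should be computed over fields that distinguish messages (for example `ts` and `from`); a short truncated digest trades collision resistance for readability, so the length would need to be chosen against expected message volume.
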
## Validation Verdict

Our design is **more complex than Tissue but simpler than Beads**, which matches our use case:

- **Tissue**: Issue tracker, optimizes for git collaboration
- **Beads**: Full workflow engine with daemon, RPC, recipes
- **Ours**: Coordination primitives for multi-agent coding

The key additions we make (heartbeats, blob storage, staging directory) are justified by our real-time coordination requirements that issue trackers don't have.

## Recommended Updates to Design

1. **Add hash-based message IDs** - Prevent merge collisions if agents work offline
2. **Add file hash validation** - Detect log corruption on startup
3. **Document "simple mode"** - No flock for single-agent or low-contention scenarios
4. **Plan for FTS5** - Add to Phase 2 SQLite cache design

## References

- Beads source: https://github.com/steveyegge/beads
- Tissue source: https://github.com/evil-mind-evil-sword/tissue
- Our design: docs/design/message-passing-layer.md