Key findings from gemini --websearch:
- Manager-Worker orchestration (Maestro pattern)
- alice/idle adversarial review gates (emes)
- Git-as-state for agent coordination
- tissue for machine-first issue tracking
- Circuit breakers: semantic drift, three-strike, budget limits
- Sandboxing: Wasm and Docker playgrounds

Validates our direction: beads, orch, file-based coordination.
Gaps: orchestrator-enforced gates, agent messaging, sandboxing.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Cross-Agent Patterns: Web Research Synthesis
Date: 2026-01-09
Method: orch web research (gemini --websearch)
Related: skills-hf1 (Cross-agent epic)
Executive Summary
Multi-agent AI coding has shifted from single "do it all" agents to specialized agent teams with distinct roles. Key patterns emerging in 2025-2026:
- Manager-Worker orchestration - Central agent delegates to specialists
- Adversarial review gates - Separate reviewer blocks completion until approved
- Git-as-state - Repository is source of truth for agent coordination
- Machine-first issue tracking - File-based, no API needed (tissue)
- Circuit breakers - Prevent infinite loops via semantic drift detection, a three-strike rule, and token budgets
1. Multi-Agent Orchestration Patterns
The "Maestro" Pattern (Centralized Orchestration)
A central Manager agent breaks down features into atomic tasks and delegates to Workers.
Workflow:
- Manager (e.g., Claude Sonnet): Reads SPEC.md, creates plan.md
- Worker A (e.g., Gemini CLI): "Scaffold database schema" - writes code, tests, reports back
- Worker B (e.g., OpenCode): "Write API endpoints" - waits for A, then executes
Tooling: Maestro CLI orchestrates Claude Code, Codex, local LLMs via orchestrator.md.
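The delegation flow above can be sketched as a small dependency-ordered dispatcher. This is a minimal illustration, not the Maestro CLI's API: the `Task` shape, the worker labels, and `dispatch_order` are hypothetical names introduced here.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter


@dataclass
class Task:
    """One atomic unit of work the manager hands to a worker (hypothetical shape)."""
    name: str
    worker: str
    depends_on: list[str] = field(default_factory=list)


def dispatch_order(tasks: list[Task]) -> list[str]:
    """Return task names in an order that respects every declared dependency."""
    graph = {t.name: set(t.depends_on) for t in tasks}
    return list(TopologicalSorter(graph).static_order())


# Worker B's task waits for Worker A's, as in the workflow above.
plan = [
    Task("write-api-endpoints", worker="worker-b", depends_on=["scaffold-db-schema"]),
    Task("scaffold-db-schema", worker="worker-a"),
]
order = dispatch_order(plan)
```

A real manager would also need to handle worker failure and re-planning; the sketch only covers ordering.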
The "Inspector-Fixer" Loop (Dual-Agent)
Strict separation: one agent writes, one agent finds faults. Never switch roles.
- Fixer (Claude Code): Monitors todo.md, implements tasks, moves them to done.md
- Inspector (Gemini Pro): Scans codebase, appends issues to todo.md (never fixes)
Key insight: Inspector creates work for Fixer, not the other way around.
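A minimal sketch of the two roles as file operations, assuming one bullet per task in todo.md and done.md. The function names and the sample issue text are hypothetical; the agent calls that would find or fix issues are left out.

```python
import tempfile
from pathlib import Path
from typing import Optional


def inspector_report(todo: Path, issue: str) -> None:
    """Inspector role: append an issue to todo.md and never touch the code."""
    with todo.open("a", encoding="utf-8") as f:
        f.write(f"- {issue}\n")


def fixer_complete_next(todo: Path, done: Path) -> Optional[str]:
    """Fixer role: move the oldest item from todo.md to done.md.

    The actual fix would happen before the move; it is omitted here.
    """
    lines = todo.read_text(encoding="utf-8").splitlines() if todo.exists() else []
    if not lines:
        return None
    first, rest = lines[0], lines[1:]
    todo.write_text("".join(line + "\n" for line in rest), encoding="utf-8")
    with done.open("a", encoding="utf-8") as f:
        f.write(first + "\n")
    return first[2:]  # strip the "- " bullet prefix


workdir = Path(tempfile.mkdtemp())
todo, done = workdir / "todo.md", workdir / "done.md"
inspector_report(todo, "Unchecked None in parse_config")
inspector_report(todo, "Missing test for empty input")
taken = fixer_complete_next(todo, done)
```

Because only the Inspector appends and only the Fixer removes, the direction of work stays one-way, which is the point of the pattern.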
2. Adversarial Quality Gates
The alice/idle Pattern (emes)
From evil-mind-evil-sword/idle - review gate for autonomous agents.
Mechanism:
- Stop Hook: Intercepts agent's exit/finish signal
- Idle State: Forces mandatory pause before completion
- Alice (Reviewer): Read-only Opus agent inspects work
- Blocking: If Alice finds issues → creates tissue issues → main agent loops back
- Exit: Only when Alice returns COMPLETE
Critical design: Enforcement is external to worker's prompt - system-level intervention.
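The gate can be sketched as a loop that lives outside the worker, so the worker cannot talk its way past it. This is an illustration under assumptions, not the idle implementation: `run_with_gate`, the callable signatures, and the `max_rounds` cap are hypothetical, and an empty issue list stands in for COMPLETE.

```python
from typing import Callable

Worker = Callable[[list[str]], str]    # takes reviewer feedback, returns a work product
Reviewer = Callable[[str], list[str]]  # returns open issues; an empty list means COMPLETE


def run_with_gate(worker: Worker, reviewer: Reviewer, max_rounds: int = 5) -> tuple[str, int]:
    """Loop the worker until the reviewer reports no issues or the rounds run out."""
    feedback: list[str] = []
    for round_no in range(1, max_rounds + 1):
        work = worker(feedback)
        feedback = reviewer(work)
        if not feedback:
            return work, round_no
    raise RuntimeError(f"review gate still blocked after {max_rounds} rounds")


# Stand-in agents: the worker only adds tests after being told to.
def demo_worker(feedback: list[str]) -> str:
    return "patch with tests" if feedback else "patch"


def demo_reviewer(work: str) -> list[str]:
    return [] if "tests" in work else ["add tests"]


result, rounds = run_with_gate(demo_worker, demo_reviewer)
```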
Red Team Pre-Commit Hook
Before commit, code passes through hostile reviewer from different model family:
git diff | llm --model o1-preview --system "You are a hostile security auditor. Block this commit if you find any logic gaps."
If rejected, original agent receives critique and retries.
Test-Driven Agent Development (TDAD)
Agents forbidden from writing implementation until failing test exists:
- Agent generates test_feature.py
- System executes → asserts FAILURE
- Agent generates feature.py
- System executes → asserts SUCCESS
Prevents "placebo tests" that always pass regardless of logic.
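The red/green check can be sketched in-process, assuming the test is a callable that receives the implementation under test. A real system would execute test_feature.py and feature.py as separate files; `tdad_gate` and the sample functions are hypothetical.

```python
from typing import Callable

Impl = Callable[[int, int], int]
Test = Callable[[Impl], None]


def passes(test: Test, impl: Impl) -> bool:
    """Run a test against an implementation; any exception counts as failure."""
    try:
        test(impl)
        return True
    except Exception:
        return False


def tdad_gate(test: Test, stub: Impl, impl: Impl) -> bool:
    """Accept only if the test fails on the stub (red) and passes on the real code (green)."""
    return (not passes(test, stub)) and passes(test, impl)


def stub_add(a: int, b: int) -> int:
    raise NotImplementedError


def real_add(a: int, b: int) -> int:
    return a + b


def good_test(add: Impl) -> None:
    assert add(2, 3) == 5


def placebo_test(add: Impl) -> None:
    assert True  # never exercises the code under test
```

The placebo test is rejected because it passes even against the stub, so the red step never happens.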
3. Circuit Breaker Patterns
Semantic Drift Detection
Use embedding model to check similarity of agent's thought trace:
- If last 3 thoughts are >95% similar → circuit breaks
- Forces strategy shift, not just retry
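A minimal sketch of the similarity check, assuming thought embeddings are already available as plain float vectors. The embedding model itself is out of scope, and requiring every pair in the window to exceed the threshold is one possible reading of "last 3 thoughts are >95% similar".

```python
import math
from itertools import combinations


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def is_looping(embeddings: list[list[float]], window: int = 3, threshold: float = 0.95) -> bool:
    """True when every pair among the last `window` embeddings exceeds the threshold."""
    recent = embeddings[-window:]
    if len(recent) < window:
        return False
    return all(cosine(a, b) > threshold for a, b in combinations(recent, 2))
```

When `is_looping` returns True, the caller would inject a strategy change instead of retrying.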
Three-Strike Tool Rule
Same error message 3 times → inject system prompt:
"You are stuck. You must try a different tool or approach. Do not retry the previous action."
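One way to sketch the rule is a counter over consecutive identical error messages. Treating the three occurrences as consecutive, and resetting on success, are assumptions; `ThreeStrikeBreaker` is a hypothetical name.

```python
from typing import Optional

STUCK_PROMPT = (
    "You are stuck. You must try a different tool or approach. "
    "Do not retry the previous action."
)


class ThreeStrikeBreaker:
    """Counts consecutive identical error messages and trips on the third."""

    def __init__(self, limit: int = 3) -> None:
        self.limit = limit
        self.last_error: Optional[str] = None
        self.count = 0

    def record(self, error: Optional[str]) -> Optional[str]:
        """Record a tool result; return the intervention prompt when the breaker trips."""
        if error is None:  # a successful call resets the streak
            self.last_error, self.count = None, 0
            return None
        if error == self.last_error:
            self.count += 1
        else:
            self.last_error, self.count = error, 1
        if self.count >= self.limit:
            self.last_error, self.count = None, 0  # start a fresh streak after tripping
            return STUCK_PROMPT
        return None


breaker = ThreeStrikeBreaker()
results = [breaker.record("FileNotFoundError: config.yaml") for _ in range(3)]
```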
Budget-Based Interrupts
Replace step limits with token budgets per sub-task:
- If 50% budget burned on first step of 5-step plan → pause
- Request plan refinement or human intervention
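The budget check can be sketched as a comparison of budget burn against plan progress. The `slack` parameter and its default are assumptions chosen so that the example above (50% burned after step 1 of 5) trips the interrupt.

```python
def should_pause(tokens_spent: int, budget: int, steps_done: int,
                 total_steps: int, slack: float = 0.25) -> bool:
    """Pause when budget burn runs ahead of plan progress by more than `slack`."""
    if budget <= 0 or total_steps <= 0:
        raise ValueError("budget and total_steps must be positive")
    burn = tokens_spent / budget          # fraction of the token budget used
    progress = steps_done / total_steps   # fraction of the plan completed
    return burn > progress + slack


# 50% of the budget gone after the first of five steps.
pause_now = should_pause(tokens_spent=5_000, budget=10_000, steps_done=1, total_steps=5)
```

A True result would hand control back for plan refinement or human intervention, not abort the task outright.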
4. State Management (Non-MCP Alternatives)
Git-as-State
Repository is source of truth:
- Coordination: Agent A writes file, Agent B reads diff
- State: Current HEAD of branch
- History: Git log = immutable, replayable decision history
jwz (Agent Messaging)
From emes - lightweight email-thread-style messaging:
- Async communication between agents
- Preserves threading for topic separation
- Does not clutter the main context window
The "Postbox" File Pattern
Agents share state via filesystem:
- Standard: .ai/context.md or claudecode.md
- Agents dump thought process, obstacles, decisions before signing off
- Next agent reads to "load" state
Tooling: Vibe Kanban - markdown board multiple agents read/write to.
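A minimal sketch of the postbox handoff using .ai/context.md. The section headings inside the file and the function names are hypothetical; the pattern as described only fixes the file location.

```python
import tempfile
from pathlib import Path


def write_handoff(repo: Path, agent: str, decisions: list[str], obstacles: list[str]) -> Path:
    """Write the outgoing agent's notes to .ai/context.md before it signs off."""
    path = repo / ".ai" / "context.md"
    path.parent.mkdir(parents=True, exist_ok=True)
    lines = [f"# Handoff from {agent}", "", "## Decisions"]
    lines += [f"- {d}" for d in decisions]
    lines += ["", "## Obstacles"]
    lines += [f"- {o}" for o in obstacles]
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return path


def load_handoff(repo: Path) -> str:
    """The next agent 'loads' state by reading the file; empty string if none exists."""
    path = repo / ".ai" / "context.md"
    return path.read_text(encoding="utf-8") if path.exists() else ""


repo = Path(tempfile.mkdtemp())
write_handoff(repo, "worker-a", ["Use SQLite for local state"], ["Migration test is flaky"])
notes = load_handoff(repo)
```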
Git Worktrees for Isolation
Prevent agents overwriting each other in real-time:
- Agent A: worktree/feature-login
- Agent B: worktree/refactor-db
- Merge via PRs with standard conflict resolution
5. Machine-First Issue Tracking
tissue (emes)
From evil-mind-evil-sword/tissue - headless issue tracker for agents.
Design:
- File-based: Issues stored as JSON/Markdown in .tissue/
- No API: Agents use standard file tools (read, write)
- Git-native: Issues version-controlled with code
- Branch-aware: Issue state branches with code branches
Workflow:
- Alice: tissue new --title "Memory Leak" → .tissue/issues/1.json
- Worker reads issue, fixes code
- Worker updates JSON status to resolved
- Alice verifies, merges branch (code fix + closed issue)
Why this matters: No desync between issue tracker and code state.
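The workflow above can be approximated with plain file operations. This sketch does not call the tissue CLI, and the JSON fields (`id`, `title`, `status`) are an assumed schema for illustration, not tissue's actual format.

```python
import json
import tempfile
from pathlib import Path


def new_issue(repo: Path, title: str) -> Path:
    """Create the next numbered issue file under .tissue/issues/ (schema is assumed)."""
    issues = repo / ".tissue" / "issues"
    issues.mkdir(parents=True, exist_ok=True)
    next_id = len(list(issues.glob("*.json"))) + 1
    path = issues / f"{next_id}.json"
    record = {"id": next_id, "title": title, "status": "open"}
    path.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return path


def set_status(path: Path, status: str) -> None:
    """Update the issue in place so the change can be committed with the code fix."""
    issue = json.loads(path.read_text(encoding="utf-8"))
    issue["status"] = status
    path.write_text(json.dumps(issue, indent=2), encoding="utf-8")


repo = Path(tempfile.mkdtemp())
issue_path = new_issue(repo, "Memory Leak")   # reviewer files the issue
set_status(issue_path, "resolved")            # worker closes it alongside the fix
issue = json.loads(issue_path.read_text(encoding="utf-8"))
```

Since the issue file sits in the same tree as the code, a single commit can carry both the fix and the status change.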
6. Sandboxing & Security
Wasm Sandboxing (Browser-style)
Tools like Pyodide run agent-generated code in a WebAssembly sandbox:
- No host OS filesystem access unless explicitly granted
- NVIDIA uses this approach for data visualization code
Docker Playgrounds
Agent runs in disposable container, not user's shell:
- Commands trapped and executed in isolation
- Permission scoping: read/write only in project directory
- Read-only access to rest of system
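A sketch of how an orchestrator might build the container command. It only constructs the `docker run` argument list and does not execute it; the image name, the /workspace mount point, and disabling the network are assumptions added here. Note that `--read-only` applies to the container's root filesystem, with the project mount as the only writable path.

```python
from pathlib import Path


def sandbox_argv(project_dir: Path, command: list[str],
                 image: str = "python:3.12-slim") -> list[str]:
    """Build a `docker run` argv for a disposable container scoped to one project."""
    project = project_dir.resolve()
    return [
        "docker", "run",
        "--rm",                         # discard the container when the command exits
        "--network", "none",            # no network access (an added assumption)
        "--read-only",                  # container root filesystem is read-only
        "-v", f"{project}:/workspace",  # project directory is the only writable mount
        "-w", "/workspace",
        image,
        *command,
    ]


argv = sandbox_argv(Path("."), ["python", "-m", "pytest", "-q"])
```

The resulting list can be passed to `subprocess.run(argv)` on a host with Docker installed. Commands that need scratch space may also require a `--tmpfs /tmp` entry.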
Recommended Stack (2025-2026)
Based on research, the "Golden Path" for multi-agent teams:
| Role | Tool | Notes |
|---|---|---|
| Orchestrator | Maestro or custom | Manages high-level plan |
| Primary Coder | Claude Code | Via MCP for tool access |
| Reviewer/QA | Gemini Pro or Opus | "Watchful Inspector" role |
| State | Git + shared docs | docs/arch.md, worktrees |
| Issues | tissue or beads | File-based, git-native |
| Messaging | jwz | Async agent communication |
Key Insights for Our Cross-Agent Work
What Aligns With Our Direction
| Pattern | Our Equivalent | Status |
|---|---|---|
| Git-as-state | beads (.beads/ in repo) | ✅ Have |
| Machine-first issues | beads, tissue | ✅ Have |
| File-based coordination | SKILL.md, AGENTS.md | ✅ Have |
| Multi-model consensus | orch | ✅ Have |
| Adversarial review | alice pattern | 🔄 Researching |
Gaps to Address
| Gap | Pattern to Adopt |
|---|---|
| Stop hook (Claude/Gemini only) | Orchestrator-enforced gate |
| Agent messaging | Consider jwz or build on beads |
| Sandbox for research agents | Docker/Wasm or OS-level |
| Circuit breakers | Semantic drift + three-strike |
Recommended Next Steps
- Prototype orchestrator pattern - Central agent enforces review gate
- Evaluate jwz - Could complement beads for transient state
- Implement circuit breakers - Semantic drift detection
- Sandbox research - Docker-based for research subagents
Sources
- orch web research via gemini --websearch
- GitHub: evil-mind-evil-sword/idle (alice pattern)
- GitHub: evil-mind-evil-sword/tissue (machine-first issues)
- Maestro CLI documentation
- Google ADK event streams
- Anthropic MCP specification