# Cross-Agent Patterns: Web Research Synthesis

> **Date:** 2026-01-09
> **Method:** orch web research (gemini --websearch)
> **Related:** [skills-hf1](../../.beads/) (Cross-agent epic)

## Executive Summary

Multi-agent AI coding has shifted from single "do it all" agents to **specialized agent teams** with distinct roles. Key patterns emerging in 2025-2026:

1. **Manager-Worker orchestration** - Central agent delegates to specialists
2. **Adversarial review gates** - Separate reviewer blocks completion until approved
3. **Git-as-state** - Repository is source of truth for agent coordination
4. **Machine-first issue tracking** - File-based, no API needed (tissue)
5. **Circuit breakers** - Prevent infinite loops via semantic drift detection

---

## 1. Multi-Agent Orchestration Patterns

### The "Maestro" Pattern (Centralized Orchestration)

A central Manager agent breaks down features into atomic tasks and delegates to Workers.

**Workflow:**
1. **Manager (e.g., Claude Sonnet):** Reads SPEC.md, creates plan.md
2. **Worker A (e.g., Gemini CLI):** "Scaffold database schema" - writes code, tests, reports back
3. **Worker B (e.g., OpenCode):** "Write API endpoints" - waits for A, then executes

**Tooling:** Maestro CLI orchestrates Claude Code, Codex, local LLMs via `orchestrator.md`.
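
The ordering constraint in the workflow above (Worker B waits for Worker A) can be sketched as a dependency graph that the Manager walks in topological order. This is a minimal sketch under stated assumptions, not Maestro's actual mechanism: `PLAN`, `run_plan`, and `stub_worker` are hypothetical names, and the stub workers stand in for real agent CLI calls.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import Callable, Dict, List

# Hypothetical plan: task name -> tasks that must finish first.
PLAN: Dict[str, List[str]] = {
    "scaffold_db_schema": [],
    "write_api_endpoints": ["scaffold_db_schema"],
}

Worker = Callable[[str, Dict[str, str]], str]


def run_plan(plan: Dict[str, List[str]], workers: Dict[str, Worker]) -> Dict[str, str]:
    """Run each task after its dependencies, passing upstream reports along."""
    reports: Dict[str, str] = {}
    for task in TopologicalSorter(plan).static_order():
        upstream = {dep: reports[dep] for dep in plan[task]}
        reports[task] = workers[task](task, upstream)
    return reports


def stub_worker(name: str) -> Worker:
    """Stand-in for a real agent call (assumption: agents return a text report)."""
    def worker(task: str, upstream: Dict[str, str]) -> str:
        return f"{name} finished {task} after {sorted(upstream)}"
    return worker


WORKERS = {
    "scaffold_db_schema": stub_worker("worker_a"),
    "write_api_endpoints": stub_worker("worker_b"),
}
```

Calling `run_plan(PLAN, WORKERS)` dispatches the schema task first and hands its report to the endpoint task.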

### The "Inspector-Fixer" Loop (Dual-Agent)

Strict separation: one agent writes, one agent finds faults. Never switch roles.

- **Fixer (Claude Code):** Monitors `todo.md`, implements tasks, moves to `done.md`
- **Inspector (Gemini Pro):** Scans codebase, appends issues to `todo.md` (never fixes)

**Key insight:** Inspector creates work for Fixer, not the other way around.
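
The role separation can be illustrated with two functions that only touch `todo.md` and `done.md`. This is a sketch, assuming one task per line; `inspector_report`, `fixer_take_next`, and the `fix` callback are hypothetical names, not part of any tool named above.

```python
from pathlib import Path
from typing import Callable, Optional


def inspector_report(todo: Path, issue: str) -> None:
    """Inspector role: append a finding to todo.md and do nothing else."""
    with todo.open("a", encoding="utf-8") as fh:
        fh.write(f"- {issue}\n")


def fixer_take_next(todo: Path, done: Path, fix: Callable[[str], None]) -> Optional[str]:
    """Fixer role: implement the oldest task, then move it from todo.md to done.md."""
    lines = todo.read_text(encoding="utf-8").splitlines() if todo.exists() else []
    if not lines:
        return None  # nothing to do until the Inspector files more work
    task, remaining = lines[0], lines[1:]
    fix(task)  # hypothetical callback that performs the actual code change
    todo.write_text("".join(line + "\n" for line in remaining), encoding="utf-8")
    with done.open("a", encoding="utf-8") as fh:
        fh.write(task + "\n")
    return task
```

Neither function crosses roles: the Inspector never writes to `done.md`, and the Fixer never adds new items to `todo.md`.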

---

## 2. Adversarial Quality Gates

### The alice/idle Pattern (emes)

From `evil-mind-evil-sword/idle`: a review gate for autonomous agents.

**Mechanism:**
1. **Stop Hook:** Intercepts agent's exit/finish signal
2. **Idle State:** Forces mandatory pause before completion
3. **Alice (Reviewer):** Read-only Opus agent inspects work
4. **Blocking:** If Alice finds issues → creates tissue issues → main agent loops back
5. **Exit:** Only when Alice returns `COMPLETE`

**Critical design:** Enforcement is *external* to worker's prompt - system-level intervention.
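
The control flow of such a gate can be sketched as a loop that lives outside the worker. This is not the `idle` implementation: `run_with_review_gate`, the `(verdict, issues)` return shape, and the `max_rounds` cap are assumptions made for illustration; only the `COMPLETE` verdict comes from the description above.

```python
from typing import Callable, List, Tuple

Verdict = Tuple[str, List[str]]  # (verdict string, issues found by the reviewer)


def run_with_review_gate(
    worker: Callable[[List[str]], str],
    reviewer: Callable[[str], Verdict],
    max_rounds: int = 5,  # assumption: cap the loop so a stuck pair cannot run forever
) -> Tuple[str, int]:
    """Keep sending the worker back until the reviewer returns COMPLETE."""
    issues: List[str] = []
    for round_no in range(1, max_rounds + 1):
        work = worker(issues)              # worker sees the open issues, if any
        verdict, issues = reviewer(work)   # reviewer is read-only: it only reports
        if verdict == "COMPLETE":
            return work, round_no
    raise RuntimeError(f"review gate not passed after {max_rounds} rounds")
```

Because the loop belongs to the caller, the worker has no code path that ends the session on its own.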

### Red Team Pre-Commit Hook

Before each commit, the code passes through a hostile reviewer from a different model family:

```bash
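# Review the staged changes (what the commit will contain), not the unstaged working tree.
git diff --cached | llm --model o1-preview --system "You are a hostile security auditor. Block this commit if you find any logic gaps."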
```

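If the reviewer rejects the diff, the original agent receives the critique and retries. The command on its own only prints the review; to block a commit, the hook must interpret the reply and exit with a non-zero status.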

### Test-Driven Agent Development (TDAD)

Agents are forbidden from writing implementation code until a failing test exists:

1. Agent generates `test_feature.py`
2. System executes → asserts **FAILURE**
3. Agent generates `feature.py`
4. System executes → asserts **SUCCESS**

Prevents "placebo tests" that always pass regardless of logic.
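
The four steps reduce to a small gate around two callbacks. This is a minimal sketch: `red_green_gate`, `GateError`, `run_tests`, and `write_implementation` are hypothetical names, and how the tests are executed is left to the caller.

```python
from typing import Callable


class GateError(RuntimeError):
    """Raised when the red/green sequence is violated."""


def red_green_gate(
    run_tests: Callable[[], bool],
    write_implementation: Callable[[], None],
) -> None:
    # Step 2: the new test must fail before any implementation exists.
    if run_tests():
        raise GateError("placebo test: it passes with no implementation")
    # Step 3: only now may the agent write the implementation.
    write_implementation()
    # Step 4: the same test must now pass.
    if not run_tests():
        raise GateError("implementation does not satisfy the test")
```

In practice `run_tests` might wrap a test runner's exit status; the gate itself only needs a pass/fail boolean.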

---

## 3. Circuit Breaker Patterns

### Semantic Drift Detection

Use an embedding model to compare recent entries in the agent's thought trace:
- If the last 3 thoughts are >95% similar (e.g., cosine similarity above 0.95) → circuit breaks
- Forces strategy shift, not just retry
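
A sketch of the check, assuming thoughts have already been embedded as plain float vectors by some embedding model (not shown) and that "similar" means pairwise cosine similarity. `cosine` and `is_stuck` are hypothetical helper names.

```python
import math
from itertools import combinations
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def is_stuck(embeddings: List[Sequence[float]], window: int = 3, threshold: float = 0.95) -> bool:
    """True when every pair among the last `window` thoughts exceeds the threshold."""
    if len(embeddings) < window:
        return False  # not enough history to judge
    recent = embeddings[-window:]
    return all(cosine(a, b) > threshold for a, b in combinations(recent, 2))
```

When `is_stuck` returns `True`, the caller would force a strategy change rather than simply retrying.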

### Three-Strike Tool Rule

Same error message 3 times → inject system prompt:
> "You are stuck. You must try a different tool or approach. Do not retry the previous action."
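
A minimal sketch of the counter behind this rule. It assumes "same error 3 times" means three consecutive identical messages, which the rule itself does not specify; `ThreeStrikeBreaker` and `record` are hypothetical names.

```python
from typing import Optional

STUCK_PROMPT = (
    "You are stuck. You must try a different tool or approach. "
    "Do not retry the previous action."
)


class ThreeStrikeBreaker:
    """Counts consecutive identical error messages from tool calls."""

    def __init__(self, limit: int = 3) -> None:
        self.limit = limit
        self.last_error: Optional[str] = None
        self.count = 0

    def record(self, error: Optional[str]) -> Optional[str]:
        """Record one tool result; return the prompt to inject, or None."""
        if error is None:  # a successful call resets the streak
            self.last_error, self.count = None, 0
            return None
        if error == self.last_error:
            self.count += 1
        else:
            self.last_error, self.count = error, 1
        if self.count >= self.limit:
            self.count = 0  # start counting again after intervening
            return STUCK_PROMPT
        return None
```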

### Budget-Based Interrupts

Replace step limits with token budgets per sub-task:
- If 50% budget burned on first step of 5-step plan → pause
- Request plan refinement or human intervention
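
One way to turn the 50%-on-step-one example into a general rule is to compare spend against an even per-step share of the budget. This is an assumption, not something the research prescribes: `should_pause` and the `overrun_factor` of 2.0 are hypothetical choices.

```python
def should_pause(
    tokens_used: int,
    token_budget: int,
    steps_started: int,
    total_steps: int,
    overrun_factor: float = 2.0,  # assumption: tolerate up to 2x the even share
) -> bool:
    """True when spend so far exceeds `overrun_factor` times the even share."""
    if total_steps <= 0 or steps_started <= 0:
        raise ValueError("steps_started and total_steps must be positive")
    expected = token_budget * steps_started / total_steps
    return tokens_used > overrun_factor * expected
```

With a 10,000-token budget and a 5-step plan, the even share for step one is 2,000 tokens, so burning 5,000 tokens (50%) on that step exceeds the 4,000-token tolerance and triggers a pause.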

---

## 4. State Management (Non-MCP Alternatives)

### Git-as-State

Repository is source of truth:
- **Coordination:** Agent A writes file, Agent B reads diff
- **State:** Current HEAD of branch
- **History:** Git log = immutable, replayable decision history

### jwz (Agent Messaging)

From emes - lightweight email-thread-style messaging:
- Async communication between agents
- Preserves threading for topic separation
- Keeps inter-agent messages out of the main context window

### The "Postbox" File Pattern

Agents share state via filesystem:
- Standard: `.ai/context.md` or `claudecode.md`
- Agents dump thought process, obstacles, decisions before signing off
- Next agent reads to "load" state

**Tooling:** Vibe Kanban - markdown board multiple agents read/write to.
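
The sign-off and load steps of the postbox pattern can be sketched with plain file I/O. The `.ai/context.md` path comes from the list above; the section headings, `write_handoff`, and `load_handoff` are hypothetical and not a standard.

```python
from pathlib import Path
from typing import List


def write_handoff(repo: Path, agent: str, decisions: List[str], obstacles: List[str]) -> Path:
    """Append a sign-off note for the next agent to .ai/context.md."""
    path = repo / ".ai" / "context.md"
    path.parent.mkdir(parents=True, exist_ok=True)
    lines = [
        f"## Handoff from {agent}",
        "",
        "### Decisions",
        *[f"- {d}" for d in decisions],
        "",
        "### Obstacles",
        *[f"- {o}" for o in obstacles],
        "",
    ]
    with path.open("a", encoding="utf-8") as fh:
        fh.write("\n".join(lines) + "\n")
    return path


def load_handoff(repo: Path) -> str:
    """Return the accumulated notes, or an empty string if none exist yet."""
    path = repo / ".ai" / "context.md"
    return path.read_text(encoding="utf-8") if path.exists() else ""
```

Appending rather than overwriting keeps earlier handoffs readable by later agents.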

### Git Worktrees for Isolation

Prevent agents overwriting each other in real-time:
- Agent A: `worktree/feature-login`
- Agent B: `worktree/refactor-db`
- Merge via PRs with standard conflict resolution
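
A sketch of how an orchestrator might prepare one worktree per agent. It assumes the `worktree/<task>` names above are directory paths and that each task gets a new branch of the same name; `worktree_add_command` is a hypothetical helper that only builds the argument list.

```python
from typing import List


def worktree_add_command(repo_root: str, task: str) -> List[str]:
    """Build `git worktree add` arguments for a new branch in its own directory."""
    return [
        "git", "-C", repo_root,   # run as if started inside repo_root
        "worktree", "add",
        "-b", task,               # create a new branch named after the task
        f"worktree/{task}",       # checkout directory, relative to repo_root
    ]
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`; each agent then works only inside its own directory.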

---

## 5. Machine-First Issue Tracking

### tissue (emes)

From `evil-mind-evil-sword/tissue` - headless issue tracker for agents.

**Design:**
- **File-based:** Issues stored as JSON/Markdown in `.tissue/`
- **No API:** Agents use standard file tools (read, write)
- **Git-native:** Issues version-controlled with code
- **Branch-aware:** Issue state branches with code branches

**Workflow:**
1. Alice: `tissue new --title "Memory Leak"` → `.tissue/issues/1.json`
2. Worker reads issue, fixes code
3. Worker updates JSON status to `resolved`
4. Alice verifies, merges branch (code fix + closed issue)

**Why this matters:** No desync between issue tracker and code state.
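
The workflow can be illustrated with ordinary file operations. The JSON fields below (`id`, `title`, `status`) are a hypothetical schema, since tissue's actual on-disk format is not described here; `new_issue` and `resolve_issue` are illustrative helpers, not tissue commands.

```python
import json
from pathlib import Path


def new_issue(repo: Path, title: str) -> Path:
    """Create the next numbered issue file under .tissue/issues/."""
    issues_dir = repo / ".tissue" / "issues"
    issues_dir.mkdir(parents=True, exist_ok=True)
    next_id = len(list(issues_dir.glob("*.json"))) + 1
    path = issues_dir / f"{next_id}.json"
    issue = {"id": next_id, "title": title, "status": "open"}  # hypothetical schema
    path.write_text(json.dumps(issue, indent=2), encoding="utf-8")
    return path


def resolve_issue(path: Path) -> None:
    """Worker step: mark the issue resolved by editing the file in place."""
    issue = json.loads(path.read_text(encoding="utf-8"))
    issue["status"] = "resolved"
    path.write_text(json.dumps(issue, indent=2), encoding="utf-8")
```

Because the issue is just a file in the repository, the status change travels in the same commit and branch as the code fix.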

---

## 6. Sandboxing & Security

### Wasm Sandboxing (Browser-style)

Tools like Pyodide run agent-generated code in a WebAssembly sandbox:
- No host OS filesystem access unless explicitly granted
- NVIDIA uses this approach for data visualization code

### Docker Playgrounds

Agent runs in disposable container, not user's shell:
- Commands are intercepted and executed in isolation
- Permission scoping: read/write only in project directory
- Read-only access to rest of system
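
One way to express this permission scoping with standard `docker run` flags is sketched below. It reads "read-only access to rest of system" as a read-only container filesystem, which is an interpretation; `docker_run_command` and the `/workspace` mount point are hypothetical.

```python
from pathlib import Path
from typing import List, Sequence


def docker_run_command(project_dir: str, image: str, command: Sequence[str]) -> List[str]:
    """Build arguments for a disposable container scoped to one project directory."""
    host_path = str(Path(project_dir).resolve())  # bind mounts need an absolute path
    return [
        "docker", "run",
        "--rm",                               # discard the container afterwards
        "--read-only",                        # container root filesystem is read-only
        "-v", f"{host_path}:/workspace:rw",   # only the project dir is writable
        "-w", "/workspace",                   # run the command from the project dir
        image,
        *command,
    ]
```

The list can be handed to `subprocess.run`; the agent's command then executes inside the container instead of the user's shell.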

---

## Recommended Stack (2025-2026)

Based on this research, the "Golden Path" for multi-agent teams is:

| Role | Tool | Notes |
|------|------|-------|
| **Orchestrator** | Maestro or custom | Manages high-level plan |
| **Primary Coder** | Claude Code | Via MCP for tool access |
| **Reviewer/QA** | Gemini Pro or Opus | "Watchful Inspector" role |
| **State** | Git + shared docs | `docs/arch.md`, worktrees |
| **Issues** | tissue or beads | File-based, git-native |
| **Messaging** | jwz | Async agent communication |

---

## Key Insights for Our Cross-Agent Work

### What Aligns With Our Direction

| Pattern | Our Equivalent | Status |
|---------|---------------|--------|
| Git-as-state | beads (.beads/ in repo) | ✅ Have |
| Machine-first issues | beads, tissue | ✅ Have |
| File-based coordination | SKILL.md, AGENTS.md | ✅ Have |
| Multi-model consensus | orch | ✅ Have |
| Adversarial review | alice pattern | 🔄 Researching |

### Gaps to Address

| Gap | Pattern to Adopt |
|-----|------------------|
| Stop hook (available in Claude/Gemini only) | Orchestrator-enforced gate |
| Agent messaging | Consider jwz or build on beads |
| Sandbox for research agents | Docker/Wasm or OS-level |
| Circuit breakers | Semantic drift + three-strike |

### Recommended Next Steps

1. **Prototype orchestrator pattern** - Central agent enforces review gate
2. **Evaluate jwz** - Could complement beads for transient state
3. **Implement circuit breakers** - Semantic drift detection
4. **Sandbox research** - Docker-based for research subagents

---

## Sources

- orch web research via gemini --websearch
- GitHub: evil-mind-evil-sword/idle (alice pattern)
- GitHub: evil-mind-evil-sword/tissue (machine-first issues)
- Maestro CLI documentation
- Google ADK event streams
- Anthropic MCP specification