Cross-Agent Patterns: Web Research Synthesis

Date: 2026-01-09
Method: orch web research (gemini --websearch)
Related: skills-hf1 (Cross-agent epic)

Executive Summary

Multi-agent AI coding has shifted from single "do it all" agents to specialized agent teams with distinct roles. Key patterns emerging in 2025-2026:

  1. Manager-Worker orchestration - Central agent delegates to specialists
  2. Adversarial review gates - Separate reviewer blocks completion until approved
  3. Git-as-state - Repository is source of truth for agent coordination
  4. Machine-first issue tracking - File-based, no API needed (tissue)
  5. Circuit breakers - Prevent infinite loops via semantic drift detection

1. Multi-Agent Orchestration Patterns

The "Maestro" Pattern (Centralized Orchestration)

A central Manager agent breaks down features into atomic tasks and delegates to Workers.

Workflow:

  1. Manager (e.g., Claude Sonnet): Reads SPEC.md, creates plan.md
  2. Worker A (e.g., Gemini CLI): "Scaffold database schema" - writes code, tests, reports back
  3. Worker B (e.g., OpenCode): "Write API endpoints" - waits for A, then executes

Tooling: Maestro CLI orchestrates Claude Code, Codex, local LLMs via orchestrator.md.
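
The delegation order in the workflow above can be sketched as a small dependency graph. This is a minimal illustration, not Maestro's actual plan format: the `PLAN` structure, the task names, and the worker labels are all hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical plan a Manager might derive from SPEC.md:
# task name -> (assigned worker, set of prerequisite tasks).
PLAN = {
    "scaffold-db-schema": ("worker-a", set()),
    "write-api-endpoints": ("worker-b", {"scaffold-db-schema"}),
}


def dispatch_order(plan):
    """Return (task, worker) pairs in an order that respects prerequisites."""
    sorter = TopologicalSorter({task: deps for task, (_, deps) in plan.items()})
    # static_order() yields prerequisites before the tasks that need them
    # and raises graphlib.CycleError if the plan is circular.
    return [(task, plan[task][0]) for task in sorter.static_order()]
```

With this shape, Worker B's task is only dispatched after Worker A's task has been emitted, which matches the "waits for A, then executes" step.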

The "Inspector-Fixer" Loop (Dual-Agent)

Strict separation: one agent writes, one agent finds faults. Never switch roles.

  • Fixer (Claude Code): Monitors todo.md, implements tasks, moves to done.md
  • Inspector (Gemini Pro): Scans codebase, appends issues to todo.md (never fixes)

Key insight: Inspector creates work for Fixer, not the other way around.
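
A minimal sketch of the two roles, assuming todo.md and done.md hold one task per line. The function names and the `fix` callback are hypothetical, and the sketch does no file locking, so it assumes the two agents do not write at the same instant.

```python
from pathlib import Path


def inspector_report(todo: Path, issue: str) -> None:
    """Inspector: append a finding to todo.md. It never edits code."""
    with todo.open("a", encoding="utf-8") as fh:
        fh.write(f"- {issue}\n")


def fixer_take_next(todo: Path, done: Path, fix) -> bool:
    """Fixer: take the oldest task, apply `fix`, then record it in done.md."""
    lines = todo.read_text(encoding="utf-8").splitlines() if todo.exists() else []
    if not lines:
        return False  # nothing to do
    task, rest = lines[0], lines[1:]
    fix(task)  # hypothetical callback that performs the actual code change
    todo.write_text("".join(f"{line}\n" for line in rest), encoding="utf-8")
    with done.open("a", encoding="utf-8") as fh:
        fh.write(f"{task}\n")
    return True
```

The Inspector only ever appends and the Fixer only ever consumes, so the one-way flow of work is enforced by which function each agent is allowed to call.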


2. Adversarial Quality Gates

The alice/idle Pattern (emes)

From evil-mind-evil-sword/idle - review gate for autonomous agents.

Mechanism:

  1. Stop Hook: Intercepts agent's exit/finish signal
  2. Idle State: Forces mandatory pause before completion
  3. Alice (Reviewer): Read-only Opus agent inspects work
  4. Blocking: If Alice finds issues → creates tissue issues → main agent loops back
  5. Exit: Only when Alice returns COMPLETE

Critical design: Enforcement is external to worker's prompt - system-level intervention.
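
The mechanism above can be sketched as a loop owned by the harness rather than by the worker. This is an illustration of the idea only, not idle's implementation: the `Review` type, the `run_with_gate` function, and the `max_rounds` cap are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Review:
    verdict: str  # "COMPLETE" or "ISSUES" (hypothetical vocabulary)
    issues: list = field(default_factory=list)


def run_with_gate(worker, reviewer, max_rounds=5):
    """Run `worker` until `reviewer` returns COMPLETE; the worker cannot skip it."""
    open_issues = []
    for round_no in range(1, max_rounds + 1):
        result = worker(open_issues)  # worker signals "finished" by returning
        review = reviewer(result)     # stop hook: every exit is reviewed
        if review.verdict == "COMPLETE":
            return result, round_no
        open_issues = review.issues   # blocking: loop back with the new issues
    raise RuntimeError("review gate not passed within max_rounds")
```

Because the loop lives outside the worker, nothing in the worker's prompt can talk its way past the reviewer.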

Red Team Pre-Commit Hook

Before each commit, the diff passes through a hostile reviewer from a different model family:

git diff | llm --model o1-preview --system "You are a hostile security auditor. Block this commit if you find any logic gaps."

If rejected, original agent receives critique and retries.
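
A hedged sketch of wrapping that command so a hook script can act on the verdict. The APPROVE/BLOCK reply convention is an assumption added here (the original prompt does not define an output format), and `review_diff` with its injectable `run` parameter is a hypothetical helper.

```python
import subprocess

SYSTEM_PROMPT = (
    "You are a hostile security auditor. "
    "Block this commit if you find any logic gaps. "
    # Assumption: ask for a machine-readable first word so the hook can parse it.
    "Start your reply with APPROVE or BLOCK."
)


def review_diff(diff, run=subprocess.run):
    """Send `diff` to the reviewer model; return (approved, full_reply)."""
    proc = run(
        ["llm", "--model", "o1-preview", "--system", SYSTEM_PROMPT],
        input=diff,           # the diff is piped on stdin, as in the shell version
        capture_output=True,
        text=True,
        check=True,
    )
    reply = proc.stdout.strip()
    # Anything other than an explicit APPROVE is treated as a rejection.
    return reply.upper().startswith("APPROVE"), reply
```

In a real hook the diff would typically come from `git diff --cached`, and on rejection the reply would be handed back to the authoring agent as its critique.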

Test-Driven Agent Development (TDAD)

Agents are forbidden from writing implementation code until a failing test exists:

  1. Agent generates test_feature.py
  2. System executes → asserts FAILURE
  3. Agent generates feature.py
  4. System executes → asserts SUCCESS

Prevents "placebo tests" that always pass regardless of logic.
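
The four steps can be sketched as a gate function. This is an in-process illustration under simplifying assumptions: tests are plain callables that raise on failure, and `tdad_gate`, `passes`, and `PlaceboTestError` are hypothetical names. A real harness would run the generated test file in a subprocess instead.

```python
class PlaceboTestError(Exception):
    """Raised when a test passes before any implementation exists."""


def passes(test, impl) -> bool:
    """Run `test` against `impl`; any exception counts as a failure."""
    try:
        test(impl)
        return True
    except Exception:
        return False


def tdad_gate(test, write_impl):
    """Enforce red-then-green: the test must fail first, then pass."""
    # Step 2: with no implementation, the test must fail.
    if passes(test, None):
        raise PlaceboTestError("test passes without an implementation")
    impl = write_impl()  # Step 3: only now may the agent write the feature
    # Step 4: the same test must pass against the new implementation.
    if not passes(test, impl):
        raise AssertionError("implementation does not satisfy the test")
    return impl
```

A test that asserts nothing about the implementation is rejected at step 2, which is exactly the placebo case.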


3. Circuit Breaker Patterns

Semantic Drift Detection

Use an embedding model to compare recent entries in the agent's thought trace:

  • If the last 3 thoughts are >95% similar to one another → circuit breaks
  • Forces a strategy shift, not just a retry
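
A minimal sketch of the check, assuming each thought has already been embedded as a vector by some external model and that "95% similar" means pairwise cosine similarity above 0.95. Both are interpretations, and `is_stuck` is a hypothetical name.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def is_stuck(embeddings, window=3, threshold=0.95):
    """True if every pair among the last `window` embeddings exceeds `threshold`."""
    recent = embeddings[-window:]
    if len(recent) < window:
        return False  # not enough history to judge
    return all(
        cosine(recent[i], recent[j]) > threshold
        for i in range(window)
        for j in range(i + 1, window)
    )
```

When `is_stuck` returns true, the harness would break the loop and ask for a different strategy rather than re-running the same step.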

Three-Strike Tool Rule

Same error message 3 times → inject system prompt:

"You are stuck. You must try a different tool or approach. Do not retry the previous action."
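
One way to track the strikes, sketched under the assumption that "3 times" means three consecutive identical error messages. The `ThreeStrike` class is hypothetical; the injected text is the prompt quoted above.

```python
NUDGE = (
    "You are stuck. You must try a different tool or approach. "
    "Do not retry the previous action."
)


class ThreeStrike:
    """Count consecutive identical tool errors and emit a nudge on the third."""

    def __init__(self, limit=3):
        self.limit = limit
        self.last_error = None
        self.count = 0

    def record(self, error):
        """Record a tool result (None means success); return NUDGE or None."""
        if error is None:  # a success resets the streak
            self.last_error, self.count = None, 0
            return None
        if error == self.last_error:
            self.count += 1
        else:
            self.last_error, self.count = error, 1
        if self.count >= self.limit:
            self.count = 0  # start a fresh streak after nudging
            return NUDGE
        return None
```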

Budget-Based Interrupts

Replace step limits with token budgets per sub-task:

  • If 50% budget burned on first step of 5-step plan → pause
  • Request plan refinement or human intervention
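
The pause rule can be sketched as a comparison between budget spent and plan progress. The `slack` factor of 2.0 is an assumption chosen so that the example above (50% spent on step 1 of 5) trips the breaker; `should_pause` is a hypothetical name.

```python
def should_pause(tokens_used, total_budget, steps_done, total_steps, slack=2.0):
    """Pause when spend runs more than `slack` times ahead of plan progress."""
    if total_budget <= 0 or total_steps <= 0:
        raise ValueError("total_budget and total_steps must be positive")
    spent_fraction = tokens_used / total_budget
    # Fair share of the budget once the current step is finished.
    expected_fraction = (steps_done + 1) / total_steps
    return spent_fraction > slack * expected_fraction
```

For a 5-step plan the first step's fair share is 20%, so with a slack of 2.0 anything above 40% on that step pauses the run for plan refinement or human review.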

4. State Management (Non-MCP Alternatives)

Git-as-State

Repository is source of truth:

  • Coordination: Agent A writes file, Agent B reads diff
  • State: Current HEAD of branch
  • History: Git log = immutable, replayable decision history

jwz (Agent Messaging)

From emes - lightweight email-thread-style messaging:

  • Async communication between agents
  • Preserves threading for topic separation
  • Doesn't confuse main context window

The "Postbox" File Pattern

Agents share state via filesystem:

  • Standard: .ai/context.md or claudecode.md
  • Agents dump thought process, obstacles, decisions before signing off
  • Next agent reads to "load" state

Tooling: Vibe Kanban - markdown board multiple agents read/write to.
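
A minimal sketch of the handoff, using the `.ai/context.md` path mentioned above. The section layout and the `sign_off` and `load_state` helpers are hypothetical; the pattern itself does not prescribe a format.

```python
from pathlib import Path

CONTEXT_FILE = Path(".ai") / "context.md"  # path named in the notes above


def sign_off(root: Path, agent: str, decisions, obstacles) -> Path:
    """Append a handoff section to the shared context file and return its path."""
    path = root / CONTEXT_FILE
    path.parent.mkdir(parents=True, exist_ok=True)
    lines = [f"## Handoff from {agent}", "", "Decisions:"]
    lines += [f"- {d}" for d in decisions]
    lines += ["", "Obstacles:"]
    lines += [f"- {o}" for o in obstacles]
    with path.open("a", encoding="utf-8") as fh:
        fh.write("\n".join(lines) + "\n\n")
    return path


def load_state(root: Path) -> str:
    """Next agent: read the whole postbox, or an empty string if none exists."""
    path = root / CONTEXT_FILE
    return path.read_text(encoding="utf-8") if path.exists() else ""
```

Appending rather than overwriting keeps earlier handoffs readable, at the cost of a file that grows until someone prunes it.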

Git Worktrees for Isolation

Prevent agents overwriting each other in real-time:

  • Agent A: worktree/feature-login
  • Agent B: worktree/refactor-db
  • Merge via PRs with standard conflict resolution
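
A small sketch of how an orchestrator might build the `git worktree add` command for each agent. It reads the `worktree/feature-login` entries above as directories under the repository with a branch of the same name, which is an assumption; `worktree_command` is a hypothetical helper that only builds the argument list.

```python
from pathlib import Path


def worktree_command(repo: Path, name: str) -> list:
    """Build the argv that creates branch `name` in its own checkout."""
    # Assumed layout: <repo>/worktree/<name>, one directory per agent.
    target = repo / "worktree" / name
    return ["git", "-C", str(repo), "worktree", "add", "-b", name, str(target)]
```

The list can be passed to `subprocess.run(cmd, check=True)`. Worktrees are often placed outside the main checkout; keeping them under the repository as sketched here would usually mean ignoring the `worktree/` directory.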

5. Machine-First Issue Tracking

tissue (emes)

From evil-mind-evil-sword/tissue - headless issue tracker for agents.

Design:

  • File-based: Issues stored as JSON/Markdown in .tissue/
  • No API: Agents use standard file tools (read, write)
  • Git-native: Issues version-controlled with code
  • Branch-aware: Issue state branches with code branches

Workflow:

  1. Alice: tissue new --title "Memory Leak" → .tissue/issues/1.json
  2. Worker reads issue, fixes code
  3. Worker updates JSON status to resolved
  4. Alice verifies, merges branch (code fix + closed issue)

Why this matters: No desync between issue tracker and code state.
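
The workflow can be sketched with plain file operations. The JSON fields (`id`, `title`, `status`) and the helper names are assumptions made for illustration; tissue's real schema is not described in these notes.

```python
import json
from pathlib import Path

ISSUES_DIR = Path(".tissue") / "issues"  # directory named above; schema is assumed


def new_issue(root: Path, title: str) -> Path:
    """Create the next numbered issue file and return its path."""
    issues = root / ISSUES_DIR
    issues.mkdir(parents=True, exist_ok=True)
    existing = (int(p.stem) for p in issues.glob("*.json") if p.stem.isdigit())
    next_id = 1 + max(existing, default=0)
    path = issues / f"{next_id}.json"
    record = {"id": next_id, "title": title, "status": "open"}
    path.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return path


def resolve_issue(path: Path) -> dict:
    """Worker: mark an issue resolved by editing its JSON in place."""
    issue = json.loads(path.read_text(encoding="utf-8"))
    issue["status"] = "resolved"
    path.write_text(json.dumps(issue, indent=2), encoding="utf-8")
    return issue
```

Because the issue file is committed on the same branch as the fix, merging the branch carries both the code change and the closed issue.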


6. Sandboxing & Security

Wasm Sandboxing (Browser-style)

Tools like Pyodide run agent-generated code in a WebAssembly sandbox:

  • No host OS filesystem access unless explicitly granted
  • NVIDIA uses this approach for data visualization code

Docker Playgrounds

Agent runs in disposable container, not user's shell:

  • Commands trapped and executed in isolation
  • Permission scoping: read/write only in project directory
  • Read-only access to rest of system
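
A sketch of the permission scoping as a `docker run` invocation. The image name, the `/workspace` mount point, and the `docker_argv` helper are assumptions; the function only builds the argument list and does not start a container.

```python
def docker_argv(project_dir, command, image="python:3.12-slim"):
    """Build a `docker run` argv that confines writes to the project directory."""
    return [
        "docker", "run",
        "--rm",                              # disposable: remove container on exit
        "--read-only",                       # container root filesystem is read-only
        "--tmpfs", "/tmp",                   # scratch space that vanishes with the container
        "-v", f"{project_dir}:/workspace",   # the only writable host path
        "-w", "/workspace",
        image,
        *command,
    ]
```

The resulting list can be passed to `subprocess.run`. The agent's commands then run inside the container, and the only host path they can modify is the mounted project directory.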

Based on research, the "Golden Path" for multi-agent teams:

| Role | Tool | Notes |
| --- | --- | --- |
| Orchestrator | Maestro or custom | Manages high-level plan |
| Primary Coder | Claude Code | Via MCP for tool access |
| Reviewer/QA | Gemini Pro or Opus | "Watchful Inspector" role |
| State | Git + shared docs | docs/arch.md, worktrees |
| Issues | tissue or beads | File-based, git-native |
| Messaging | jwz | Async agent communication |

Key Insights for Our Cross-Agent Work

What Aligns With Our Direction

| Pattern | Our Equivalent | Status |
| --- | --- | --- |
| Git-as-state | beads (.beads/ in repo) | Have |
| Machine-first issues | beads, tissue | Have |
| File-based coordination | SKILL.md, AGENTS.md | Have |
| Multi-model consensus | orch | Have |
| Adversarial review | alice pattern | 🔄 Researching |

Gaps to Address

| Gap | Pattern to Adopt |
| --- | --- |
| Stop hook (Claude/Gemini only) | Orchestrator-enforced gate |
| Agent messaging | Consider jwz or build on beads |
| Sandbox for research agents | Docker/Wasm or OS-level |
| Circuit breakers | Semantic drift + three-strike |

Next steps:

  1. Prototype orchestrator pattern - Central agent enforces review gate
  2. Evaluate jwz - Could complement beads for transient state
  3. Implement circuit breakers - Semantic drift detection
  4. Sandbox research - Docker-based for research subagents

Sources

  • orch web research via gemini --websearch
  • GitHub: evil-mind-evil-sword/idle (alice pattern)
  • GitHub: evil-mind-evil-sword/tissue (machine-first issues)
  • Maestro CLI documentation
  • Google ADK event streams
  • Anthropic MCP specification