Comprehensive design covering: - Abstract layers (message passing, memory, enforcement) - Four enforcement strategies: - Hook-based (Claude/Gemini) - Orchestrator-enforced (OpenCode/Codex) - Validator sidecar (universal) - Proxy-based (API interception) - Circuit breakers (semantic drift, three-strike, budget) - Adversarial reviewer pattern - State flow diagram - Implementation phases Based on web research via orch (gemini --websearch). Addresses: skills-8sj Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
14 KiB
Cross-Agent Enforcement Architecture
Date: 2026-01-09 Status: Draft Issue: skills-8sj Research: Web research via orch (gemini --websearch)
Executive Summary
Design for quality gates and agent coordination that works across Claude Code, Gemini CLI, OpenCode, and Codex - regardless of which agent plays orchestrator, worker, or reviewer.
Core principle: External enforcement, not internal trust. The orchestrator/infrastructure enforces gates, not agent prompts.
Abstract Layers
Layer 1: Message Passing
Purpose: Async agent coordination, session handoffs, status updates
| Requirement | Description |
|---|---|
| Append-only | No overwrites, git-mergeable |
| Agent-attributed | Know which agent posted (model, role) |
| Topic-based | Namespaced conversations |
| Git-native | Lives in repo, auto-commits |
Interface:
post(topic, message, metadata) -> message_id
read(topic, since?) -> [messages]
reply(message_id, message) -> message_id
Implementation options:
- jwz (emes) - Zig, full-featured
- JSONL files with watcher
- beads extension
Recommended format: JSONL with sentinel trick for merge-safety
{"id": "01ABC", "topic": "review:session-123", "agent": "claude", "body": "Starting review", "ts": 1736456789}
{"id": "01ABD", "topic": "review:session-123", "agent": "gemini", "body": "APPROVED", "ts": 1736456800}
<!-- SENTINEL -->
Layer 2: Memory
Purpose: Persistent work items, review state, dependencies
| Requirement | Description |
|---|---|
| Cross-session | Survives compaction, restarts |
| Dependency tracking | Issue X blocks issue Y |
| Queryable | Find by status, type, assignee |
| Git-native | Versioned with code |
Interface:
create(issue) -> issue_id
update(id, fields) -> issue
query(filters) -> [issues]
close(id, reason) -> issue
Implementation: beads (already have this)
Review state schema:
review_state:
session_id: string
status: pending | in_review | approved | rejected
worker_agent: string # e.g., "claude-sonnet-4.5"
reviewer_agent: string # e.g., "gemini-pro"
issues_found: [issue_ids] # beads issues if rejected
attempts: number # for circuit breaker
created_at: timestamp
updated_at: timestamp
Layer 3: Enforcement
Purpose: Quality gates that block completion until approved
Key insight: Enforcement must be EXTERNAL to the worker agent. You cannot trust an agent to enforce its own gates.
Enforcement Strategies
Strategy A: Hook-Based (Claude Code, Gemini CLI)
For agents with lifecycle hooks, use Stop hook to block exit.
{
"hooks": {
"Stop": [{
"hooks": [{
"type": "command",
"command": "review-gate check",
"timeout": 30
}]
}]
}
}
review-gate CLI:
#!/bin/bash
# review-gate check
SESSION_ID=${CLAUDE_SESSION_ID:-$(cat .state/session_id)}
STATE=$(bd show "review:$SESSION_ID" --format json 2>/dev/null)
if [ -z "$STATE" ]; then
# No review registered - allow exit
exit 0
fi
STATUS=$(echo "$STATE" | jq -r '.status')
case "$STATUS" in
approved)
exit 0 # Allow exit
;;
rejected|pending|in_review)
echo "BLOCKED: Review not approved. Status: $STATUS"
echo "Spawn reviewer with: /review"
exit 1 # Block exit
;;
esac
Strength: Mechanical - agent cannot bypass
Strategy B: Orchestrator-Enforced (OpenCode, Codex, any)
For agents without hooks, orchestrator controls the session.
┌─────────────────────────────────────────────────────────────┐
│ ORCHESTRATOR (shell script or agent) │
│ │
│ 1. Start worker session │
│ 2. Worker does task, signals "done" │
│ 3. Orchestrator checks: Is review registered? │
│ NO → Accept done, exit │
│ YES → Spawn reviewer, wait for verdict │
│ 4. If APPROVED → Exit │
│ 5. If REJECTED → Feed issues back to worker, loop │
└─────────────────────────────────────────────────────────────┘
Implementation:
#!/bin/bash
# orchestrator.sh
# Start worker
opencode --session worker-123 --prompt "$TASK"
# Check if review needed
if bd show "review:worker-123" 2>/dev/null; then
# Spawn reviewer
claude --session reviewer-123 --prompt "Review work in session worker-123.
Post APPROVED or REJECTED with issues to beads."
# Check verdict
VERDICT=$(bd show "review:worker-123" --format json | jq -r '.status')
if [ "$VERDICT" != "approved" ]; then
# Feed back to worker
ISSUES=$(bd list --blocked-by "review:worker-123")
opencode --session worker-123 --prompt "Fix these issues: $ISSUES"
# Loop...
fi
fi
Strength: Works with any agent that can run in a script
Strategy C: Validator Sidecar (Universal)
Agent outputs to staging area. Validator checks before "done" is accepted.
Worker Agent Validator
│ │
├─── writes to staging/ ──────►│
│ │
├─── signals "done" ──────────►│ checks artifacts
│ │
│◄── PASS or FAIL + reasons ───┤
│ │
└─── if FAIL, retry ──────────►│
Validator checks:
- Does output file exist?
- Does code compile/lint?
- Do tests pass?
- Does schema validate?
Strength: Works without any agent cooperation - pure infrastructure
Strategy D: Proxy-Based (API interception)
Intercept critical tool calls, require approval before execution.
Agent ──► Proxy ──► Tool
│
▼
[If critical tool]
│
▼
Pause, notify human
│
▼
Wait for approval
│
▼
Execute or reject
Critical tools: write_file, bash (rm, deploy), send_email, git push
Strength: Agent perceives slow response or permission error - no special logic needed
Circuit Breakers
Prevent infinite loops when worker keeps failing review.
Semantic Drift Detection
def check_semantic_drift(thoughts: list[str], threshold=0.95) -> bool:
"""Return True if agent is stuck repeating itself."""
if len(thoughts) < 3:
return False
embeddings = embed(thoughts[-3:])
similarities = pairwise_cosine(embeddings)
return all(sim > threshold for sim in similarities)
Action: Inject "You are stuck. Try completely different approach."
Three-Strike Rule
tool_errors = defaultdict(list)
def on_tool_error(tool: str, args: dict, error: str):
sig = hash(f"{tool}:{args}:{error}")
tool_errors[tool].append(sig)
if tool_errors[tool][-3:].count(sig) >= 3:
inject("STOP. Same error 3 times. Use different tool/approach.")
tool_errors[tool].clear()
Budget Limits
MAX_REVIEW_ATTEMPTS = 3
MAX_TOKENS_PER_TASK = 50000
MAX_TIME_PER_TASK = 1800 # 30 minutes
def check_limits(session):
if session.review_attempts >= MAX_REVIEW_ATTEMPTS:
escalate_to_human("Review failed 3 times")
return ABORT
if session.tokens_used >= MAX_TOKENS_PER_TASK:
escalate_to_human("Token budget exceeded")
return ABORT
if session.elapsed >= MAX_TIME_PER_TASK:
escalate_to_human("Time limit exceeded")
return ABORT
return CONTINUE
Adversarial Review Pattern
The reviewer works for the USER, not the worker agent.
Reviewer Prompt Template
# Adversarial Reviewer
You are reviewing work done by another agent. Your job is to find problems.
## Ground Truth
The USER requested: [original user prompt]
## Your Methodology
1. Read the user's EXACT words, not the agent's summary
2. Examine ALL changes (git diff, file contents)
3. ASSUME errors exist - find them
4. Steel-man the work, then systematically attack it
5. Use orch for second opinions on non-trivial findings
## Your Tools (READ-ONLY)
- Read, Grep, Glob (examine files)
- Bash (git diff, git log only)
- orch (second opinions)
- bd (file issues)
## Your Decision
Post to beads topic "review:{session_id}":
If APPROVED:
- Status: approved
- Summary: Brief confirmation
If REJECTED:
- Status: rejected
- Create beads issues for each problem found
- Link issues to review topic
Multi-Model Verification
For high-stakes work, use orch for consensus:
orch consensus "Review this code change for security issues:
$(git diff HEAD~1)
Is this safe to deploy?" gemini claude deepseek --mode critique
State Flow Diagram
┌─────────────────┐
│ Task Received │
└────────┬────────┘
│
┌────────▼────────┐
│ Worker Starts │
│ (any agent) │
└────────┬────────┘
│
┌────────▼────────┐
│ Worker Done │
└────────┬────────┘
│
┌──────────────┼──────────────┐
│ │ │
┌────────▼────────┐ │ ┌────────▼────────┐
│ Hook Check │ │ │ Orchestrator │
│ (Claude/Gem) │ │ │ Check (others) │
└────────┬────────┘ │ └────────┬────────┘
│ │ │
└──────────────┼──────────────┘
│
┌────────▼────────┐
│ Review Needed? │
└────────┬────────┘
│
┌──────────────┼──────────────┐
│ NO │ │ YES
│ │ │
┌────────▼────────┐ │ ┌────────▼────────┐
│ Exit Allowed │ │ │ Spawn Reviewer │
└─────────────────┘ │ └────────┬────────┘
│ │
│ ┌────────▼────────┐
│ │ Reviewer Checks │
│ └────────┬────────┘
│ │
│ ┌────────▼────────┐
│ │ APPROVED? │
│ └────────┬────────┘
│ │
│ ┌──────────┼──────────┐
│ │ YES │ │ NO
│ │ │ │
│ ▼ │ ┌────▼────────┐
│ EXIT │ │ File Issues │
│ │ │ Loop Back │
│ │ └─────────────┘
Implementation Phases
Phase 1: Hook-based for Claude/Gemini
- Create
review-gateCLI - Add hooks.json to skills
- Test with simple approve/reject flow
Phase 2: Orchestrator for others
- Create orchestrator.sh wrapper
- Integrate with beads for state
- Test with OpenCode/Codex
Phase 3: Circuit breakers
- Add attempt tracking to beads
- Implement budget limits
- Add semantic drift detection (optional)
Phase 4: Reviewer skill
- Create adversarial reviewer prompt
- Integrate with orch for second opinions
- Test cross-agent review scenarios
File Structure
skills/review-gate/
├── SKILL.md # Reviewer skill
├── .claude-plugin/
│ └── plugin.json
├── skills/
│ └── review-gate.md
├── hooks/
│ └── hooks.json # Stop hook config
├── scripts/
│ ├── review-gate # CLI for gate checks
│ └── orchestrator.sh # Wrapper for non-hook agents
└── templates/
└── reviewer-prompt.md # Adversarial reviewer template
Open Questions
- Session ID passing: How does Stop hook know which session to check?
- Cross-agent spawning: Can Claude spawn Gemini reviewer? Via what mechanism?
- Beads schema: Need
review:topic type or use existing issues? - Circuit breaker storage: In beads or separate state file?
References
- alice/idle (emes)
- jwz (emes)
- LangGraph
- Web research via orch (gemini --websearch)