Comprehensive design covering:

- Abstract layers (message passing, memory, enforcement)
- Four enforcement strategies:
  - Hook-based (Claude/Gemini)
  - Orchestrator-enforced (OpenCode/Codex)
  - Validator sidecar (universal)
  - Proxy-based (API interception)
- Circuit breakers (semantic drift, three-strike, budget)
- Adversarial reviewer pattern
- State flow diagram
- Implementation phases

Based on web research via orch (gemini --websearch).

Addresses: skills-8sj

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

# Cross-Agent Enforcement Architecture

> **Date:** 2026-01-09
> **Status:** Draft
> **Issue:** skills-8sj
> **Research:** Web research via orch (gemini --websearch)

## Executive Summary

Design for quality gates and agent coordination that works across Claude Code, Gemini CLI, OpenCode, and Codex - regardless of which agent plays orchestrator, worker, or reviewer.

**Core principle:** External enforcement, not internal trust. The orchestrator/infrastructure enforces gates, not agent prompts.

---

## Abstract Layers

### Layer 1: Message Passing

**Purpose:** Async agent coordination, session handoffs, status updates

| Requirement | Description |
|-------------|-------------|
| Append-only | No overwrites, git-mergeable |
| Agent-attributed | Know which agent posted (model, role) |
| Topic-based | Namespaced conversations |
| Git-native | Lives in repo, auto-commits |

**Interface:**
```
post(topic, message, metadata) -> message_id
read(topic, since?) -> [messages]
reply(message_id, message) -> message_id
```

**Implementation options:**
- jwz (emes) - Zig, full-featured
- JSONL files with watcher
- beads extension

**Recommended format:** JSONL with sentinel trick for merge-safety
```jsonl
{"id": "01ABC", "topic": "review:session-123", "agent": "claude", "body": "Starting review", "ts": 1736456789}
{"id": "01ABD", "topic": "review:session-123", "agent": "gemini", "body": "APPROVED", "ts": 1736456800}
<!-- SENTINEL -->
```
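
The interface above maps naturally onto an append-only file. The following is a minimal Python sketch, assuming one JSON object per line with the field names from the example (`id`, `topic`, `agent`, `body`, `ts`). The `reply_to` field, the function signatures, and the random UUID ids are assumptions for illustration and are not taken from jwz or beads. Non-JSON lines such as the sentinel are skipped on read; the sketch does not maintain the sentinel itself.

```python
import json
import time
import uuid
from pathlib import Path
from typing import List, Optional


def post(log: Path, topic: str, body: str, agent: str,
         reply_to: Optional[str] = None) -> str:
    """Append one message as a single JSON line and return its id."""
    message = {
        "id": uuid.uuid4().hex,  # assumption: any unique id scheme would do
        "topic": topic,
        "agent": agent,
        "body": body,
        "ts": int(time.time()),
    }
    if reply_to is not None:
        message["reply_to"] = reply_to  # hypothetical field linking a reply
    with log.open("a", encoding="utf-8") as fh:  # append mode: never overwrite
        fh.write(json.dumps(message) + "\n")
    return message["id"]


def _load(log: Path) -> List[dict]:
    """Parse every JSON line in the log, skipping blanks and sentinel lines."""
    if not log.exists():
        return []
    rows = []
    for line in log.read_text(encoding="utf-8").splitlines():
        if line.lstrip().startswith("{"):
            rows.append(json.loads(line))
    return rows


def read(log: Path, topic: str, since: int = 0) -> List[dict]:
    """Return messages for a topic with a timestamp at or after `since`."""
    return [m for m in _load(log) if m["topic"] == topic and m["ts"] >= since]


def reply(log: Path, message_id: str, body: str, agent: str) -> str:
    """Post a message into the same topic as the message being answered."""
    for msg in _load(log):
        if msg["id"] == message_id:
            return post(log, msg["topic"], body, agent, reply_to=message_id)
    raise KeyError(f"unknown message id: {message_id}")
```

Because every write is an append of one complete line, two agents committing in parallel tend to produce merge conflicts that are easy to resolve by keeping both sides.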

---

### Layer 2: Memory

**Purpose:** Persistent work items, review state, dependencies

| Requirement | Description |
|-------------|-------------|
| Cross-session | Survives compaction, restarts |
| Dependency tracking | Issue X blocks issue Y |
| Queryable | Find by status, type, assignee |
| Git-native | Versioned with code |

**Interface:**
```
create(issue) -> issue_id
update(id, fields) -> issue
query(filters) -> [issues]
close(id, reason) -> issue
```

**Implementation:** beads (already have this)

**Review state schema:**
```yaml
review_state:
  session_id: string
  status: pending | in_review | approved | rejected
  worker_agent: string       # e.g., "claude-sonnet-4.5"
  reviewer_agent: string     # e.g., "gemini-pro"
  issues_found: [issue_ids]  # beads issues if rejected
  attempts: number           # for circuit breaker
  created_at: timestamp
  updated_at: timestamp
```
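
The schema lists the allowed status values but not how a review moves between them. The sketch below is one possible reading in Python. The transition table, the rule that each move into `in_review` increments `attempts`, and the issue id used in the usage note are assumptions made for illustration, not part of beads.

```python
import time
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Status(str, Enum):
    PENDING = "pending"
    IN_REVIEW = "in_review"
    APPROVED = "approved"
    REJECTED = "rejected"


# Assumed transition rules; the schema itself does not define them.
ALLOWED = {
    Status.PENDING: {Status.IN_REVIEW},
    Status.IN_REVIEW: {Status.APPROVED, Status.REJECTED},
    Status.REJECTED: {Status.IN_REVIEW},
    Status.APPROVED: set(),
}


@dataclass
class ReviewState:
    session_id: str
    worker_agent: str
    reviewer_agent: str
    status: Status = Status.PENDING
    issues_found: List[str] = field(default_factory=list)
    attempts: int = 0
    created_at: float = field(default_factory=time.time)
    updated_at: float = field(default_factory=time.time)

    def transition(self, new_status: Status,
                   issues: Optional[List[str]] = None) -> None:
        """Move to a new status, rejecting moves the table does not allow."""
        if new_status not in ALLOWED[self.status]:
            raise ValueError(
                f"illegal transition {self.status.value} -> {new_status.value}"
            )
        if new_status is Status.IN_REVIEW:
            self.attempts += 1  # each review round counts toward the breaker
        if new_status is Status.REJECTED:
            self.issues_found = list(issues or [])
        elif new_status is Status.APPROVED:
            self.issues_found = []
        self.status = new_status
        self.updated_at = time.time()
```

For example, `ReviewState("s1", "claude-sonnet-4.5", "gemini-pro")` starts as `pending`; calling `transition(Status.IN_REVIEW)` and then `transition(Status.REJECTED, issues=["skills-1"])` records one attempt and one (hypothetical) issue id.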

---

### Layer 3: Enforcement

**Purpose:** Quality gates that block completion until approved

**Key insight:** Enforcement must be EXTERNAL to the worker agent. You cannot trust an agent to enforce its own gates.

---

## Enforcement Strategies

### Strategy A: Hook-Based (Claude Code, Gemini CLI)

For agents with lifecycle hooks, use a Stop hook to block exit.

```json
{
  "hooks": {
    "Stop": [{
      "hooks": [{
        "type": "command",
        "command": "review-gate check",
        "timeout": 30
      }]
    }]
  }
}
```

**review-gate CLI:**
```bash
#!/bin/bash
# review-gate check

SESSION_ID="${CLAUDE_SESSION_ID:-$(cat .state/session_id)}"
STATE=$(bd show "review:$SESSION_ID" --format json 2>/dev/null)

if [ -z "$STATE" ]; then
    # No review registered - allow exit
    exit 0
fi

STATUS=$(echo "$STATE" | jq -r '.status')

case "$STATUS" in
    approved)
        exit 0  # Allow exit
        ;;
    rejected|pending|in_review)
        # Claude Code only blocks on exit code 2 and feeds stderr back to the
        # agent; other non-zero codes are non-blocking. Verify for other CLIs.
        echo "BLOCKED: Review not approved. Status: $STATUS" >&2
        echo "Spawn reviewer with: /review" >&2
        exit 2  # Block exit
        ;;
    *)
        # Unknown or unparseable status - fail closed
        echo "BLOCKED: Unrecognized review status: $STATUS" >&2
        exit 2
        ;;
esac
```

**Strength:** Mechanical - agent cannot bypass
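
The gate's decision logic is small enough to isolate and unit-test separately from the shell wrapper. The Python sketch below rests on two assumptions that should be checked against each CLI's hook documentation: that the hook runner passes a JSON payload containing a `session_id` field on stdin, and that exit code 2 is the code that blocks. The function names are hypothetical.

```python
import json
from typing import Optional, Tuple

BLOCK_EXIT_CODE = 2  # assumption: the hook runner treats this code as "block"


def gate_decision(status: Optional[str]) -> Tuple[int, str]:
    """Map a review status to (exit_code, message), failing closed."""
    if status is None:
        return 0, "no review registered"
    if status == "approved":
        return 0, "review approved"
    if status in {"pending", "in_review", "rejected"}:
        return BLOCK_EXIT_CODE, f"BLOCKED: Review not approved. Status: {status}"
    # Anything unexpected blocks, so a corrupted state cannot open the gate.
    return BLOCK_EXIT_CODE, f"BLOCKED: Unrecognized review status: {status!r}"


def session_id_from_payload(raw: str, fallback: Optional[str] = None) -> Optional[str]:
    """Read `session_id` from a JSON hook payload, or return the fallback."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        return fallback
    if isinstance(payload, dict):
        return payload.get("session_id", fallback)
    return fallback
```

A wrapper would read stdin, resolve the session id, look up the review status, and then call `sys.exit` with the returned code after printing the message to stderr.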

---

### Strategy B: Orchestrator-Enforced (OpenCode, Codex, any)

For agents without hooks, the orchestrator controls the session.

```
┌─────────────────────────────────────────────────────────────┐
│  ORCHESTRATOR (shell script or agent)                       │
│                                                             │
│  1. Start worker session                                    │
│  2. Worker does task, signals "done"                        │
│  3. Orchestrator checks: Is review registered?              │
│     NO  → Accept done, exit                                 │
│     YES → Spawn reviewer, wait for verdict                  │
│  4. If APPROVED → Exit                                      │
│  5. If REJECTED → Feed issues back to worker, loop          │
└─────────────────────────────────────────────────────────────┘
```

**Implementation:**
```bash
#!/bin/bash
# orchestrator.sh

# Start worker
opencode --session worker-123 --prompt "$TASK"

# Check if review needed
if bd show "review:worker-123" 2>/dev/null; then
    # Spawn reviewer
    claude --session reviewer-123 --prompt "Review work in session worker-123.
        Post APPROVED or REJECTED with issues to beads."

    # Check verdict
    VERDICT=$(bd show "review:worker-123" --format json | jq -r '.status')

    if [ "$VERDICT" != "approved" ]; then
        # Feed back to worker
        ISSUES=$(bd list --blocked-by "review:worker-123")
        opencode --session worker-123 --prompt "Fix these issues: $ISSUES"
        # Loop...
    fi
fi
```

**Strength:** Works with any agent that can run in a script
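
The shell script leaves the retry loop open. A Python sketch of the full loop follows, with the worker, the reviewer, and the "is a review registered?" check passed in as callables so that no particular agent CLI is assumed. The function name, the return strings, and the default of three attempts are illustrative choices.

```python
from typing import Callable, List, Tuple

Verdict = Tuple[bool, List[str]]  # (approved, issues found)


def run_with_review(
    task: str,
    run_worker: Callable[[str], None],
    run_reviewer: Callable[[], Verdict],
    review_required: Callable[[], bool],
    max_attempts: int = 3,
) -> str:
    """Drive worker and reviewer until approval or the attempt budget runs out."""
    prompt = task
    for _ in range(max_attempts):
        run_worker(prompt)              # steps 1-2: worker runs, signals done
        if not review_required():       # step 3: no review registered
            return "done"
        approved, issues = run_reviewer()
        if approved:                    # step 4
            return "approved"
        # step 5: feed the reviewer's findings back to the worker
        prompt = "Fix these issues: " + "; ".join(issues)
    return "escalate"                   # budget exhausted, hand off to a human
```

Bounding the loop with `max_attempts` keeps a worker that can never satisfy the reviewer from running forever.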

---

### Strategy C: Validator Sidecar (Universal)

The agent writes its output to a staging area. A validator checks the artifacts before "done" is accepted.

```
Worker Agent                    Validator
     │                              │
     ├─── writes to staging/ ──────►│
     │                              │
     ├─── signals "done" ──────────►│  checks artifacts
     │                              │
     │◄── PASS or FAIL + reasons ───┤
     │                              │
     └─── if FAIL, retry ──────────►│
```

**Validator checks:**
- Does output file exist?
- Does code compile/lint?
- Do tests pass?
- Does schema validate?

**Strength:** Works without any agent cooperation - pure infrastructure
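
Two of the checks listed above (file existence and "does it compile") can be sketched with the Python standard library alone. The function name and the choice to check only `.py` and `.json` files are assumptions; running tests or a schema validator would be added the same way, each contributing a reason on failure.

```python
import json
from pathlib import Path
from typing import List, Tuple


def validate_staging(staging: Path, required: List[str]) -> Tuple[bool, List[str]]:
    """Return (passed, reasons) for the artifacts in a staging directory."""
    reasons: List[str] = []
    for name in required:
        path = staging / name
        if not path.is_file():
            reasons.append(f"missing: {name}")
            continue
        text = path.read_text(encoding="utf-8")
        if path.suffix == ".py":
            try:
                # Syntax check only: compile() parses the source, nothing runs.
                compile(text, str(path), "exec")
            except SyntaxError as exc:
                reasons.append(f"syntax error in {name}: line {exc.lineno}")
        elif path.suffix == ".json":
            try:
                json.loads(text)
            except json.JSONDecodeError as exc:
                reasons.append(f"invalid JSON in {name}: {exc.msg}")
    return (not reasons, reasons)
```

The returned reasons are what the validator would send back to the worker as the "FAIL + reasons" message.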

---

### Strategy D: Proxy-Based (API interception)

Intercept critical tool calls and require approval before execution.

```
Agent ──► Proxy ──► Tool
            │
            ▼
   [If critical tool]
            │
            ▼
   Pause, notify human
            │
            ▼
    Wait for approval
            │
            ▼
    Execute or reject
```

**Critical tools:** write_file, bash (rm, deploy), send_email, git push

**Strength:** The agent only perceives a slow response or a permission error - no special logic is needed in the agent
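
A proxy of this kind reduces to a classifier plus an approval callback. The Python sketch below uses the critical-tool list above. The `command` argument name, the prefix matching, and both callables are assumptions; prefix matching in particular is naive (it would miss `sudo rm`, for instance) and stands in for a real command parser.

```python
from typing import Any, Callable, Dict

CRITICAL_TOOLS = {"write_file", "send_email"}
CRITICAL_SHELL_PREFIXES = ("rm ", "deploy", "git push")


def is_critical(tool: str, args: Dict[str, Any]) -> bool:
    """Decide whether a tool call needs approval before it runs."""
    if tool in CRITICAL_TOOLS:
        return True
    if tool == "bash":
        command = str(args.get("command", "")).strip()
        return command.startswith(CRITICAL_SHELL_PREFIXES)
    return False


def proxied_call(
    tool: str,
    args: Dict[str, Any],
    execute: Callable[[str, Dict[str, Any]], Any],
    approve: Callable[[str, Dict[str, Any]], bool],
) -> Any:
    """Run the tool, pausing for approval first when the call is critical."""
    if is_critical(tool, args) and not approve(tool, args):
        # The agent just sees a permission error from the tool.
        raise PermissionError(f"{tool} call rejected by approver")
    return execute(tool, args)
```

In practice `approve` would block while a human is notified, which is why the agent experiences the gate as latency.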

---

## Circuit Breakers

Prevent infinite loops when the worker keeps failing review.

### Semantic Drift Detection

```python
def check_semantic_drift(thoughts: list[str], threshold: float = 0.95) -> bool:
    """Return True if agent is stuck repeating itself."""
    if len(thoughts) < 3:
        return False

    # embed() and pairwise_cosine() are placeholders for an embedding model
    # and a pairwise cosine-similarity helper; they are not defined here.
    embeddings = embed(thoughts[-3:])
    similarities = pairwise_cosine(embeddings)

    return all(sim > threshold for sim in similarities)
```

**Action:** Inject "You are stuck. Try a completely different approach."
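
To make the check concrete without an embedding model, the sketch below substitutes word-count vectors for real embeddings. This is a deliberate simplification for illustration: it catches near-verbatim repetition but not paraphrase, which is what a semantic embedding would add. All helper names are assumptions.

```python
import math
from collections import Counter
from itertools import combinations
from typing import List


def embed(texts: List[str]) -> List[Counter]:
    """Toy embedding: lowercase word counts instead of a real model."""
    return [Counter(text.lower().split()) for text in texts]


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def pairwise_cosine(vectors: List[Counter]) -> List[float]:
    """Similarity of every unordered pair of vectors."""
    return [cosine(a, b) for a, b in combinations(vectors, 2)]


def is_stuck(thoughts: List[str], threshold: float = 0.95) -> bool:
    """True when the last three thoughts are all nearly identical."""
    if len(thoughts) < 3:
        return False
    return all(sim > threshold for sim in pairwise_cosine(embed(thoughts[-3:])))
```

Requiring all three pairs to exceed the threshold keeps a single repeated thought from tripping the breaker.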

### Three-Strike Rule

```python
from collections import defaultdict

tool_errors = defaultdict(list)

def on_tool_error(tool: str, args: dict, error: str):
    sig = hash(f"{tool}:{args}:{error}")
    tool_errors[tool].append(sig)

    # Trip only when the three most recent errors for this tool are identical.
    # inject() is assumed to add a message to the agent's context.
    if tool_errors[tool][-3:].count(sig) >= 3:
        inject("STOP. Same error 3 times. Use a different tool/approach.")
        tool_errors[tool].clear()
```

### Budget Limits

```python
MAX_REVIEW_ATTEMPTS = 3
MAX_TOKENS_PER_TASK = 50000
MAX_TIME_PER_TASK = 1800  # seconds (30 minutes)

# ABORT, CONTINUE and escalate_to_human() are assumed to be provided
# by the orchestrator; session.elapsed is assumed to be in seconds.
def check_limits(session):
    if session.review_attempts >= MAX_REVIEW_ATTEMPTS:
        escalate_to_human(f"Review failed {MAX_REVIEW_ATTEMPTS} times")
        return ABORT

    if session.tokens_used >= MAX_TOKENS_PER_TASK:
        escalate_to_human("Token budget exceeded")
        return ABORT

    if session.elapsed >= MAX_TIME_PER_TASK:
        escalate_to_human("Time limit exceeded")
        return ABORT

    return CONTINUE
```

---

## Adversarial Review Pattern

The reviewer works for the USER, not the worker agent.

### Reviewer Prompt Template

```markdown
# Adversarial Reviewer

You are reviewing work done by another agent. Your job is to find problems.

## Ground Truth
The USER requested: [original user prompt]

## Your Methodology
1. Read the user's EXACT words, not the agent's summary
2. Examine ALL changes (git diff, file contents)
3. ASSUME errors exist - find them
4. Steel-man the work, then systematically attack it
5. Use orch for second opinions on non-trivial findings

## Your Tools (READ-ONLY)
- Read, Grep, Glob (examine files)
- Bash (git diff, git log only)
- orch (second opinions)
- bd (file issues)

## Your Decision
Post to beads topic "review:{session_id}":

If APPROVED:
- Status: approved
- Summary: Brief confirmation

If REJECTED:
- Status: rejected
- Create beads issues for each problem found
- Link issues to review topic
```

### Multi-Model Verification

For high-stakes work, use orch for consensus:

```bash
orch consensus "Review this code change for security issues:

$(git diff HEAD~1)

Is this safe to deploy?" gemini claude deepseek --mode critique
```

---

## State Flow Diagram

```
               ┌─────────────────┐
               │  Task Received  │
               └────────┬────────┘
                        │
               ┌────────▼────────┐
               │  Worker Starts  │
               │   (any agent)   │
               └────────┬────────┘
                        │
               ┌────────▼────────┐
               │   Worker Done   │
               └────────┬────────┘
                        │
         ┌──────────────┼──────────────┐
         │              │              │
┌────────▼────────┐     │     ┌────────▼────────┐
│   Hook Check    │     │     │  Orchestrator   │
│  (Claude/Gem)   │     │     │ Check (others)  │
└────────┬────────┘     │     └────────┬────────┘
         │              │              │
         └──────────────┼──────────────┘
                        │
               ┌────────▼────────┐
               │ Review Needed?  │
               └────────┬────────┘
                        │
         ┌──────────────┼──────────────┐
         │ NO           │              │ YES
         │              │              │
┌────────▼────────┐     │     ┌────────▼────────┐
│  Exit Allowed   │     │     │ Spawn Reviewer  │
└─────────────────┘     │     └────────┬────────┘
                        │              │
                        │     ┌────────▼────────┐
                        │     │ Reviewer Checks │
                        │     └────────┬────────┘
                        │              │
                        │     ┌────────▼────────┐
                        │     │    APPROVED?    │
                        │     └────────┬────────┘
                        │              │
                        │   ┌──────────┼──────────┐
                        │   │ YES      │          │ NO
                        │   │          │          │
                        │   ▼          │     ┌────▼────────┐
                        │  EXIT        │     │ File Issues │
                        │              │     │  Loop Back  │
                        │              │     └─────────────┘
```
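
The same flow can be written as an explicit transition table, which makes illegal moves (for example, approving work that was never reviewed) fail loudly. The state and event names below are assumptions chosen to mirror the diagram, and the hook check and orchestrator check are collapsed into a single "worker done" state because both lead to the same question.

```python
from enum import Enum
from typing import Iterable


class State(Enum):
    TASK_RECEIVED = "task_received"
    WORKING = "working"
    WORKER_DONE = "worker_done"
    REVIEWING = "reviewing"
    EXIT = "exit"


# (current state, event) -> next state
TRANSITIONS = {
    (State.TASK_RECEIVED, "start"): State.WORKING,
    (State.WORKING, "done"): State.WORKER_DONE,
    (State.WORKER_DONE, "no_review"): State.EXIT,
    (State.WORKER_DONE, "review_needed"): State.REVIEWING,
    (State.REVIEWING, "approved"): State.EXIT,
    (State.REVIEWING, "rejected"): State.WORKING,  # file issues, loop back
}


def step(state: State, event: str) -> State:
    """Apply one event, raising ValueError if it is not valid in this state."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(
            f"event {event!r} is not valid in state {state.name}"
        ) from None


def run(events: Iterable[str]) -> State:
    """Replay a sequence of events from the initial state."""
    state = State.TASK_RECEIVED
    for event in events:
        state = step(state, event)
    return state
```

A rejected review returns the session to `WORKING`, so the only paths to `EXIT` are "no review registered" and "approved".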

---

## Implementation Phases

### Phase 1: Hook-based for Claude/Gemini
1. Create `review-gate` CLI
2. Add hooks.json to skills
3. Test with simple approve/reject flow

### Phase 2: Orchestrator for others
1. Create orchestrator.sh wrapper
2. Integrate with beads for state
3. Test with OpenCode/Codex

### Phase 3: Circuit breakers
1. Add attempt tracking to beads
2. Implement budget limits
3. Add semantic drift detection (optional)

### Phase 4: Reviewer skill
1. Create adversarial reviewer prompt
2. Integrate with orch for second opinions
3. Test cross-agent review scenarios

---

## File Structure

```
skills/review-gate/
├── SKILL.md                    # Reviewer skill
├── .claude-plugin/
│   └── plugin.json
├── skills/
│   └── review-gate.md
├── hooks/
│   └── hooks.json              # Stop hook config
├── scripts/
│   ├── review-gate             # CLI for gate checks
│   └── orchestrator.sh         # Wrapper for non-hook agents
└── templates/
    └── reviewer-prompt.md      # Adversarial reviewer template
```

---

## Open Questions

1. **Session ID passing:** How does the Stop hook know which session to check?
2. **Cross-agent spawning:** Can Claude spawn a Gemini reviewer? Via what mechanism?
3. **Beads schema:** Do we need a `review:` topic type, or can we use existing issues?
4. **Circuit breaker storage:** In beads or in a separate state file?

---

## References

- [alice/idle (emes)](https://github.com/evil-mind-evil-sword/idle)
- [jwz (emes)](https://github.com/evil-mind-evil-sword/jwz)
- [LangGraph](https://langchain-ai.github.io/langgraph/)
- Web research via orch (gemini --websearch)