# Cross-Agent Enforcement Architecture
> **Date:** 2026-01-09
> **Status:** Draft
> **Issue:** skills-8sj
> **Research:** Web research via orch (gemini --websearch)
## Executive Summary
Design for quality gates and agent coordination that works across Claude Code, Gemini CLI, OpenCode, and Codex, regardless of which agent plays orchestrator, worker, or reviewer.

**Core principle:** External enforcement, not internal trust. The orchestrator/infrastructure enforces gates, not agent prompts.
---
## Abstract Layers
### Layer 1: Message Passing
**Purpose:** Async agent coordination, session handoffs, status updates

| Requirement | Description |
|-------------|-------------|
| Append-only | No overwrites, git-mergeable |
| Agent-attributed | Know which agent posted (model, role) |
| Topic-based | Namespaced conversations |
| Git-native | Lives in repo, auto-commits |
**Interface:**
```
post(topic, message, metadata) -> message_id
read(topic, since?) -> [messages]
reply(message_id, message) -> message_id
```
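For illustration only, the same interface as Python type signatures (the `Message` fields mirror the JSONL records shown below; none of this is a committed API):

```python
from typing import Optional, Protocol, TypedDict

class Message(TypedDict):
    id: str      # unique, sortable message id
    topic: str   # namespaced conversation, e.g. "review:session-123"
    agent: str   # which agent posted (model, role)
    body: str
    ts: int      # unix timestamp

class MessageBus(Protocol):
    def post(self, topic: str, message: str, metadata: dict) -> str: ...
    def read(self, topic: str, since: Optional[int] = None) -> list[Message]: ...
    def reply(self, message_id: str, message: str) -> str: ...
```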
**Implementation options:**
- jwz (emes) - Zig, full-featured
- JSONL files with watcher
- beads extension
**Recommended format:** JSONL with sentinel trick for merge-safety
```jsonl
{"id": "01ABC", "topic": "review:session-123", "agent": "claude", "body": "Starting review", "ts": 1736456789}
{"id": "01ABD", "topic": "review:session-123", "agent": "gemini", "body": "APPROVED", "ts": 1736456800}
<!-- SENTINEL -->
```
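A minimal sketch of `post()` over such a file, assuming the convention that the sentinel stays as the last line and new records are inserted above it (the ID scheme and file layout here are placeholders):

```python
import json
import time
import uuid
from pathlib import Path

SENTINEL = "<!-- SENTINEL -->"

def post(path: Path, topic: str, agent: str, body: str) -> str:
    """Insert a message record above the sentinel trailer line."""
    record = {
        "id": uuid.uuid4().hex[:10].upper(),  # placeholder; a ULID also works
        "topic": topic,
        "agent": agent,
        "body": body,
        "ts": int(time.time()),
    }
    lines = path.read_text().splitlines() if path.exists() else []
    lines = [ln for ln in lines if ln.strip() != SENTINEL]  # drop old trailer
    lines.append(json.dumps(record))
    lines.append(SENTINEL)                                  # restore trailer
    path.write_text("\n".join(lines) + "\n")
    return record["id"]
```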
---
### Layer 2: Memory
**Purpose:** Persistent work items, review state, dependencies

| Requirement | Description |
|-------------|-------------|
| Cross-session | Survives compaction, restarts |
| Dependency tracking | Issue X blocks issue Y |
| Queryable | Find by status, type, assignee |
| Git-native | Versioned with code |
**Interface:**
```
create(issue) -> issue_id
update(id, fields) -> issue
query(filters) -> [issues]
close(id, reason) -> issue
```
**Implementation:** beads (already have this)
**Review state schema:**
```yaml
review_state:
  session_id: string
  status: pending | in_review | approved | rejected
  worker_agent: string      # e.g., "claude-sonnet-4.5"
  reviewer_agent: string    # e.g., "gemini-pro"
  issues_found: [issue_ids] # beads issues if rejected
  attempts: number          # for circuit breaker
  created_at: timestamp
  updated_at: timestamp
```
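The same schema as a Python type, purely for illustration (field names mirror the YAML above; this is not a beads API):

```python
from typing import Literal, TypedDict

class ReviewState(TypedDict):
    session_id: str
    status: Literal["pending", "in_review", "approved", "rejected"]
    worker_agent: str        # e.g., "claude-sonnet-4.5"
    reviewer_agent: str      # e.g., "gemini-pro"
    issues_found: list[str]  # beads issue ids filed on rejection
    attempts: int            # consumed by circuit breakers
    created_at: float        # unix timestamps
    updated_at: float
```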
---
### Layer 3: Enforcement
**Purpose:** Quality gates that block completion until approved
**Key insight:** Enforcement must be EXTERNAL to the worker agent. You cannot trust an agent to enforce its own gates.
---
## Enforcement Strategies
### Strategy A: Hook-Based (Claude Code, Gemini CLI)
For agents with lifecycle hooks, use Stop hook to block exit.
```json
{
  "hooks": {
    "Stop": [{
      "hooks": [{
        "type": "command",
        "command": "review-gate check",
        "timeout": 30
      }]
    }]
  }
}
```
**review-gate CLI:**
```bash
#!/bin/bash
# review-gate check

SESSION_ID=${CLAUDE_SESSION_ID:-$(cat .state/session_id)}
STATE=$(bd show "review:$SESSION_ID" --format json 2>/dev/null)

if [ -z "$STATE" ]; then
  # No review registered - allow exit
  exit 0
fi

STATUS=$(echo "$STATE" | jq -r '.status')

case "$STATUS" in
  approved)
    exit 0  # Allow exit
    ;;
  rejected|pending|in_review)
    echo "BLOCKED: Review not approved. Status: $STATUS"
    echo "Spawn reviewer with: /review"
    exit 1  # Block exit
    ;;
esac
```
**Strength:** Mechanical - agent cannot bypass
---
### Strategy B: Orchestrator-Enforced (OpenCode, Codex, any)
For agents without hooks, orchestrator controls the session.
```
ORCHESTRATOR (shell script or agent)

1. Start worker session
2. Worker does task, signals "done"
3. Orchestrator checks: Is review registered?
     NO  → Accept done, exit
     YES → Spawn reviewer, wait for verdict
4. If APPROVED → Exit
5. If REJECTED → Feed issues back to worker, loop
```
**Implementation:**
```bash
#!/bin/bash
# orchestrator.sh

# Start worker
opencode --session worker-123 --prompt "$TASK"

# Check if review needed
if bd show "review:worker-123" 2>/dev/null; then
  # Spawn reviewer
  claude --session reviewer-123 --prompt "Review work in session worker-123.
Post APPROVED or REJECTED with issues to beads."

  # Check verdict
  VERDICT=$(bd show "review:worker-123" --format json | jq -r '.status')
  if [ "$VERDICT" != "approved" ]; then
    # Feed back to worker
    ISSUES=$(bd list --blocked-by "review:worker-123")
    opencode --session worker-123 --prompt "Fix these issues: $ISSUES"
    # Loop...
  fi
fi
```
**Strength:** Works with any agent that can run in a script
---
### Strategy C: Validator Sidecar (Universal)
The worker writes its output to a staging area; a validator checks the artifacts before "done" is accepted.
```
Worker Agent                       Validator
   │                                   │
   ├── writes to staging/ ────────────►│
   │                                   │
   ├── signals "done" ────────────────►│  checks artifacts
   │                                   │
   │◄── PASS or FAIL + reasons ────────┤
   │                                   │
   └── if FAIL, retry ────────────────►│
```
**Validator checks:**
- Does output file exist?
- Does code compile/lint?
- Do tests pass?
- Does schema validate?
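A minimal validator sketch, assuming a `staging/` output directory and project-specific lint/test commands (both are placeholders to configure per project):

```python
import subprocess
from pathlib import Path

STAGING = Path("staging")

# (command, failure reason) pairs - placeholders, configure per project
CHECKS = [
    (["ruff", "check", str(STAGING)], "lint failed"),
    (["pytest", "-q"], "tests failed"),
]

def validate() -> tuple[bool, list[str]]:
    """Run all checks; called by the orchestrator when the worker signals done."""
    reasons = []
    if not any(STAGING.rglob("*")):
        reasons.append("no artifacts in staging/")
    for cmd, reason in CHECKS:
        if subprocess.run(cmd, capture_output=True).returncode != 0:
            reasons.append(reason)
    return (not reasons, reasons)
```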
**Strength:** Works without any agent cooperation - pure infrastructure
---
### Strategy D: Proxy-Based (API interception)
Intercept critical tool calls, require approval before execution.
```
Agent ──► Proxy ──► Tool
            │
            └─ If critical tool:
                 Pause, notify human
                 Wait for approval
                 Execute or reject
```
**Critical tools:** write_file, bash (rm, deploy), send_email, git push
**Strength:** The agent just perceives a slow response or a permission error; no special agent-side logic is needed
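A toy sketch of the interception logic; the tool names and the `execute` / `ask_human` callbacks are placeholders, and a real proxy would sit at the API or MCP layer rather than in-process:

```python
CRITICAL_TOOLS = {"write_file", "bash", "send_email", "git_push"}

def proxied_call(tool: str, args: dict, execute, ask_human) -> dict:
    """Forward tool calls, but gate critical ones behind human approval."""
    if tool in CRITICAL_TOOLS:
        if not ask_human(f"Agent wants to run {tool} with {args!r}. Allow?"):
            # To the agent this is just a permission error.
            return {"error": "permission denied by policy"}
    return execute(tool, args)
```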
---
## Circuit Breakers
Prevent infinite loops when worker keeps failing review.
### Semantic Drift Detection
```python
def check_semantic_drift(thoughts: list[str], threshold: float = 0.95) -> bool:
    """Return True if the agent is stuck repeating itself."""
    if len(thoughts) < 3:
        return False
    embeddings = embed(thoughts[-3:])           # embed() / pairwise_cosine() are
    similarities = pairwise_cosine(embeddings)  # assumed helpers (sketch below)
    return all(sim > threshold for sim in similarities)
```
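The `embed` and `pairwise_cosine` helpers above are left abstract; one possible implementation, assuming sentence-transformers (any embedding model would do):

```python
from itertools import combinations

import numpy as np
from sentence_transformers import SentenceTransformer

_model = SentenceTransformer("all-MiniLM-L6-v2")

def embed(texts: list[str]) -> np.ndarray:
    # One embedding vector per thought.
    return _model.encode(texts)

def pairwise_cosine(embeddings: np.ndarray) -> list[float]:
    # Cosine similarity for every pair of vectors.
    return [
        float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
        for a, b in combinations(embeddings, 2)
    ]
```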
**Action:** Inject "You are stuck. Try completely different approach."
### Three-Strike Rule
```python
from collections import defaultdict

tool_errors = defaultdict(list)

def on_tool_error(tool: str, args: dict, error: str):
    sig = hash(f"{tool}:{args}:{error}")
    tool_errors[tool].append(sig)
    if tool_errors[tool][-3:].count(sig) >= 3:
        # inject() = push a corrective message into the agent's context
        inject("STOP. Same error 3 times. Use different tool/approach.")
        tool_errors[tool].clear()
```
### Budget Limits
```python
MAX_REVIEW_ATTEMPTS = 3
MAX_TOKENS_PER_TASK = 50000
MAX_TIME_PER_TASK = 1800  # 30 minutes

def check_limits(session):
    if session.review_attempts >= MAX_REVIEW_ATTEMPTS:
        escalate_to_human("Review failed 3 times")
        return ABORT
    if session.tokens_used >= MAX_TOKENS_PER_TASK:
        escalate_to_human("Token budget exceeded")
        return ABORT
    if session.elapsed >= MAX_TIME_PER_TASK:
        escalate_to_human("Time limit exceeded")
        return ABORT
    return CONTINUE
```
---
## Adversarial Review Pattern
The reviewer works for the USER, not the worker agent.
### Reviewer Prompt Template
```markdown
# Adversarial Reviewer
You are reviewing work done by another agent. Your job is to find problems.
## Ground Truth
The USER requested: [original user prompt]
## Your Methodology
1. Read the user's EXACT words, not the agent's summary
2. Examine ALL changes (git diff, file contents)
3. ASSUME errors exist - find them
4. Steel-man the work, then systematically attack it
5. Use orch for second opinions on non-trivial findings
## Your Tools (READ-ONLY)
- Read, Grep, Glob (examine files)
- Bash (git diff, git log only)
- orch (second opinions)
- bd (file issues)
## Your Decision
Post to beads topic "review:{session_id}":
If APPROVED:
- Status: approved
- Summary: Brief confirmation
If REJECTED:
- Status: rejected
- Create beads issues for each problem found
- Link issues to review topic
```
### Multi-Model Verification
For high-stakes work, use orch for consensus:
```bash
orch consensus "Review this code change for security issues:
$(git diff HEAD~1)
Is this safe to deploy?" gemini claude deepseek --mode critique
```
---
## State Flow Diagram
```
Task Received
     │
     ▼
Worker Starts (any agent)
     │
     ▼
Worker Done
     │
     ▼
Hook Check (Claude/Gemini)  or  Orchestrator Check (others)
     │
     ▼
Review Needed?
     │
     ├── NO ───► Exit Allowed
     │
     └── YES ──► Spawn Reviewer
                      │
                      ▼
                 Reviewer Checks
                      │
                      ▼
                 APPROVED?
                      │
                      ├── YES ──► EXIT
                      │
                      └── NO ───► File Issues, Loop Back to Worker
```
---
## Implementation Phases
### Phase 1: Hook-based for Claude/Gemini
1. Create `review-gate` CLI
2. Add hooks.json to skills
3. Test with simple approve/reject flow
### Phase 2: Orchestrator for others
1. Create orchestrator.sh wrapper
2. Integrate with beads for state
3. Test with OpenCode/Codex
### Phase 3: Circuit breakers
1. Add attempt tracking to beads
2. Implement budget limits
3. Add semantic drift detection (optional)
### Phase 4: Reviewer skill
1. Create adversarial reviewer prompt
2. Integrate with orch for second opinions
3. Test cross-agent review scenarios
---
## File Structure
```
skills/review-gate/
├── SKILL.md                 # Reviewer skill
├── .claude-plugin/
│   └── plugin.json
├── skills/
│   └── review-gate.md
├── hooks/
│   └── hooks.json           # Stop hook config
├── scripts/
│   ├── review-gate          # CLI for gate checks
│   └── orchestrator.sh      # Wrapper for non-hook agents
└── templates/
    └── reviewer-prompt.md   # Adversarial reviewer template
```
---
## Open Questions
1. **Session ID passing:** How does Stop hook know which session to check?
2. **Cross-agent spawning:** Can Claude spawn Gemini reviewer? Via what mechanism?
3. **Beads schema:** Need `review:` topic type or use existing issues?
4. **Circuit breaker storage:** In beads or separate state file?
---
## References
- [alice/idle (emes)](https://github.com/evil-mind-evil-sword/idle)
- [jwz (emes)](https://github.com/evil-mind-evil-sword/jwz)
- [LangGraph](https://langchain-ai.github.io/langgraph/)
- Web research via orch (gemini --websearch)