diff --git a/.beads/issues.jsonl b/.beads/issues.jsonl index c5d8ed4..cb8c1b5 100644 --- a/.beads/issues.jsonl +++ b/.beads/issues.jsonl @@ -119,7 +119,7 @@ {"id":"skills-mjf","title":"Design: Portable adversarial reviewer","description":"Design a reviewer agent/skill that can run on any capable model.\n\nalice is Claude Opus with specific tools. We need:\n- Model-agnostic reviewer prompt/instructions\n- Tool requirements (read-only: Read, Grep, Glob, Bash)\n- Integration with orch for multi-model consensus\n- Decision format (APPROVED/ISSUES)\n- Issue filing (beads or tissue)\n\nKey principles from alice:\n- Work for the USER, not the agent\n- Assume errors exist, find them\n- Steel-man then attack\n- Seek second opinions\n\nOutput: Reviewer skill spec that works across agents","status":"closed","priority":3,"issue_type":"task","created_at":"2026-01-09T17:14:20.778647076-08:00","created_by":"dan","updated_at":"2026-01-09T19:59:37.80146821-08:00","closed_at":"2026-01-09T19:59:37.80146821-08:00","close_reason":"Covered in architecture design doc - adversarial reviewer section"} {"id":"skills-mx3","title":"spec-review: Define consensus thresholds and decision rules","description":"'Use judgment' for mixed results leads to inconsistent decisions.\n\nDefine:\n- What constitutes consensus (2/3? unanimous?)\n- How to handle NEUTRAL votes\n- Tie-break rules\n- When human override is acceptable and how to document it","status":"closed","priority":2,"issue_type":"task","created_at":"2025-12-15T00:23:24.121175736-08:00","updated_at":"2025-12-15T13:58:04.339283238-08:00","closed_at":"2025-12-15T13:58:04.339283238-08:00"} {"id":"skills-njb","title":"worklog: clarify or remove semantic compression references","description":"SKILL.md references 'semantic compression is a planned workflow' multiple times but it's not implemented. Speculative generality - adds cognitive load for non-existent feature. Either implement or move to design notes. 
Found by smells lens review.","status":"closed","priority":4,"issue_type":"task","created_at":"2025-12-25T02:03:25.387405002-05:00","updated_at":"2025-12-27T10:11:48.169923742-05:00","closed_at":"2025-12-27T10:11:48.169923742-05:00","close_reason":"Closed"} -{"id":"skills-nto","title":"Prototype: End-to-end cross-agent workflow","description":"Build a working prototype of cross-agent quality gate.\n\n## Scenario\n1. Worker agent (any) does task\n2. Posts status to message layer\n3. Reviewer agent (any) checks work\n4. Posts approval/issues to memory layer\n5. Gate checks memory, allows/blocks completion\n\n## Test Matrix\n\n| Orchestrator | Worker | Reviewer | Enforcement |\n|--------------|--------|----------|-------------|\n| Claude | Claude | Gemini | Hook |\n| Claude | Gemini | Claude | Hook |\n| OpenCode | Claude | Gemini | Orchestrator |\n| Manual | OpenCode | Claude | Protocol |\n\n## Components to Build\n1. Message layer interface (post/read status)\n2. Memory layer interface (review state)\n3. Gate check CLI (for hooks and manual)\n4. Reviewer skill/prompt\n\n## Success Criteria\n- At least 2 agent combinations working\n- Gate actually blocks when review fails\n- State persists across agent boundaries","status":"open","priority":2,"issue_type":"task","created_at":"2026-01-09T19:32:57.579195169-08:00","created_by":"dan","updated_at":"2026-01-09T19:32:57.579195169-08:00"} +{"id":"skills-nto","title":"Prototype: End-to-end cross-agent workflow","description":"Build a working prototype of cross-agent quality gate.\n\n## Scenario\n1. Worker agent (any) does task\n2. Posts status to message layer\n3. Reviewer agent (any) checks work\n4. Posts approval/issues to memory layer\n5. 
Gate checks memory, allows/blocks completion\n\n## Test Matrix\n\n| Orchestrator | Worker | Reviewer | Enforcement |\n|--------------|--------|----------|-------------|\n| Claude | Claude | Gemini | Hook |\n| Claude | Gemini | Claude | Hook |\n| OpenCode | Claude | Gemini | Orchestrator |\n| Manual | OpenCode | Claude | Protocol |\n\n## Components to Build\n1. Message layer interface (post/read status)\n2. Memory layer interface (review state)\n3. Gate check CLI (for hooks and manual)\n4. Reviewer skill/prompt\n\n## Success Criteria\n- At least 2 agent combinations working\n- Gate actually blocks when review fails\n- State persists across agent boundaries","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-09T19:32:57.579195169-08:00","created_by":"dan","updated_at":"2026-01-09T20:39:24.013666826-08:00","closed_at":"2026-01-09T20:39:24.013666826-08:00","close_reason":"Prototype complete: review-gate CLI with hooks.json, adversarial reviewer prompt, and dual-publish structure"} {"id":"skills-oes","title":"Define skill manifest format","description":"Skills need to declare their interface so beads can validate.\n\nManifest should include:\n- Required inputs (args, env vars)\n- Optional inputs with defaults\n- Expected outputs (files, artifacts)\n- Preconditions (tools, repos, permissions)\n\nLocation: SKILL.md frontmatter or separate manifest.yaml\n\nEnables: Proto validation before spawning, better error messages.","status":"closed","priority":2,"issue_type":"task","created_at":"2025-12-23T19:49:30.673372413-05:00","updated_at":"2025-12-23T20:55:04.427620449-05:00","closed_at":"2025-12-23T20:55:04.427620449-05:00","close_reason":"ADRs revised with orch consensus feedback"} {"id":"skills-p2o","title":"Refactor update-agent-context.sh: array+loop for agents","description":"File: .specify/scripts/bash/update-agent-context.sh (772 lines)\n\nIssues:\n- 12 nearly-identical if-blocks in update_all_existing_agents() (lines 632-701)\n- Should be 
refactored into loop with array of agent configurations\n- Current pattern repeats: if [[ -f \"$CLAUDE_FILE\" ]]; then update_agent_file...\n\nFix:\n- Create AGENTS array with (file, name, format) tuples\n- Replace 12 if-blocks with single for loop\n- Estimated reduction: 60 lines\n\nSeverity: HIGH","status":"closed","priority":2,"issue_type":"task","created_at":"2025-12-24T02:50:57.385820971-05:00","updated_at":"2025-12-25T01:44:58.370191619-05:00","closed_at":"2025-12-25T01:44:58.370191619-05:00","close_reason":"update-agent-context.sh is .specify upstream code, not maintained here"} {"id":"skills-p3v","title":"Cross-language FFI wormholes via LSP","description":"Bridge FFI boundaries where standard LSPs go blind:\n- Rust extern C → clangd lookup\n- Go CGO → match C symbols\n- Python FFI → trace bindings\n\nGenerate synthetic go-to-definition maps. When hovering over C call in Rust, intercept hover request, query C LSP, inject C definition into Rust tooltip.\n\nEnables seamless polyglot navigation.","status":"closed","priority":4,"issue_type":"feature","created_at":"2025-12-24T02:29:57.597602745-05:00","updated_at":"2025-12-29T14:37:35.354771695-05:00","closed_at":"2025-12-29T14:37:35.354771695-05:00","close_reason":"Parked: waiting on gastown (Steve Yegge's orchestration layer for beads). 
Revisit when gastown lands."}
diff --git a/.claude-plugin/marketplace.json b/.claude-plugin/marketplace.json
index 6f3a8df..4c39795 100644
--- a/.claude-plugin/marketplace.json
+++ b/.claude-plugin/marketplace.json
@@ -86,6 +86,11 @@
       "name": "worklog",
       "source": "./skills/worklog",
       "description": "Create structured worklogs documenting work sessions"
+    },
+    {
+      "name": "review-gate",
+      "source": "./skills/review-gate",
+      "description": "Quality gate for cross-agent review enforcement"
     }
   ]
 }
diff --git a/.gitignore b/.gitignore
index 5ef5cc1..cf0e74a 100644
--- a/.gitignore
+++ b/.gitignore
@@ -26,3 +26,6 @@ Thumbs.db
 # Environment
 .env
 .env.local
+
+# Review gate state (transient)
+.review-state/
diff --git a/skills/review-gate/.claude-plugin/plugin.json b/skills/review-gate/.claude-plugin/plugin.json
new file mode 100644
index 0000000..fe8fe7d
--- /dev/null
+++ b/skills/review-gate/.claude-plugin/plugin.json
@@ -0,0 +1,13 @@
+{
+  "name": "review-gate",
+  "version": "0.1.0",
+  "description": "Quality gate for cross-agent review enforcement",
+  "repository": "https://git.clarun.xyz/dan/skills",
+  "author": "dan",
+  "license": "MIT",
+  "claude_code": {
+    "skill_paths": [
+      "../skills/review-gate.md"
+    ]
+  }
+}
diff --git a/skills/review-gate/SKILL.md b/skills/review-gate/SKILL.md
new file mode 100644
index 0000000..51f46d2
--- /dev/null
+++ b/skills/review-gate/SKILL.md
@@ -0,0 +1,121 @@
+# review-gate
+
+Quality gate for cross-agent review enforcement. Blocks agent completion until work is reviewed and approved.
+
+## Purpose
+
+When agents work autonomously (unattended, CI/CD, batch jobs), they may:
+- Claim "done" when work is incomplete
+- Miss requirements or make incorrect assumptions
+- Take shortcuts that don't serve user intent
+
+review-gate enforces an external quality check before the agent can exit.
+
+## Usage
+
+### Enable review requirement
+
+At the start of work that needs review:
+
+```bash
+review-gate enable [session_id]
+```
+
+This creates a "pending" review state. The agent will be blocked from exiting until approved.
+
+### Approve review
+
+After reviewing work (manually or via reviewer agent):
+
+```bash
+review-gate approve [session_id]
+```
+
+### Reject review
+
+If issues are found:
+
+```bash
+review-gate reject [session_id] "reason" ["issue1" "issue2" ...]
+```
+
+### Check status
+
+```bash
+review-gate status [session_id]
+review-gate list  # all sessions
+```
+
+## Session ID
+
+Session ID is auto-detected in this order:
+1. Provided argument
+2. `REVIEW_SESSION_ID` env var
+3. `CLAUDE_SESSION_ID` env var
+4. Generated from git branch + commit
+
+## Hook Integration
+
+For Claude Code, add to your project's `.claude/hooks.json`:
+
+```json
+{
+  "hooks": {
+    "Stop": [{
+      "hooks": [{
+        "type": "command",
+        "command": "review-gate check",
+        "timeout": 30
+      }]
+    }]
+  }
+}
+```
+
+The Stop hook runs when the agent tries to exit. If review is pending or rejected, exit is blocked.
+
+## State Storage
+
+Review state is stored in `.review-state/` as JSON files. This directory should be gitignored (transient state, not code).
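One thing to watch with one-file-per-session storage: the git-derived fallback ID embeds the branch name, and a branch such as `feature/login` contains a slash, so the ID would point at a nested path under the state directory. A minimal sketch of one way to guard against that, assuming a hypothetical helper `state_path_for` that is not part of this patch and an arbitrary choice of `_` as the replacement character:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of review-gate): map a session ID to a state
# file path, replacing characters that are unsafe in a file name.
state_path_for() {
  local session_id="$1"
  local state_dir="${REVIEW_STATE_DIR:-./.review-state}"
  local safe_id
  # Replace every character outside [A-Za-z0-9._-] with an underscore, so a
  # slash in a branch name cannot introduce a nested directory.
  safe_id=$(printf '%s' "$session_id" | tr -c 'A-Za-z0-9._-' '_')
  printf '%s/%s.json\n' "$state_dir" "$safe_id"
}

state_path_for "review-feature/login-abc1234"
```

IDs that are already file-name safe pass through unchanged, so explicitly provided session IDs would keep their existing state files under this scheme.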
+
+## Commands
+
+| Command | Description | Exit Code |
+|---------|-------------|-----------|
+| `check [id]` | Check if exit allowed | 0=allowed, 1=blocked |
+| `enable [id]` | Enable review requirement | 0 |
+| `approve [id]` | Mark as approved | 0 |
+| `reject [id] [reason] [issues...]` | Mark as rejected | 0 |
+| `status [id]` | Show review state | 0 |
+| `list` | List all sessions | 0 |
+| `clean [age]` | Remove old states (default: 7d) | 0 |
+
+## Environment Variables
+
+| Variable | Description |
+|----------|-------------|
+| `REVIEW_SESSION_ID` | Override session ID |
+| `REVIEW_STATE_DIR` | Override state directory (default: `./.review-state`) |
+| `CLAUDE_SESSION_ID` | Fallback session ID (set by Claude Code) |
+
+## Cross-Agent Pattern
+
+For agents without Stop hooks, use orchestrator pattern:
+
+```bash
+# Orchestrator script
+review-gate enable "$SESSION_ID"
+
+# Worker completes...
+
+# Reviewer approves or rejects...
+review-gate check "$SESSION_ID" || {
+  echo "Review not passed, looping back to worker"
+  # ... retry logic
+}
+```
+
+## Dependencies
+
+- `jq` - JSON parsing
+- `bash` 4.0+
diff --git a/skills/review-gate/hooks/hooks.json b/skills/review-gate/hooks/hooks.json
new file mode 100644
index 0000000..a1ab65a
--- /dev/null
+++ b/skills/review-gate/hooks/hooks.json
@@ -0,0 +1,16 @@
+{
+  "hooks": {
+    "Stop": [
+      {
+        "matcher": "",
+        "hooks": [
+          {
+            "type": "command",
+            "command": "review-gate check",
+            "timeout": 30
+          }
+        ]
+      }
+    ]
+  }
+}
diff --git a/skills/review-gate/scripts/review-gate b/skills/review-gate/scripts/review-gate
new file mode 100755
index 0000000..3a358df
--- /dev/null
+++ b/skills/review-gate/scripts/review-gate
@@ -0,0 +1,238 @@
+#!/usr/bin/env bash
+# review-gate - Quality gate CLI for cross-agent review enforcement
+#
+# Usage:
+#   review-gate check [session_id] - Check if review is approved (for hooks)
+#   review-gate enable [session_id] - Enable review requirement
+#   review-gate approve [session_id] - Mark review as approved
+#   review-gate reject [session_id] [reason] - Mark review as rejected
+#   review-gate status [session_id] - Show current review status
+#
+# Exit codes:
+#   0 - Review approved or not required (allow exit)
+#   1 - Review pending or rejected (block exit)
+
+set -euo pipefail
+
+# Session ID: use provided, or env var, or generate from git
+get_session_id() {
+    local provided="${1:-}"
+    if [[ -n "$provided" ]]; then
+        echo "$provided"
+    elif [[ -n "${REVIEW_SESSION_ID:-}" ]]; then
+        echo "$REVIEW_SESSION_ID"
+    elif [[ -n "${CLAUDE_SESSION_ID:-}" ]]; then
+        echo "$CLAUDE_SESSION_ID"
+    else
+        # Generate from git state
+        local branch=$(git branch --show-current 2>/dev/null || echo "main")
+        local short_sha=$(git rev-parse --short HEAD 2>/dev/null || echo "unknown")
+        echo "review-${branch}-${short_sha}"
+    fi
+}
+
+# State file location (simple file-based for prototype)
+STATE_DIR="${REVIEW_STATE_DIR:-${PWD}/.review-state}"
+mkdir -p "$STATE_DIR"
+
+get_state_file() {
+    local session_id="$1"
+    echo "${STATE_DIR}/${session_id}.json"
+}
+
+# Commands
+cmd_check() {
+    local session_id=$(get_session_id "${1:-}")
+    local state_file=$(get_state_file "$session_id")
+
+    if [[ ! -f "$state_file" ]]; then
+        # No review registered - allow exit
+        echo "No review required for session: $session_id"
+        exit 0
+    fi
+
+    local status=$(jq -r '.status' "$state_file" 2>/dev/null || echo "unknown")
+
+    case "$status" in
+        approved)
+            echo "✓ Review approved for session: $session_id"
+            exit 0
+            ;;
+        pending)
+            echo "⏳ BLOCKED: Review pending for session: $session_id"
+            echo ""
+            echo "To proceed, spawn a reviewer agent:"
+            echo " /task Review the work in this session. Check git diff and file changes."
+            echo " Then run: review-gate approve $session_id"
+            echo ""
+            echo "Or to skip review: review-gate approve $session_id"
+            exit 1
+            ;;
+        rejected)
+            local reason=$(jq -r '.reason // "No reason provided"' "$state_file")
+            local issues=$(jq -r '.issues[]? // empty' "$state_file" | head -5)
+            echo "❌ BLOCKED: Review rejected for session: $session_id"
+            echo "Reason: $reason"
+            if [[ -n "$issues" ]]; then
+                echo "Issues:"
+                echo "$issues" | sed 's/^/ - /'
+            fi
+            echo ""
+            echo "Fix the issues and request re-review."
+            exit 1
+            ;;
+        *)
+            echo "⚠ Unknown review status: $status"
+            exit 1
+            ;;
+    esac
+}
+
+cmd_enable() {
+    local session_id=$(get_session_id "${1:-}")
+    local state_file=$(get_state_file "$session_id")
+
+    cat > "$state_file" < "$state_file" </dev/null || true
+    local issues=("$@")
+
+    local issues_json="[]"
+    if [[ ${#issues[@]} -gt 0 ]]; then
+        issues_json=$(printf '%s\n' "${issues[@]}" | jq -R . | jq -s .)
+    fi
+
+    cat > "$state_file" </dev/null || echo "unknown")
+        printf " %-40s %s\n" "$session" "$status"
+    done
+    shopt -u nullglob
+}
+
+cmd_clean() {
+    local max_age="${1:-7d}"
+    echo "Cleaning review states older than $max_age..."
+    find "$STATE_DIR" -name "*.json" -mtime +${max_age%d} -delete 2>/dev/null || true
+    echo "Done."
+}
+
+# Main
+case "${1:-help}" in
+    check)
+        cmd_check "${2:-}"
+        ;;
+    enable)
+        cmd_enable "${2:-}"
+        ;;
+    approve)
+        cmd_approve "${2:-}"
+        ;;
+    reject)
+        cmd_reject "${2:-}" "${3:-}" "${@:4}"
+        ;;
+    status)
+        cmd_status "${2:-}"
+        ;;
+    list)
+        cmd_list
+        ;;
+    clean)
+        cmd_clean "${2:-7d}"
+        ;;
+    help|--help|-h)
+        echo "review-gate - Quality gate CLI for cross-agent review"
+        echo ""
+        echo "Commands:"
+        echo " check [session] Check if review approved (exit 0) or blocked (exit 1)"
+        echo " enable [session] Enable review requirement for session"
+        echo " approve [session] Mark review as approved"
+        echo " reject [session] [reason] [issues...] Mark as rejected"
+        echo " status [session] Show review status"
+        echo " list List all review sessions"
+        echo " clean [age] Clean old review states (default: 7d)"
+        echo ""
+        echo "Session ID is auto-detected from REVIEW_SESSION_ID, CLAUDE_SESSION_ID,"
+        echo "or generated from git branch/commit."
+        ;;
+    *)
+        echo "Unknown command: $1"
+        echo "Run 'review-gate help' for usage."
+        exit 1
+        ;;
+esac
diff --git a/skills/review-gate/skills/review-gate.md b/skills/review-gate/skills/review-gate.md
new file mode 100644
index 0000000..417be69
--- /dev/null
+++ b/skills/review-gate/skills/review-gate.md
@@ -0,0 +1,109 @@
+# review-gate
+
+Quality gate for cross-agent review enforcement. Blocks agent completion until work is reviewed and approved.
+
+## Purpose
+
+When agents work autonomously (unattended, CI/CD, batch jobs), they may:
+- Claim "done" when work is incomplete
+- Miss requirements or make incorrect assumptions
+- Take shortcuts that don't serve user intent
+
+review-gate enforces an external quality check before the agent can exit.
+
+## Usage
+
+### Enable review requirement
+
+At the start of work that needs review:
+
+```bash
+review-gate enable [session_id]
+```
+
+This creates a "pending" review state. The agent will be blocked from exiting until approved.
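The blocking rule can be restated as a small mapping from review status to the exit code that `check` reports. This is an illustrative sketch rather than the script's own code: `gate_exit_code` is a hypothetical name, and the value `none` (standing for "no state file exists for this session") is an assumption made for the example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a review status to the exit code `check` reports.
gate_exit_code() {
  case "$1" in
    none|approved) echo 0 ;;  # no review registered, or approved: allow exit
    *)             echo 1 ;;  # pending, rejected, or unrecognized: block exit
  esac
}

# Print the decision for each status the gate can encounter.
for status in none pending rejected approved; do
  printf '%-9s -> %s\n' "$status" "$(gate_exit_code "$status")"
done
```

Treating any unrecognized status as blocking keeps the gate closed by default, which matches the catch-all branch in `cmd_check`.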
+
+### Approve review
+
+After reviewing work (manually or via reviewer agent):
+
+```bash
+review-gate approve [session_id]
+```
+
+### Reject review
+
+If issues are found:
+
+```bash
+review-gate reject [session_id] "reason" ["issue1" "issue2" ...]
+```
+
+### Check status
+
+```bash
+review-gate status [session_id]
+review-gate list  # all sessions
+```
+
+## Session ID
+
+Session ID is auto-detected in this order:
+1. Provided argument
+2. `REVIEW_SESSION_ID` env var
+3. `CLAUDE_SESSION_ID` env var
+4. Generated from git branch + commit
+
+## Hook Integration
+
+For Claude Code, add to your project's `.claude/hooks.json`:
+
+```json
+{
+  "hooks": {
+    "Stop": [{
+      "hooks": [{
+        "type": "command",
+        "command": "review-gate check",
+        "timeout": 30
+      }]
+    }]
+  }
+}
+```
+
+The Stop hook runs when the agent tries to exit. If review is pending or rejected, exit is blocked.
+
+## Commands
+
+| Command | Description | Exit Code |
+|---------|-------------|-----------|
+| `check [id]` | Check if exit allowed | 0=allowed, 1=blocked |
+| `enable [id]` | Enable review requirement | 0 |
+| `approve [id]` | Mark as approved | 0 |
+| `reject [id] [reason] [issues...]` | Mark as rejected | 0 |
+| `status [id]` | Show review state | 0 |
+| `list` | List all sessions | 0 |
+| `clean [age]` | Remove old states (default: 7d) | 0 |
+
+## Cross-Agent Pattern
+
+For agents without Stop hooks, use orchestrator pattern:
+
+```bash
+# Orchestrator script
+review-gate enable "$SESSION_ID"
+
+# Worker completes...
+
+# Reviewer approves or rejects...
+review-gate check "$SESSION_ID" || {
+  echo "Review not passed, looping back to worker"
+  # ... retry logic
+}
+```
+
+## Dependencies
+
+- `jq` - JSON parsing
+- `bash` 4.0+
diff --git a/skills/review-gate/templates/reviewer-prompt.md b/skills/review-gate/templates/reviewer-prompt.md
new file mode 100644
index 0000000..156396d
--- /dev/null
+++ b/skills/review-gate/templates/reviewer-prompt.md
@@ -0,0 +1,78 @@
+# Adversarial Reviewer
+
+You are reviewing work done by another agent. Your job is to find problems.
+
+## Your Mandate
+
+You work for the USER, not the worker agent. The worker may have:
+- Misunderstood the request
+- Taken shortcuts
+- Made incorrect assumptions
+- Produced incomplete work
+
+Your job is to catch these issues before the work is accepted.
+
+## Ground Truth
+
+The USER originally requested:
+
+```
+{user_prompt}
+```
+
+Read these words carefully. The worker's interpretation may differ from what the user actually asked for.
+
+## Your Methodology
+
+1. **Read the user's EXACT words** - not the agent's summary or interpretation
+2. **Examine ALL changes** - `git diff`, file contents, test results
+3. **ASSUME errors exist** - actively look for problems
+4. **Steel-man first** - give benefit of doubt, understand what worker tried to do
+5. **Then attack systematically** - check requirements, edge cases, correctness
+
+## Review Checklist
+
+- [ ] Does the work address what the user ACTUALLY asked for?
+- [ ] Are there any missing requirements?
+- [ ] Does the code compile/run without errors?
+- [ ] Are there obvious bugs or logic errors?
+- [ ] Are there security concerns?
+- [ ] Is anything incomplete or TODO-marked?
+
+## Your Tools (READ-ONLY)
+
+You have access to:
+- `Read`, `Grep`, `Glob` - examine files
+- `Bash` - run `git diff`, `git log`, test commands (read-only operations)
+- `orch` - get second opinions from other models (for non-trivial concerns)
+
+You should NOT:
+- Modify any files
+- Make commits
+- "Fix" issues yourself
+
+## Your Decision
+
+After thorough review, run one of:
+
+**If APPROVED** (work meets user's actual request):
+```bash
+review-gate approve {session_id}
+```
+
+**If REJECTED** (problems found):
+```bash
+review-gate reject {session_id} "Brief summary of issues" "issue 1" "issue 2" ...
+```
+
+Then report your findings to the user.
+
+## Second Opinions
+
+For non-trivial concerns, use orch to get consensus:
+
+```bash
+orch consensus "Is this implementation correct? [describe concern]" gemini deepseek claude
+```
+
+If you're uncertain, err on the side of rejection. Better to have the worker fix potential issues than to let bugs through.