From 4773abe56f213593f6cc398425fc4e566739e28b Mon Sep 17 00:00:00 2001
From: dan
Date: Fri, 9 Jan 2026 16:45:49 -0800
Subject: [PATCH] docs: correct alice framing - adversarial agent review for automation

alice is for reviewing AGENT work in unattended/autonomous contexts,
not code review. Key use cases:

- Autonomous runs on ops-jrz1
- CI/CD pipelines with agents
- High-stakes changes without human oversight

Added hybrid approach recommendation: use alice concepts (Stop hook,
adversarial methodology) with our infrastructure (beads, orch).

Co-Authored-By: Claude Opus 4.5
---
 docs/research/idle-alice-quality-gate.md | 127 ++++++++++++++++++-----
 1 file changed, 100 insertions(+), 27 deletions(-)

diff --git a/docs/research/idle-alice-quality-gate.md b/docs/research/idle-alice-quality-gate.md
index a4d33fb..e40dc06 100644
--- a/docs/research/idle-alice-quality-gate.md
+++ b/docs/research/idle-alice-quality-gate.md
@@ -158,60 +158,133 @@ From emes architecture:
 - `codex` - OpenAI CLI
 - `gemini` - Google CLI
 
-## Applicability to Our Skills
+## Use Case: Adversarial Agent Review
 
-### code-review Skill
+alice is NOT about code review - it's about **adversarial review of agent work** in automation contexts.
 
-**Current state:** Interactive - runs lenses, presents findings, asks before filing issues.
+### The Problem alice Solves
 
-**Potential enhancement:** Add quality gate that blocks exit until findings are addressed.
+When agents work autonomously (unattended, CI/CD, batch jobs):
+
+- Agent may claim "done" when work is incomplete
+- Agent may miss requirements or make incorrect assumptions
+- Agent may take shortcuts that don't serve user intent
+- No human in the loop to catch mistakes
 
-**Challenges:**
-1. We don't have jwz - would need state management
-2. Our review IS the quality gate (not a separate reviewer)
-3. Different use case: code-review reviews code, alice reviews agent work
+### alice's Solution
 
-**Options:**
+An independent adversarial reviewer (Opus) that:
+
+1. **Works for the user, not the agent** - grounds truth in user's actual words
+2. **Assumes errors exist** - actively looks for problems
+3. **Steel-mans then attacks** - gives benefit of doubt, then systematically challenges
+4. **Seeks external opinions** - uses Codex/Gemini for second opinions
+5. **Mechanically blocks exit** - can't be talked out of it via prompt
+
+### When to Use alice Pattern
+
+| Scenario | Why alice helps |
+|----------|-----------------|
+| **Autonomous/unattended runs** | No human watching - need automated QA |
+| **CI/CD with agents** | Quality gate before merge |
+| **Complex multi-step features** | Verify each deliverable meets requirements |
+| **Refactoring** | Ensure nothing broke |
+| **ops-jrz1 deployment** | Remote server, less oversight |
+
+### When NOT to Use
+
+- Interactive sessions with human oversight
+- Simple, low-risk changes
+- Exploratory/research work (no deliverable to review)
+
+## Applicability to Our Workflow
+
+### Potential Use Cases
+
+1. **Autonomous runs on ops-jrz1**
+   - Agent implements feature on VPS
+   - alice reviews before agent exits
+   - Issues filed to tissue if problems found
+
+2. **Batch processing**
+   - Agent processes multiple tasks
+   - alice spot-checks work quality
+
+3. **High-stakes changes**
+   - Security-sensitive code
+   - Infrastructure changes
+   - Production deployments
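+
+Before weighing the integration options, here is a minimal sketch of
+the mechanical gate itself: the `review-gate check` command that the
+hybrid approach below wires into the Stop hook. Everything in it is
+hypothetical - the env var names and the JSON state file are
+placeholders for beads session state, and it assumes the hook
+contract that a non-zero exit blocks the agent from stopping. The
+numbered comments map to the `review-gate` checklist below.
+
+```python
+#!/usr/bin/env python3
+"""Sketch of `review-gate check` - the Stop-hook quality gate."""
+import json
+import os
+import sys
+from pathlib import Path
+
+# Placeholders: a JSON file stands in for beads session state, which
+# a real implementation would query via bd instead.
+STATE = Path(os.environ.get("REVIEW_GATE_STATE", ".review-gate.json"))
+MAX_BLOCKS = 3  # circuit breaker threshold (alice uses 3)
+
+
+def main() -> int:
+    # Step 1: opt-in - do nothing unless review mode is enabled.
+    if os.environ.get("REVIEW_GATE") != "1":
+        return 0
+
+    state = json.loads(STATE.read_text()) if STATE.exists() else {}
+
+    # Step 2: an approval on record lets the agent exit normally.
+    if state.get("approved"):
+        return 0
+
+    # Step 4: circuit breaker - stop blocking after N attempts so a
+    # broken reviewer cannot trap the agent in an infinite loop.
+    blocks = state.get("blocks", 0)
+    if blocks >= MAX_BLOCKS:
+        print("review-gate: circuit breaker tripped, allowing exit",
+              file=sys.stderr)
+        return 0
+
+    state["blocks"] = blocks + 1
+    STATE.write_text(json.dumps(state))
+
+    # Step 3: block exit (non-zero) and tell the agent what to do.
+    print("review-gate: no approval recorded - spawn the adversarial "
+          "reviewer and record its decision before exiting.",
+          file=sys.stderr)
+    return 1
+
+
+if __name__ == "__main__":
+    sys.exit(main())
+```
+
+The reviewer (or the agent, once issues are addressed) would set
+`approved` after an APPROVED decision; under beads, that check could
+become a query against the session's review issue.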
+
+### Integration Options
 
 | Approach | Pros | Cons |
 |----------|------|------|
-| **A: Adopt jwz** | Full emes compatibility | Another dependency, Zig tool |
-| **B: Use beads** | Already have it | Not designed for transient session state |
-| **C: Simple file state** | Minimal, portable | DIY circuit breakers |
-| **D: Hook-only (stateless)** | Simplest | No persistence across tool calls |
+| **A: Adopt alice directly** | Battle-tested, full features | Requires jwz, tissue, Zig deps |
+| **B: Build our own** | Tailored to our needs, use beads | Dev effort, reinventing the wheel |
+| **C: Hybrid** | Use alice concepts, our infra | Best of both, some integration work |
+| **D: orch-as-reviewer** | Already have orch for multi-model | Different purpose, not adversarial |
 
-### Recommendation
+### Hybrid Approach (Recommended)
 
-For code-review, the alice pattern is overkill. Our skill already does the review - we don't need a second reviewer to review the review.
+Use alice's **concepts** with our **infrastructure**:
 
-**More useful pattern:** Use `Stop` hook to remind agent to run code-review before exiting if significant code changes were made. This is a "did you remember to review?" gate, not a "did review pass?" gate.
+1. **Stop hook** - Block exit until review passes
+2. **beads for state** - Track review status per session
+3. **orch for second opinions** - We already have multi-model consensus
+4. **Adversarial prompt** - Adapt alice's methodology
 
-Example:
+Example hooks.json:
 
 ```json
 {
   "hooks": {
     "Stop": [{
       "hooks": [{
         "type": "command",
-        "command": "check-review-reminder.sh",
-        "timeout": 5
+        "command": "review-gate check",
+        "timeout": 30
       }]
     }]
   }
 }
 ```
 
-The script checks if:
-1. Significant code changes exist (git diff)
-2. code-review was invoked this session
-3. If changes but no review → return non-zero (block with reminder)
+`review-gate` (sketched above) would:
+
+1. Check if review mode is enabled (beads flag or env var)
+2. If enabled, check for approval in beads
+3. If unapproved, block and instruct the agent to spawn the reviewer
+4. Trip the circuit breaker after N failures
+
+### Reviewer Agent Design
+
+Inspired by alice but using our tools:
+
+```markdown
+# Adversarial Reviewer
+
+You review agent work for the USER, not the agent.
+
+## Methodology
+1. Read the user's original request (not the agent's summary)
+2. Examine all changes made (git diff, file reads)
+3. Assume errors exist - find them
+4. Use orch for second opinions on non-trivial work
+5. Post decision to beads
+
+## Decision
+- APPROVED: Work meets the user's actual request
+- ISSUES: Problems found (file beads issues)
+
+## Tools Available
+- Read, Grep, Glob (read-only)
+- orch (second opinions)
+- bd (issue tracking)
+```
 
 ## Open Questions
 
-1. Should we adopt jwz for cross-skill state coordination?
-2. Is the "review reminder" pattern valuable enough to implement?
-3. Could ops-review benefit from similar gating?
-4. How do hooks interact with our dual-publish strategy?
+1. Do we need jwz, or can beads handle session state?
+2. Should the reviewer be a separate skill or a plugin?
+3. How do we handle the "review the reviewer" problem?
+4. What's the circuit breaker threshold (3, like alice)?
+5. Should this be opt-in (`#review`) or always-on for certain contexts?
 
 ## References