skills/docs/research/idle-alice-quality-gate.md
dan 239c758dc7 docs: research idle/alice quality gate mechanism
Comprehensive analysis of emes idle/alice plugin:
- Hook chain (6 hooks, Stop is key blocker)
- State management via jwz (topic-based messaging)
- alice agent (read-only Opus reviewer)
- Circuit breakers against infinite loops

Conclusion: alice pattern is overkill for code-review (we ARE the
reviewer). More useful: "review reminder" hook that checks if
code-review was run before exit on significant changes.

Closes: skills-9jk

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-09 16:43:46 -08:00

idle/alice Quality Gate Analysis

Date: 2026-01-09
Status: Research complete
Related: skills-9jk, ADR-005

Overview

alice (package name: idle) is a Claude Code plugin that mechanically enforces code quality by blocking agent exit until an independent reviewer (the alice agent) approves the work.

How It Works

Activation

Opt-in per-prompt via #alice prefix:

#alice implement user authentication with JWT

The UserPromptSubmit hook detects this prefix and sets review state via jwz.
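
The prefix check can be sketched as follows. This is a minimal illustration, not the plugin's actual code: the function name `parse_prompt` is hypothetical, and whether alice strips the prefix before passing the prompt on is an assumption.

```python
ALICE_PREFIX = "#alice"


def parse_prompt(prompt: str) -> tuple[bool, str]:
    """Return (review_enabled, prompt_without_prefix).

    Assumption: the prefix must be followed by whitespace (or end the
    prompt), so that "#alicefoo" does not accidentally enable review.
    """
    stripped = prompt.lstrip()
    if stripped == ALICE_PREFIX:
        return True, ""
    if stripped.startswith(ALICE_PREFIX) and stripped[len(ALICE_PREFIX)].isspace():
        return True, stripped[len(ALICE_PREFIX):].strip()
    # No prefix: leave the prompt untouched and keep review disabled.
    return False, prompt
```

When the first element is true, a hook built this way would post the enabled flag to the review:state:{session_id} topic.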

Hook Chain

alice uses 6 Claude Code hooks:

| Hook | Purpose | Timeout |
|------|---------|---------|
| SessionStart | Initialize session state | 5s |
| UserPromptSubmit | Detect #alice prefix, enable review | 5s |
| Stop | Block exit until approved | 30s |
| PostToolUse | Track tool usage | 5s |
| SubagentStop | Validate alice posted decision | 5s |
| SessionEnd | Cleanup | 5s |

The Stop Hook (Core Mechanism)

When the agent tries to exit:

1. Load jwz store
2. Query "review:state:{session_id}" - is review enabled?
3. If not enabled → approve immediately
4. Query "alice:status:{session_id}" - did alice approve?
5. If decision == "COMPLETE" → reset state, allow exit
6. Otherwise → BLOCK, instruct agent to spawn alice
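
The six steps above can be sketched in Python. This is an illustration under stated assumptions rather than alice's implementation: a plain dict stands in for the jwz store (topic → latest message), and the `enabled` field and the function name `stop_decision` are hypothetical.

```python
from typing import Optional


def stop_decision(store: dict[str, dict], session_id: str) -> tuple[str, Optional[str]]:
    """Return ("approve", None) or ("block", reason) for a Stop event."""
    review_topic = f"review:state:{session_id}"
    status_topic = f"alice:status:{session_id}"

    # Steps 2-3: if review was never enabled, let the agent exit.
    review = store.get(review_topic)
    if not review or not review.get("enabled"):
        return "approve", None

    # Steps 4-5: alice approved, so reset state and allow exit.
    status = store.get(status_topic)
    if status and status.get("decision") == "COMPLETE":
        store[review_topic] = {"enabled": False}
        return "approve", None

    # Step 6: missing decision or ISSUES both block.
    return "block", "Review required: spawn the alice agent and wait for its decision."
```

Note that a missing decision and an ISSUES decision take the same path: anything short of COMPLETE keeps the gate closed.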

hooks.json Structure

{
  "hooks": {
    "SessionStart": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "alice hook session-start",
            "timeout": 5
          }
        ]
      }
    ],
    "Stop": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "alice hook stop",
            "timeout": 30
          }
        ]
      }
    ]
  }
}

Each hook invokes the alice CLI with a subcommand; the CLI then reads or updates state in jwz. Only the SessionStart and Stop entries are shown here.

State Management (jwz)

jwz is an append-only topic-based messaging system:

  • Stores messages in .jwz/messages.jsonl (git-mergeable)
  • SQLite cache for FTS5 search
  • Auto-captures git context (commit, branch, dirty status)
  • Topics like review:state:{session}, alice:status:{session}

Key jwz commands:

jwz post <topic> -m <message>     # Post message
jwz read <topic>                   # Read topic
jwz search <query>                 # Full-text search
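
To make the append-only, topic-based model concrete, here is a small Python sketch of a JSONL message log. It is not jwz: the class `TopicLog`, its methods, and the record layout (`topic`, `message`, `ts`) are all hypothetical, and it omits the SQLite cache and git context capture.

```python
import json
import time
from pathlib import Path
from typing import Optional


class TopicLog:
    """Append-only message log keyed by topic, one JSON object per line."""

    def __init__(self, path: Path) -> None:
        self.path = path

    def post(self, topic: str, message: str) -> None:
        # Records are only ever appended, never rewritten in place.
        record = {"topic": topic, "message": message, "ts": time.time()}
        with self.path.open("a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")

    def read(self, topic: str) -> list[str]:
        """Return every message posted to a topic, oldest first."""
        if not self.path.exists():
            return []
        messages = []
        with self.path.open(encoding="utf-8") as f:
            for line in f:
                if line.strip():
                    record = json.loads(line)
                    if record["topic"] == topic:
                        messages.append(record["message"])
        return messages

    def latest(self, topic: str) -> Optional[str]:
        """Current state of a topic is simply its most recent message."""
        messages = self.read(topic)
        return messages[-1] if messages else None
```

Because state changes are new lines rather than edits, two branches that each append messages tend to merge cleanly, which is presumably what "git-mergeable" refers to.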

The alice Agent

alice is a read-only Opus-based reviewer:

  • Model: Claude Opus
  • Access: Read-only (no file modifications)
  • Tools: Read, Grep, Glob, Bash (restricted to tissue and jwz)
  • Philosophy: "Work for the user, not the agent"

Review Methodology

  1. Compare deliverables against user's actual words (not agent claims)
  2. Assume errors exist in complex work
  3. Steel-man the strongest case, then attack it
  4. Seek second opinions from Codex/Gemini
  5. Post decision: COMPLETE or ISSUES

Decision Output

alice posts to alice:status:{session_id}:

{
  "decision": "COMPLETE" | "ISSUES",
  "summary": "...",
  "reasoning": "...",
  "second_opinions": [...],
  "message_to_agent": "..."
}
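
A consumer of this message would want to validate it before trusting the decision. The Python sketch below shows one way to do that; the names `AliceDecision` and `parse_decision` are hypothetical, and treating only `decision` as required is an assumption, since the source does not say which fields are mandatory.

```python
import json
from dataclasses import dataclass, field, fields

VALID_DECISIONS = {"COMPLETE", "ISSUES"}


@dataclass
class AliceDecision:
    decision: str
    summary: str = ""
    reasoning: str = ""
    second_opinions: list = field(default_factory=list)
    message_to_agent: str = ""


def parse_decision(raw: str) -> AliceDecision:
    """Parse a decision message, rejecting anything but COMPLETE or ISSUES."""
    data = json.loads(raw)
    if not isinstance(data, dict) or data.get("decision") not in VALID_DECISIONS:
        raise ValueError("decision must be COMPLETE or ISSUES")
    # Keep only the documented fields; ignore unknown keys.
    known = {f.name: data[f.name] for f in fields(AliceDecision) if f.name in data}
    return AliceDecision(**known)
```

Rejecting unknown decision values matters here because the Stop hook treats anything other than COMPLETE as a block.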

Circuit Breakers

Three safeguards against infinite loops:

  1. Stale Review Detection: Same review blocks ≥3 times → fail open
  2. No-ID Blocks: alice never posts decision → 3 blocks → fail open
  3. State Persistence: Counters stored in jwz for recovery
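
The fail-open counters can be sketched as below. This is an illustration with assumptions called out: the function name `gate` and the `"no-id"` key are hypothetical, the counters live in a dict rather than in jwz, and the reading of the threshold (block three times, then fail open on the next attempt) is one interpretation of the list above.

```python
from typing import Optional

MAX_BLOCKS = 3  # threshold taken from the safeguards listed above


def gate(counters: dict[str, int], review_id: Optional[str]) -> str:
    """Return "block" or "fail_open" for one blocked exit attempt.

    A missing review_id (alice never posted a decision) is counted
    under a shared "no-id" key.
    """
    key = review_id if review_id else "no-id"
    if counters.get(key, 0) >= MAX_BLOCKS:
        # Reset so a later review starts with a fresh counter.
        counters.pop(key, None)
        return "fail_open"
    counters[key] = counters.get(key, 0) + 1
    return "block"
```

Failing open trades strictness for liveness: a stuck reviewer can delay the agent a bounded number of times but cannot trap the session.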

Key Design Principles

From emes architecture:

| Principle | Implementation |
|-----------|----------------|
| Pull over push | Agent retrieves context on-demand, not upfront |
| Safety over policy | Critical guardrails via hooks, not prompts |
| Pointer over payload | Messages contain references (IDs), not full content |

Dependencies

Required:

  • jwz - State management
  • tissue - Issue tracking
  • jq - JSON parsing in hooks

Optional (for consensus):

  • codex - OpenAI CLI
  • gemini - Google CLI

Applicability to Our Skills

code-review Skill

Current state: Interactive - runs lenses, presents findings, asks before filing issues.

Potential enhancement: Add a quality gate that blocks exit until findings are addressed.

Challenges:

  1. We don't have jwz - would need state management
  2. Our review IS the quality gate (not a separate reviewer)
  3. Different use case: code-review reviews code, alice reviews agent work

Options:

| Approach | Pros | Cons |
|----------|------|------|
| A: Adopt jwz | Full emes compatibility | Another dependency, Zig tool |
| B: Use beads | Already have it | Not designed for transient session state |
| C: Simple file state | Minimal, portable | DIY circuit breakers |
| D: Hook-only (stateless) | Simplest | No persistence across tool calls |

Recommendation

For code-review, the alice pattern is overkill. Our skill already does the review - we don't need a second reviewer to review the review.

More useful pattern: Use a Stop hook to remind the agent to run code-review before exiting if significant code changes were made. This is a "did you remember to review?" gate, not a "did review pass?" gate.

Example:

{
  "hooks": {
    "Stop": [{
      "hooks": [{
        "type": "command",
        "command": "check-review-reminder.sh",
        "timeout": 5
      }]
    }]
  }
}

The script checks:

  1. Whether significant code changes exist (git diff)
  2. Whether code-review was invoked this session
  3. If there are changes but no review → exit with code 2 (block with reminder; Claude Code treats other non-zero exit codes as non-blocking errors)
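
The reminder check could look roughly like this. It is sketched in Python for readability, and the logic ports directly to the shell script named above. Several pieces are assumptions: the `THRESHOLD` value for "significant", the `MARKER` file that the code-review skill would have to create, and the function names. The `stop_hook_active` field and the exit code 2 convention follow the Claude Code hooks documentation as understood at the time of writing and should be verified against the current docs.

```python
import json
import os
import subprocess
import sys

THRESHOLD = 20               # hypothetical cutoff for "significant" changed lines
MARKER = ".code-review-ran"  # hypothetical file the code-review skill would touch


def changed_lines(numstat: str) -> int:
    """Sum added + deleted lines from `git diff --numstat` output."""
    total = 0
    for line in numstat.splitlines():
        parts = line.split("\t")
        if len(parts) < 3:
            continue
        added, deleted = parts[0], parts[1]
        # Binary files are reported as "-<TAB>-<TAB>path"; skip them.
        if added.isdigit() and deleted.isdigit():
            total += int(added) + int(deleted)
    return total


def should_block(lines_changed: int, review_ran: bool, already_blocked: bool) -> bool:
    """Block only once, and only for significant unreviewed changes."""
    if already_blocked:  # a previous Stop hook already forced a continuation
        return False
    return lines_changed >= THRESHOLD and not review_ran


def main() -> int:
    event = json.load(sys.stdin)  # hook input arrives as JSON on stdin
    diff = subprocess.run(
        ["git", "diff", "--numstat", "HEAD"],
        capture_output=True, text=True, check=False,
    ).stdout
    if should_block(changed_lines(diff), os.path.exists(MARKER),
                    bool(event.get("stop_hook_active"))):
        print("Significant changes detected; run code-review before exiting.",
              file=sys.stderr)
        return 2  # blocking exit code; stderr is shown to the agent
    return 0
```

A script entry point would call `sys.exit(main())`. Declining to block when `stop_hook_active` is set acts as a one-shot circuit breaker, so this variant needs no persistent counter (option D in the table above).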

Open Questions

  1. Should we adopt jwz for cross-skill state coordination?
  2. Is the "review reminder" pattern valuable enough to implement?
  3. Could ops-review benefit from similar gating?
  4. How do hooks interact with our dual-publish strategy?

References