Comprehensive analysis of emes idle/alice plugin:

- Hook chain (6 hooks, Stop is key blocker)
- State management via jwz (topic-based messaging)
- alice agent (read-only Opus reviewer)
- Circuit breakers against infinite loops

Conclusion: alice pattern is overkill for code-review (we ARE the reviewer). More useful: "review reminder" hook that checks if code-review was run before exit on significant changes.

Closes: skills-9jk
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

# idle/alice Quality Gate Analysis

> **Date:** 2026-01-09
> **Status:** Research complete
> **Related:** [skills-9jk](../../.beads/), [ADR-005](../adr/005-dual-publish-plugin-architecture.md)

## Overview

**alice** (package name: idle) is a Claude Code plugin that mechanically enforces code quality by blocking agent exit until an independent reviewer (the alice agent) approves the work.

- **Repo:** https://github.com/evil-mind-evil-sword/idle
- **Language:** Zig
- **Author:** femtomc
- **License:** AGPL-3.0

## How It Works

### Activation

Review is opt-in per prompt via the `#alice` prefix:
```
#alice implement user authentication with JWT
```

The `UserPromptSubmit` hook detects this prefix and sets review state via jwz.

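The prefix check itself is small. A minimal Python sketch, assuming the hook receives the Claude Code `UserPromptSubmit` payload as JSON with a `prompt` field; `wants_review` is a hypothetical helper, and the real alice CLI (written in Zig) may match the prefix differently, for example in how it treats leading whitespace:

```python
import json

ALICE_PREFIX = "#alice"  # prefix from the Activation section


def wants_review(hook_input: str) -> bool:
    """Return True when the submitted prompt opts in to review.

    Assumption: the payload is a JSON object with a "prompt" string, and
    the prefix must be the first whitespace-delimited token of the prompt.
    """
    prompt = json.loads(hook_input).get("prompt", "")
    tokens = prompt.split(maxsplit=1)
    return bool(tokens) and tokens[0] == ALICE_PREFIX
```

Matching on the first token rather than with `startswith` keeps a prompt such as `#alicex ...` from enabling review by accident.
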
### Hook Chain

alice uses 6 Claude Code hooks:

| Hook | Purpose | Timeout |
|------|---------|---------|
| `SessionStart` | Initialize session state | 5s |
| `UserPromptSubmit` | Detect `#alice` prefix, enable review | 5s |
| `Stop` | **Block exit until approved** | 30s |
| `PostToolUse` | Track tool usage | 5s |
| `SubagentStop` | Validate alice posted decision | 5s |
| `SessionEnd` | Cleanup | 5s |

### The Stop Hook (Core Mechanism)

When the agent tries to exit:

```
1. Load jwz store
2. Query "review:state:{session_id}" - is review enabled?
3. If not enabled → approve immediately
4. Query "alice:status:{session_id}" - did alice approve?
5. If decision == "COMPLETE" → reset state, allow exit
6. Otherwise → BLOCK, instruct agent to spawn alice
```

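The steps above can be sketched in Python. This is an illustration only: a plain dict stands in for the jwz store, and the `enabled` field and the returned `action`/`reason` shape are assumptions, not alice's actual schema.

```python
def stop_decision(store: dict, session_id: str) -> dict:
    """Mirror steps 2-6 above against a dict standing in for the jwz store."""
    review_key = f"review:state:{session_id}"
    status_key = f"alice:status:{session_id}"

    review = store.get(review_key)
    if not review or not review.get("enabled"):
        # Step 3: review was never requested for this session.
        return {"action": "approve", "reason": "review not enabled"}

    status = store.get(status_key)
    if status and status.get("decision") == "COMPLETE":
        # Step 5: reset state so the next prompt starts without a pending review.
        store[review_key] = {"enabled": False}
        return {"action": "approve", "reason": "alice approved"}

    # Step 6: no decision yet, or alice reported ISSUES.
    return {"action": "block", "reason": "spawn alice to review this work"}
```

Note that a missing decision and an `ISSUES` decision both block; only an explicit `COMPLETE` releases the agent.
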
### hooks.json Structure

```json
{
  "hooks": {
    "SessionStart": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "alice hook session-start",
            "timeout": 5
          }
        ]
      }
    ],
    "Stop": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "alice hook stop",
            "timeout": 30
          }
        ]
      }
    ]
  }
}
```

Each hook invokes the `alice` CLI with a subcommand. The CLI checks/updates state in jwz.

## State Management (jwz)

**jwz** is an append-only topic-based messaging system:

- Stores messages in `.jwz/messages.jsonl` (git-mergeable)
- SQLite cache for FTS5 search
- Auto-captures git context (commit, branch, dirty status)
- Topics like `review:state:{session}`, `alice:status:{session}`

Key jwz commands:
```bash
jwz post <topic> -m <message>    # Post message
jwz read <topic>                 # Read topic
jwz search <query>               # Full-text search
```

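To make the append-only, topic-based model concrete, here is a rough Python sketch of the idea. It is not jwz's implementation: the record shape (`topic`, `message`) is an assumption, and the SQLite cache and git context capture are left out.

```python
import json
from pathlib import Path


def post(log: Path, topic: str, message: str) -> None:
    """Append one message; existing lines are never rewritten."""
    with log.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps({"topic": topic, "message": message}) + "\n")


def read(log: Path, topic: str) -> list[str]:
    """Return every message posted to a topic, oldest first."""
    if not log.exists():
        return []
    with log.open(encoding="utf-8") as fh:
        records = [json.loads(line) for line in fh if line.strip()]
    return [r["message"] for r in records if r["topic"] == topic]
```

Because writes only ever append whole lines, two branches that each add messages tend to merge cleanly in git, which is presumably what "git-mergeable" refers to.
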
## The alice Agent

alice is a **read-only Opus-based reviewer**:

- **Model:** Claude Opus
- **Access:** Read-only (no file modifications)
- **Tools:** Read, Grep, Glob, Bash (restricted to `tissue` and `jwz`)
- **Philosophy:** "Work for the user, not the agent"

### Review Methodology

1. Compare deliverables against **user's actual words** (not agent claims)
2. Assume errors exist in complex work
3. Steel-man the strongest case, then attack it
4. Seek second opinions from Codex/Gemini
5. Post decision: `COMPLETE` or `ISSUES`

### Decision Output

alice posts to `alice:status:{session_id}`:
```json
{
  "decision": "COMPLETE" | "ISSUES",
  "summary": "...",
  "reasoning": "...",
  "second_opinions": [...],
  "message_to_agent": "..."
}
```

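A consumer of this payload (for example, the `SubagentStop` check that alice actually posted a decision) needs to reject anything without a recognized verdict. A hedged Python sketch, where `parse_decision` is a hypothetical helper and only the `decision` field is validated:

```python
import json

VALID_DECISIONS = {"COMPLETE", "ISSUES"}  # the two verdicts listed above


def parse_decision(raw: str) -> dict:
    """Parse a posted decision and reject anything without a known verdict."""
    payload = json.loads(raw)
    if not isinstance(payload, dict):
        raise ValueError("decision payload must be a JSON object")
    decision = payload.get("decision")
    if not isinstance(decision, str) or decision not in VALID_DECISIONS:
        raise ValueError(f"unknown decision: {decision!r}")
    return payload
```

Malformed JSON also surfaces as a `ValueError`, since `json.JSONDecodeError` is a subclass of it, so callers can treat every bad payload the same way.
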
## Circuit Breakers

Three safeguards against infinite loops:

1. **Stale Review Detection:** Same review blocks ≥3 times → fail open
2. **No-ID Blocks:** alice never posts decision → 3 blocks → fail open
3. **State Persistence:** Counters stored in jwz for recovery

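The counting logic behind the first two safeguards can be sketched as follows. This is one reading of the list above, not alice's code: the third block for a given review fails open, `counters` stands in for state persisted in jwz, and `review_id` is a hypothetical key.

```python
MAX_BLOCKS = 3  # threshold named in the list above


def record_block(counters: dict, review_id: str) -> str:
    """Count one more block for a review and decide whether to keep blocking.

    Assumption: when alice never posted a decision, the caller passes a
    fixed placeholder id so those blocks share one counter.
    """
    counters[review_id] = counters.get(review_id, 0) + 1
    if counters[review_id] >= MAX_BLOCKS:
        # Fail open: let the agent exit rather than loop forever.
        return "fail-open"
    return "block"
```

Keeping the counters outside the hook process matters because each hook invocation is a fresh command; without persistence the count would reset on every `Stop`.
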
## Key Design Principles

From emes architecture:

| Principle | Implementation |
|-----------|----------------|
| **Pull over push** | Agent retrieves context on-demand, not upfront |
| **Safety over policy** | Critical guardrails via hooks, not prompts |
| **Pointer over payload** | Messages contain references (IDs), not full content |

## Dependencies

**Required:**
- `jwz` - State management
- `tissue` - Issue tracking
- `jq` - JSON parsing in hooks

**Optional (for consensus):**
- `codex` - OpenAI CLI
- `gemini` - Google CLI

## Applicability to Our Skills

### code-review Skill

**Current state:** Interactive - runs lenses, presents findings, asks before filing issues.

**Potential enhancement:** Add quality gate that blocks exit until findings are addressed.

**Challenges:**
1. We don't have jwz - would need state management
2. Our review IS the quality gate (not a separate reviewer)
3. Different use case: code-review reviews code, alice reviews agent work

**Options:**

| Approach | Pros | Cons |
|----------|------|------|
| **A: Adopt jwz** | Full emes compatibility | Another dependency, Zig tool |
| **B: Use beads** | Already have it | Not designed for transient session state |
| **C: Simple file state** | Minimal, portable | DIY circuit breakers |
| **D: Hook-only (stateless)** | Simplest | No persistence across tool calls |

### Recommendation

For code-review, the alice pattern is overkill. Our skill already does the review - we don't need a second reviewer to review the review.

**More useful pattern:** Use `Stop` hook to remind agent to run code-review before exiting if significant code changes were made. This is a "did you remember to review?" gate, not a "did review pass?" gate.

Example:
```json
{
  "hooks": {
    "Stop": [{
      "hooks": [{
        "type": "command",
        "command": "check-review-reminder.sh",
        "timeout": 5
      }]
    }]
  }
}
```

The script checks:
1. Whether significant code changes exist (git diff)
2. Whether code-review was invoked this session
3. If changes but no review → exit with code 2 (block, with the reminder on stderr; in Claude Code other non-zero exit codes are non-blocking errors)

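A possible shape for that check, sketched in Python rather than shell so the decision logic is easy to test. Everything here is an assumption for illustration: the 20-line `threshold` for "significant", the use of `git diff --numstat` as the change measure, and the helper names. How `review_ran` gets recorded (for example, option C's simple file state) is left open.

```python
import sys


def count_changed_lines(numstat: str) -> int:
    """Sum added and deleted lines from `git diff --numstat` output."""
    total = 0
    for line in numstat.splitlines():
        parts = line.split("\t")
        if len(parts) < 3:
            continue
        added, deleted = parts[0], parts[1]
        # Binary files report "-" instead of counts; skip them.
        if added.isdigit() and deleted.isdigit():
            total += int(added) + int(deleted)
    return total


def reminder_exit_code(changed_lines: int, review_ran: bool, threshold: int = 20) -> int:
    """Return 2 (block) for unreviewed significant changes, else 0."""
    if changed_lines >= threshold and not review_ran:
        print("Significant changes without code-review; run it before exiting.",
              file=sys.stderr)
        return 2
    return 0
```

A real hook would pass the result to `sys.exit`. It would probably also want to check the `stop_hook_active` field of the `Stop` hook input and allow exit when it is set, so the reminder cannot block repeatedly.
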
## Open Questions

1. Should we adopt jwz for cross-skill state coordination?
2. Is the "review reminder" pattern valuable enough to implement?
3. Could ops-review benefit from similar gating?
4. How do hooks interact with our dual-publish strategy?

## References

- [alice/idle repo](https://github.com/evil-mind-evil-sword/idle)
- [jwz repo](https://github.com/evil-mind-evil-sword/jwz)
- [Claude Code Hooks Docs](https://code.claude.com/docs/en/hooks)