diff --git a/.beads/issues.jsonl b/.beads/issues.jsonl index 7856894..b8ccf02 100644 --- a/.beads/issues.jsonl +++ b/.beads/issues.jsonl @@ -65,7 +65,7 @@ {"id":"skills-5vg","title":"spec-review: Add context/assumptions step to prompts","description":"Reviews can become speculative without establishing context first.\n\nAdd to prompts:\n- List assumptions being made\n- Distinguish: missing from doc vs implied vs out of scope\n- Ask clarifying questions if critical context missing","status":"closed","priority":3,"issue_type":"task","created_at":"2025-12-15T00:23:25.681448596-08:00","updated_at":"2025-12-15T14:06:15.415750911-08:00","closed_at":"2025-12-15T14:06:15.415750911-08:00"} {"id":"skills-5x2o","title":"Extract msToUnix helper for repeated div 1000","description":"[SMELL] LOW state.nim - 'div 1000' for ms to seconds conversion repeated 8 times. Add helper proc msToUnix(ms: int64): int64 in types.nim.","status":"closed","priority":3,"issue_type":"task","created_at":"2026-01-10T19:49:52.505245039-08:00","created_by":"dan","updated_at":"2026-01-10T20:32:28.362386563-08:00","closed_at":"2026-01-10T20:32:28.362386563-08:00","close_reason":"Created utils.nim with common helpers"} {"id":"skills-5xkg","title":"Document Intent/Approach/Work workflow","description":"Write user-facing documentation for structured beads.\n\n## Deliverable\n- How-to guide\n- Template reference\n- Examples at different scales\n\n## Sections\n- Why structure? 
(vs just doing the thing)\n- The three phases: Intent / Approach / Work\n- Full template vs minimal template\n- When to use each\n- Examples: small fix, medium feature, large epic\n- Integration with bd commands","status":"closed","priority":3,"issue_type":"task","owner":"dan@delpad","created_at":"2026-01-18T08:13:59.050133558-08:00","created_by":"dan","updated_at":"2026-01-18T20:20:47.00512145-08:00","closed_at":"2026-01-18T20:20:47.00512145-08:00","close_reason":"Docs written to docs/intent-approach-work.md","dependencies":[{"issue_id":"skills-5xkg","depends_on_id":"skills-oh8m","type":"blocks","created_at":"2026-01-18T08:14:32.866401069-08:00","created_by":"dan"},{"issue_id":"skills-5xkg","depends_on_id":"skills-ankb","type":"blocks","created_at":"2026-01-18T08:14:45.264952521-08:00","created_by":"dan"},{"issue_id":"skills-5xkg","depends_on_id":"skills-sx8u","type":"blocks","created_at":"2026-01-18T08:14:45.375561869-08:00","created_by":"dan"},{"issue_id":"skills-5xkg","depends_on_id":"skills-4ecn","type":"blocks","created_at":"2026-01-18T08:26:55.34104244-08:00","created_by":"dan"}]} -{"id":"skills-5ycq","title":"Implement /synod multi-model consensus extension for pi","description":"## Overview\n\nImplement a `/synod` command for pi-coding-agent that provides multi-model consensus with conversation context inheritance and interactive UI.\n\n## Background\n\nResearch conducted 2026-01-22 analyzing:\n- Orch CLI capabilities (423 models, voting, synthesis, serial strategies)\n- Pi Oracle extension from shitty-extensions (conversation context, add-to-context flow)\n- Gap between the two approaches\n\n## Core Concept\n\n`/synod` = Assembly of AI models convened to deliberate. 
One command covering both:\n- **Parallel voting** (conclave-style): Independent opinions, tally votes\n- **Serial discussion** (council-style): Models build on each other's responses\n\n## Usage Design\n\n```bash\n/synod \"Should we use Rust?\" flash gemini claude # Parallel vote (default)\n/synod \"Should we use Rust?\" --debate # Serial discussion\n/synod \"Should we use Rust?\" --brainstorm # Generative mode\n/synod \"Should we use Rust?\" --vote # Explicit parallel vote\n```\n\n## Key Features Required\n\n### Must Have\n- [ ] Multi-select model picker with quick keys\n- [ ] Conversation context inheritance (serialize pi conversation to models)\n- [ ] Parallel query execution with progress indicators\n- [ ] Vote parsing (SUPPORT/OPPOSE/NEUTRAL) from responses\n- [ ] Results display with scrolling\n- [ ] Add-to-context workflow (YES/SUMMARY/NO)\n- [ ] Cost estimation before query\n\n### Nice to Have\n- [ ] Side-by-side comparison view\n- [ ] Diff highlighting for disagreements\n- [ ] Response caching (5min TTL)\n- [ ] Model recommendations based on query type\n- [ ] Synthesis mode (aggregate responses)\n- [ ] Serial strategies (refine, debate)\n\n## Architecture Decision: Hybrid Approach\n\n1. **Keep orch CLI** for advanced features (423 models, synthesis, sessions, serial strategies)\n2. **Add /synod extension** for interactive queries with conversation context\n3. 
**Register orch_consensus tool** for agent programmatic access\n\n### Why Hybrid?\n- Orch: No conversation context sharing, no interactive UI\n- Oracle: Only one model at a time, no voting\n- Synod: Best of both - context inheritance + multi-model + voting + UI\n\n## Technical Implementation\n\n### Conversation Context Serialization\n```typescript\nimport { serializeConversation, convertToLlm } from \"@mariozechner/pi-coding-agent\";\n\nconst history = ctx.sessionManager.getBranch();\nconst serialized = serializeConversation(history);\nconst llmMessages = serialized.map(convertToLlm);\n```\n\n### Model Registry\nStart by querying orch: `orch models` and parse output.\nLater: import orch's config directly.\n\n### Vote Parsing\nPrompt engineering approach:\n```\nRespond with your verdict first: SUPPORT, OPPOSE, or NEUTRAL\nThen explain your reasoning.\n```\nParse with regex, fallback to secondary classification query.\n\n### Add-to-Context Options\n1. YES - Add all model responses verbatim\n2. SUMMARY - Synthesize and add summary only\n3. NO - Don't add to conversation\n\n## UI Patterns (from research)\n\n### Model Picker\n- Multi-select with checkboxes\n- Quick keys 1-9 for fast selection\n- Show cost per model\n- Filter by authenticated models only\n- Exclude current model\n\n### Results Display\n- Progressive disclosure: Gauge → List → Side-by-side\n- Vote counts: SUPPORT: 2, OPPOSE: 1, NEUTRAL: 0\n- Scrollable reasoning for each model\n- Box drawing character borders\n\n### Key Detection\n```typescript\nimport { matchesKey, Key } from \"@mariozechner/pi-tui\";\nif (matchesKey(data, Key.enter)) submit();\nif (matchesKey(data, Key.escape)) cancel();\n```\n\n## Implementation Plan\n\n### Phase 1: Basic /synod (Week 1)\n1. Port Oracle extension structure\n2. Add model aliases from orch\n3. Multi-select model picker\n4. Parallel query execution\n5. Basic results display\n6. Add-to-context workflow\n\n### Phase 2: Voting \u0026 Comparison (Week 2)\n1. 
Vote parsing from responses\n2. Consensus gauge visualization\n3. Side-by-side comparison view\n4. Cost preview before query\n\n### Phase 3: Advanced Features (Week 3-4)\n1. Serial strategies (--debate, --refine)\n2. Synthesis mode\n3. Response caching\n4. orch_consensus tool wrapper for agent\n\n## Research References\n\nFull research documents:\n- /tmp/pi-extension-ecosystem-research.md (14KB)\n- /tmp/pi-ui-ecosystem-research.md (22KB)\n- /tmp/multi-model-consensus-analysis.md (22KB)\n\nKey sources:\n- shitty-extensions/oracle.ts - UI patterns, context serialization\n- pi-mono/packages/tui - Component architecture\n- pi-mono/examples/extensions - Official patterns\n- nicobailon/pi-* - Community extensions\n\n## Design Questions Resolved\n\n1. **Single vs Multi model?** Support both via modes\n2. **Auto-add to context?** Always prompt (configurable)\n3. **Expensive models?** Show cost warning, require confirmation\n4. **Caching?** 5min TTL with hash(model+context+prompt)\n5. **Visualization?** Progressive disclosure (gauge → list → diff)","status":"open","priority":2,"issue_type":"feature","owner":"dan@delpad","created_at":"2026-01-22T22:35:32.203497461-08:00","created_by":"dan","updated_at":"2026-01-22T22:35:32.203497461-08:00","labels":["multi-model","pi-extension","synod"]} +{"id":"skills-5ycq","title":"Implement /synod multi-model consensus extension for pi","description":"## Overview\n\nImplement a `/synod` command for pi-coding-agent that provides multi-model consensus with conversation context inheritance and interactive UI.\n\n## Background\n\nResearch conducted 2026-01-22 analyzing:\n- Orch CLI capabilities (423 models, voting, synthesis, serial strategies)\n- Pi Oracle extension from shitty-extensions (conversation context, add-to-context flow)\n- Gap between the two approaches\n\n## Core Concept\n\n`/synod` = Assembly of AI models convened to deliberate. 
One command covering both:\n- **Parallel voting** (conclave-style): Independent opinions, tally votes\n- **Serial discussion** (council-style): Models build on each other's responses\n\n## Usage Design\n\n```bash\n/synod \"Should we use Rust?\" flash gemini claude # Parallel vote (default)\n/synod \"Should we use Rust?\" --debate # Serial discussion\n/synod \"Should we use Rust?\" --brainstorm # Generative mode\n/synod \"Should we use Rust?\" --vote # Explicit parallel vote\n```\n\n## Key Features Required\n\n### Must Have\n- [ ] Multi-select model picker with quick keys\n- [ ] Conversation context inheritance (serialize pi conversation to models)\n- [ ] Parallel query execution with progress indicators\n- [ ] Vote parsing (SUPPORT/OPPOSE/NEUTRAL) from responses\n- [ ] Results display with scrolling\n- [ ] Add-to-context workflow (YES/SUMMARY/NO)\n- [ ] Cost estimation before query\n\n### Nice to Have\n- [ ] Side-by-side comparison view\n- [ ] Diff highlighting for disagreements\n- [ ] Response caching (5min TTL)\n- [ ] Model recommendations based on query type\n- [ ] Synthesis mode (aggregate responses)\n- [ ] Serial strategies (refine, debate)\n\n## Architecture Decision: Hybrid Approach\n\n1. **Keep orch CLI** for advanced features (423 models, synthesis, sessions, serial strategies)\n2. **Add /synod extension** for interactive queries with conversation context\n3. 
**Register orch_consensus tool** for agent programmatic access\n\n### Why Hybrid?\n- Orch: No conversation context sharing, no interactive UI\n- Oracle: Only one model at a time, no voting\n- Synod: Best of both - context inheritance + multi-model + voting + UI\n\n## Technical Implementation\n\n### Conversation Context Serialization\n```typescript\nimport { serializeConversation, convertToLlm } from \"@mariozechner/pi-coding-agent\";\n\nconst history = ctx.sessionManager.getBranch();\nconst serialized = serializeConversation(history);\nconst llmMessages = serialized.map(convertToLlm);\n```\n\n### Model Registry\nStart by querying orch: `orch models` and parse output.\nLater: import orch's config directly.\n\n### Vote Parsing\nPrompt engineering approach:\n```\nRespond with your verdict first: SUPPORT, OPPOSE, or NEUTRAL\nThen explain your reasoning.\n```\nParse with regex, fallback to secondary classification query.\n\n### Add-to-Context Options\n1. YES - Add all model responses verbatim\n2. SUMMARY - Synthesize and add summary only\n3. NO - Don't add to conversation\n\n## UI Patterns (from research)\n\n### Model Picker\n- Multi-select with checkboxes\n- Quick keys 1-9 for fast selection\n- Show cost per model\n- Filter by authenticated models only\n- Exclude current model\n\n### Results Display\n- Progressive disclosure: Gauge → List → Side-by-side\n- Vote counts: SUPPORT: 2, OPPOSE: 1, NEUTRAL: 0\n- Scrollable reasoning for each model\n- Box drawing character borders\n\n### Key Detection\n```typescript\nimport { matchesKey, Key } from \"@mariozechner/pi-tui\";\nif (matchesKey(data, Key.enter)) submit();\nif (matchesKey(data, Key.escape)) cancel();\n```\n\n## Implementation Plan\n\n### Phase 1: Basic /synod (Week 1)\n1. Port Oracle extension structure\n2. Add model aliases from orch\n3. Multi-select model picker\n4. Parallel query execution\n5. Basic results display\n6. Add-to-context workflow\n\n### Phase 2: Voting \u0026 Comparison (Week 2)\n1. 
Vote parsing from responses\n2. Consensus gauge visualization\n3. Side-by-side comparison view\n4. Cost preview before query\n\n### Phase 3: Advanced Features (Week 3-4)\n1. Serial strategies (--debate, --refine)\n2. Synthesis mode\n3. Response caching\n4. orch_consensus tool wrapper for agent\n\n## Research References\n\nFull research documents:\n- /tmp/pi-extension-ecosystem-research.md (14KB)\n- /tmp/pi-ui-ecosystem-research.md (22KB)\n- /tmp/multi-model-consensus-analysis.md (22KB)\n\nKey sources:\n- shitty-extensions/oracle.ts - UI patterns, context serialization\n- pi-mono/packages/tui - Component architecture\n- pi-mono/examples/extensions - Official patterns\n- nicobailon/pi-* - Community extensions\n\n## Design Questions Resolved\n\n1. **Single vs Multi model?** Support both via modes\n2. **Auto-add to context?** Always prompt (configurable)\n3. **Expensive models?** Show cost warning, require confirmation\n4. **Caching?** 5min TTL with hash(model+context+prompt)\n5. **Visualization?** Progressive disclosure (gauge → list → diff)","status":"open","priority":2,"issue_type":"feature","owner":"dan@delpad","created_at":"2026-01-22T22:35:32.203497461-08:00","created_by":"dan","updated_at":"2026-01-22T22:35:32.203497461-08:00","labels":["multi-model","pi-extension","synod"],"comments":[{"id":23,"issue_id":"skills-5ycq","author":"dan","text":"## Research Findings (2026-01-25)\n\n### Key Open Source Tools Discovered\n\n#### 1. **llm-council** (jersobh/consensus)\n- Langchain-compatible framework for deliberative decision-making\n- **Voting modes**: majority, ranked-choice, weighted confidence, veto\n- **Multi-round reasoning** with peer feedback\n- **Parallel execution** support\n- Self-correction mechanisms\n\n#### 2. 
**Oracle Extension** (hjanuschka/shitty-extensions)\n- **Existing pi-agent extension** for second opinions\n- Single model at a time (not multi-model consensus)\n- Key patterns to adopt:\n - `serializeConversation()` + `convertToLlm()` for context inheritance\n - Model picker with quick keys (1-9)\n - Add-to-context workflow (YES/NO)\n - `BorderedLoader` for async operations\n - `ctx.ui.custom()` for full TUI components\n\n#### 3. **Routing/Orchestration Tools**\n| Tool | Type | Key Feature |\n|------|------|-------------|\n| **RouteLLM** | OSS | Cost-based routing, 85% savings |\n| **LiteLLM** | OSS | Unified API, Python SDK |\n| **Portkey** | Commercial | Conditional routing, observability |\n| **NotDiamond** | Commercial | Predictive model selection |\n\n#### 4. **Consensus Patterns**\n- **Voting/Council**: Independent votes, tally results\n- **Debate**: Models critique each other iteratively\n- **Mixture of Agents (MoA)**: Layered proposer→aggregator\n- **Self-Refine**: Single model iterates with self-feedback\n- **LLM-as-Judge**: One model evaluates others\n\n### Design Considerations\n\n#### Already Well-Aligned\n- Voting mechanism (SUPPORT/OPPOSE/NEUTRAL) ✓\n- Parallel query execution ✓\n- Model aliases via orch CLI ✓\n- Cost preview ✓\n\n#### Consider Adding\n1. **Voting modes** beyond majority: ranked-choice, weighted, veto\n2. **Confidence scores** - low confidence triggers more models\n3. **Model recommendations** based on query type\n4. **MoA-style aggregation** - synthesize insights, not just tally\n5. **Disagreement highlighting** - often the most valuable signal\n\n### Open Questions\n1. Debate mode: synchronous (wait all) vs streaming (show as respond)?\n2. How to handle ties in voting?\n3. Should disagreement be surfaced prominently?\n4. Oracle extends single model → synod extends to N models. Reuse oracle UI patterns?\n5. 
Integration: wrap orch CLI vs native pi-ai calls?\n\n### Reference Implementations\n- `hjanuschka/shitty-extensions/oracle.ts` - Context serialization, model picker UI\n- `jersobh/consensus` - Voting strategies, multi-round deliberation\n- `qualisero/awesome-pi-agent` - Extension ecosystem overview\n- pi-mono `subagent/index.ts` - Parallel execution, streaming results\n\n### Next Steps\n- [ ] Spike: Port oracle.ts patterns to multi-model\n- [ ] Evaluate: orch CLI wrapper vs native implementation\n- [ ] Design: Voting mode UX (how to select majority vs ranked-choice)\n- [ ] Prototype: Disagreement visualization","created_at":"2026-01-25T07:27:54Z"},{"id":24,"issue_id":"skills-5ycq","author":"dan","text":"## Ecosystem Context\n\n### awesome-pi-agent Highlights\nKey extensions relevant to synod:\n- **oracle** - Second opinion from alt models (single model, context inheritance)\n- **handoff** - Transfer context to new sessions\n- **memory-mode** - Save instructions to AGENTS.md\n- **subagent** - Delegate to specialized agents (parallel, chain modes)\n\n### pi-mono Patterns to Leverage\nFrom official examples:\n- `questionnaire.ts` - Tab-based multi-select, option navigation\n- `subagent/index.ts` - Parallel execution, progress streaming, usage stats\n- `ctx.ui.custom()` - Full TUI components with keyboard input\n\n### Potential Architecture\n\n```\n/synod \"question\" [models...]\n\n┌─────────────────────────────────────────┐\n│ 🔮 Synod - Multi-Model Consensus │\n├─────────────────────────────────────────┤\n│ Q: Should we use Rust for this service? 
│\n├─────────────────────────────────────────┤\n│ Models: flash, gemini, claude (3) │\n│ Estimated cost: ~$0.02 │\n├─────────────────────────────────────────┤\n│ [Query All] [Edit Models] [Cancel] │\n└─────────────────────────────────────────┘\n\n ↓ parallel queries ↓\n\n┌─────────────────────────────────────────┐\n│ 🗳️ Results (2 SUPPORT, 1 OPPOSE) │\n├─────────────────────────────────────────┤\n│ ✓ flash: SUPPORT │\n│ \"Rust's safety guarantees...\" │\n│ ✓ gemini: SUPPORT │\n│ \"Memory safety without GC...\" │\n│ ✗ claude: OPPOSE │\n│ \"Team expertise in Go...\" ← DISSENT │\n├─────────────────────────────────────────┤\n│ Add to context? [All] [Summary] [None] │\n└─────────────────────────────────────────┘\n```\n\n### Key Differentiator from Oracle\n- Oracle: 1 model, 1 opinion, add to context\n- Synod: N models, vote tally, highlight disagreement, synthesize","created_at":"2026-01-25T07:28:08Z"},{"id":25,"issue_id":"skills-5ycq","author":"dan","text":"## Technical Insights: Confidence \u0026 Disagreement\n\n### Confidence Calibration\nResearch shows LLM confidence often doesn't match actual accuracy. Key approaches:\n- **Multicalibration** - Calibrate across data groupings correlated with correctness\n- **LENS (Learning Ensemble Confidence from Neural States)** - Analyze internal representations\n- **Self-consistency ensembles** - Aggregate confidence across runs\n\n### Disagreement as Signal\nDisagreement between models is valuable information:\n- High disagreement → uncertain/nuanced topic → surface to user\n- Low disagreement → high confidence consensus\n- Systematic bias detection → models from same family may share blind spots\n\n### Implementation Ideas\n1. **Confidence-weighted voting** - Models report confidence, weight votes accordingly\n2. **Disagreement highlighting** - When models disagree, show reasoning side-by-side\n3. **Family diversity** - Recommend models from different providers (OpenAI + Anthropic + Google)\n4. 
**Tie-breaker escalation** - On tie, optionally query additional model\n\n### Cost vs Quality Tradeoffs\n- Start with cheap models (flash, haiku) for initial vote\n- Escalate to expensive models (opus, gpt-4) only on low confidence/disagreement\n- Cache identical queries across sessions (5min TTL)\n\n### Risk: Overconfidence\nLLMs as judges tend toward overconfidence. Mitigations:\n- Always show reasoning, not just vote\n- Highlight when all models agree (groupthink risk)\n- Option to query \"devil's advocate\" model explicitly","created_at":"2026-01-25T07:28:40Z"},{"id":26,"issue_id":"skills-5ycq","author":"dan","text":"## Design Decisions (2026-01-25 discussion)\n\n### 1. Implementation Strategy: Native pi-ai calls\n\n**Decision:** Write native pi extension using pi-ai directly, not wrapping orch/llm CLIs.\n\n**Rationale:**\n- pi already has unified model registry (`ctx.modelRegistry`)\n- Oracle extension shows the pattern: `complete()` from `@mariozechner/pi-ai`\n- Avoids subprocess overhead and parsing CLI output\n- Full access to streaming, usage stats, abort signals\n- Can leverage pi's API key management\n\n**Trade-off:** Lose orch's 424 model aliases, but pi's registry is sufficient for common models. Can add aliases later.\n\n### 2. Results Delivery: All at once (parallel)\n\n**Decision:** Query in parallel, show results when all complete.\n\n**Rationale:**\n- Simpler UX - one moment of decision, not drip-feed\n- Agent (us) can see full picture before commenting\n- Avoids cognitive load of \"wait, there's more coming\"\n- Streaming progress indicators show activity during wait\n\n**Alternative considered:** Show results as they arrive, agent comments incrementally. Rejected because:\n- Creates pressure to react before full context\n- Harder to synthesize/compare\n- More complex state management\n\n**Exception:** Debate mode (--debate) is inherently serial - models respond to each other.\n\n### 3. 
Tie Handling: Explicit acknowledgment\n\n**Decision:** Say \"It's a tie\" and show the split.\n\n**Implementation:**\n```\n🗳️ Results: TIE (1 SUPPORT, 1 OPPOSE, 1 NEUTRAL)\n\nNo clear consensus. The models are split:\n- flash: SUPPORT - \"Performance benefits...\"\n- claude: OPPOSE - \"Complexity cost...\" \n- gemini: NEUTRAL - \"Depends on team...\"\n\nConsider: Query additional model? Reframe question?\n```\n\n**Rationale:**\n- Ties ARE the answer sometimes - the question is genuinely contested\n- Forcing a winner hides valuable signal\n- User/agent can decide next action (add model, rephrase, accept ambiguity)\n\n### 4. Summarization: Agent synthesizes\n\n**Decision:** Agent (the one calling /synod) summarizes results, not the tool.\n\n**Rationale:**\n- Agent has full conversation context\n- Agent can weigh results against prior discussion\n- Tool returns structured data, agent interprets\n- Keeps tool simple and composable\n\n**Add-to-context options:**\n1. **All** - Raw responses verbatim (agent can summarize in next turn)\n2. **Summary** - Tool generates brief summary (backup if agent doesn't want to)\n3. 
**None** - Don't pollute context\n\n**Summary prompt (for option 2):**\n```\nSynthesize these model responses into 2-3 sentences:\n- Note the consensus (if any)\n- Highlight key disagreements\n- Don't pick a winner, present the landscape\n```","created_at":"2026-01-25T07:31:58Z"},{"id":27,"issue_id":"skills-5ycq","author":"dan","text":"### Addendum: Streaming vs Batch UX\n\nWhile we wait for all results before presenting, the **progress UI should stream**:\n\n```\n🔮 Synod - Querying 3 models...\n\n ✓ flash (0.3s) \n ⏳ gemini (1.2s...)\n ⏳ claude (0.8s...)\n```\n\nThis gives:\n- Feedback that something is happening\n- Sense of which models are fast/slow\n- Ability to abort if taking too long\n\nWhen all complete, transition to results view.\n\n### Alternative: \"Reveal as ready\" mode (future)\n\nCould add `--stream` flag for power users who want to see results as they arrive:\n```\n/synod --stream \"question\" flash gemini claude\n```\n\nBut default is batch for cleaner UX.","created_at":"2026-01-25T07:32:06Z"},{"id":28,"issue_id":"skills-5ycq","author":"dan","text":"## llm Plugin Ecosystem Research\n\n### Available Plugins (no consensus/voting built-in)\n\nCurrently installed:\n- `llm-anthropic` - Claude models\n- `llm-openrouter` - OpenRouter gateway (250+ models)\n- `llm-gemini` - Google Gemini\n\nFull directory (50+ plugins): https://llm.datasette.io/en/stable/plugins/directory.html\n\n**Notable: No native ensemble/voting plugin in core llm.**\n\n### Key Discovery: llm-consortium\n\n**GitHub:** irthomasthomas/llm-consortium\n**PyPI:** `llm install llm-consortium`\n\nInspired by Karpathy's insight:\n\u003e \"Your best performance will come from just asking all the models, and then getting them to come to a consensus.\"\n\n**Features:**\n- Multi-model orchestration in parallel\n- Iterative refinement until confidence threshold met\n- Arbiter model synthesizes responses\n- Configurable confidence thresholds (default 0.8)\n- Instance counts per model (`gpt-4o:2` = 
2 instances)\n- Conversation continuation support\n- SQLite logging\n\n**Usage:**\n```bash\nllm consortium \"Your complex query\" \\\n -m o3-mini:1 \\\n -m gpt-4o:2 \\\n -m gemini-2:3 \\\n --arbiter gemini-2 \\\n --confidence-threshold 0.9 \\\n --max-iterations 4\n```\n\n**Programmatic:**\n```python\nfrom llm_consortium import create_consortium\n\norchestrator = create_consortium(\n models=[\"o3-mini:1\", \"gpt-4o:2\"],\n confidence_threshold=0.9,\n arbiter=\"gemini-2\"\n)\nresult = orchestrator.orchestrate(\"Your prompt\")\n```\n\n### Other Relevant PyPI Packages\n\n| Package | Description |\n|---------|-------------|\n| `llm-consensus` | Langchain-compatible, voting strategies |\n| `multi-llm-consensus` | Moderator-based consensus |\n| `llm-multi` (0.1.0) | Basic multi-model prompting |\n| `nons` | Majority voting, ensemble decisions |\n| `agorai` | Social choice theory aggregation |\n\n### Implications for /synod\n\n**Option A: Wrap llm-consortium**\n- Pros: Battle-tested, iterative refinement, arbiter synthesis\n- Cons: Subprocess overhead, no pi context inheritance\n\n**Option B: Port llm-consortium patterns to pi-ai**\n- Pros: Native integration, context inheritance, streaming\n- Cons: More implementation work\n\n**Option C: Hybrid - use llm-consortium for synthesis logic**\n- Query models via pi-ai (context inheritance)\n- Use llm-consortium's arbiter prompt pattern for synthesis\n- Best of both worlds?\n\n### llm-consortium Architecture Worth Adopting\n\n1. **Arbiter model** - Dedicated model to synthesize/evaluate\n2. **Confidence threshold** - Iterate until confident enough\n3. **Instance counts** - Run same model multiple times for diversity\n4. **Iterative refinement** - Not just one-shot consensus","created_at":"2026-01-25T07:34:02Z"}]} {"id":"skills-69sz","title":"Fix P1 security bugs (genOid, HeartbeatThread)","description":"Two critical security/safety issues:\n\n1. 
genOid() - skills-0wk\n - Currently uses rand(25) without randomize()\n - IDs are predictable/deterministic\n - Fix: Use std/sysrand for crypto-safe randomness, or call randomize() at startup\n\n2. HeartbeatThread - skills-bk7x \n - Uses manual alloc0/dealloc\n - Risk of memory leak if startup fails, use-after-free if caller holds reference\n - Fix: Use 'ref HeartbeatThread' with GC management\n\nParent: skills-g2wa","status":"closed","priority":1,"issue_type":"task","created_at":"2026-01-10T20:18:49.759721333-08:00","created_by":"dan","updated_at":"2026-01-10T20:24:36.613555221-08:00","closed_at":"2026-01-10T20:24:36.613555221-08:00","close_reason":"Both P1 security bugs fixed: genOid uses sysrand, HeartbeatThread uses ref type"} {"id":"skills-6ae","title":"Create ui-query skill for AT-SPI integration","description":"Create a skill that provides programmatic UI tree access via AT-SPI.\n\n## Context\nAT-SPI is now enabled in dotfiles (services.gnome.at-spi2-core + QT_LINUX_ACCESSIBILITY_ALWAYS_ON).\nThis complements niri-window-capture (visual) with semantic UI data.\n\n## Capabilities\n- Read text from GTK/Qt widgets directly (no OCR)\n- Find UI elements by role (button, text-field, menu)\n- Query element states (focused, enabled, checked)\n- Get element positions for potential input simulation\n- Navigate parent/child relationships\n\n## Suggested structure\nskills/ui-query/\n├── SKILL.md\n├── scripts/\n│ ├── list-windows.py # Windows with AT-SPI info\n│ ├── get-text.py # Extract text from window/element\n│ ├── find-element.py # Find by role/name\n│ └── query-state.py # Element states\n└── README.md\n\n## Notes\n- Start simple: list windows, get text\n- pyatspi available via python3Packages.pyatspi\n- Use accerciser (now installed) to explore the 
tree","status":"closed","priority":2,"issue_type":"feature","created_at":"2025-12-29T15:37:55.592793763-05:00","created_by":"dan","updated_at":"2026-01-15T14:19:42.092890404-08:00","closed_at":"2026-01-15T14:19:42.092890404-08:00","close_reason":"Complete: list-windows, get-text, find-element, query-state all implemented","comments":[{"id":17,"issue_id":"skills-6ae","author":"dan","text":"Initial implementation: list-windows.py working. Shows apps, windows, geometry, states. Remaining: get-text.py, find-element.py, query-state.py","created_at":"2026-01-15T19:57:15Z"}]} {"id":"skills-6e3","title":"Searchable Claude Code conversation history","description":"## Context\nClaude Code persists full conversations in `~/.claude/projects/\u003cproject\u003e/\u003cuuid\u003e.jsonl`. This is complete but not searchable - can't easily find \"that session where we solved X\".\n\n## Goal\nMake conversation history searchable without requiring manual worklogs.\n\n## Approach\n\n### Index structure\n```\n~/.claude/projects/\u003cproject\u003e/\n \u003cuuid\u003e.jsonl # raw conversation (existing)\n index.jsonl # session metadata + summaries (new)\n```\n\n### Index entry format\n```json\n{\n \"uuid\": \"f9a4c161-...\",\n \"date\": \"2025-12-17\",\n \"project\": \"/home/dan/proj/skills\",\n \"summary\": \"Explored Wayland desktop automation, AT-SPI investigation, vision model benchmark\",\n \"keywords\": [\"wayland\", \"niri\", \"at-spi\", \"automation\", \"seeing-problem\"],\n \"commits\": [\"906f2bc\", \"0b97155\"],\n \"duration_minutes\": 90,\n \"message_count\": 409\n}\n```\n\n### Features needed\n1. **Index builder** - Parse JSONL, extract/generate summary + keywords\n2. **Search CLI** - `claude-search \"AT-SPI wayland\"` → matching sessions\n3. 
**Auto-index hook** - Update index on session end or compaction\n\n## Questions\n- Generate summaries via AI or extract heuristically?\n- Index per-project or global?\n- How to handle very long sessions (multiple topics)?\n\n## Value\n- Find past solutions without remembering dates\n- Model reflection: include relevant past sessions in context\n- Replace manual worklogs with auto-generated metadata","status":"closed","priority":2,"issue_type":"feature","created_at":"2025-12-17T15:56:50.913766392-08:00","updated_at":"2025-12-29T18:35:56.530154004-05:00","closed_at":"2025-12-29T18:35:56.530154004-05:00","close_reason":"Prototype complete: bin/claude-search indexes 122 sessions, searches by keyword. Future: auto-index hook, full-text search, keyword extraction."} diff --git a/docs/approach/2026-01-24-session-hygiene.md b/docs/approach/2026-01-24-session-hygiene.md new file mode 100644 index 0000000..67b9c05 --- /dev/null +++ b/docs/approach/2026-01-24-session-hygiene.md @@ -0,0 +1,158 @@ +# Approach: Session Hygiene Extension + +## Strategy + +**Core philosophy**: Ambient awareness, not active management. + +The extension provides a persistent footer widget showing git state. The user glances at it when they want to. A `/commit` command offers a guided flow with auto-drafted messages when they're ready to commit. No nudges, no prompts, no interruptions. + +**Key Decisions**: + +1. **Widget vs Status**: Widget (multi-character, always visible) vs setStatus (footer slot, subtle) + → **Widget** — needs to be glanceable without hunting for it + +2. **Polling vs Events**: Poll git status periodically vs hook into tool_result events + → **Hook tool_result** — only re-check after bash/write/edit tools that might change files. Avoids polling overhead. + +3. **Grouping strategy**: No grouping vs LLM-driven grouping + → **LLM-driven grouping** — LLM sees changed files + session context, proposes logical groups with conventional commit messages. Always runs, even for 1-3 files. 
+ +4. **Confirmation flow**: Always confirm vs LLM discretion + → **LLM discretion** — LLM decides when to ask questions (ambiguous grouping, orphan files) vs proceed. User already invoked `/commit`, so trust the intent. + +5. **Orphan files**: Auto-bucket into "misc" vs ask + → **Ask** — if a file doesn't fit any logical group, LLM should ask user where it belongs. + +6. **Staging**: Auto-stage all vs let user stage manually + → **Auto-stage all (`git add -A`)** — matches "just commit everything" simplicity. User can unstage manually before `/commit` if needed. + +## Architecture + +### New Components + +``` +~/.pi/agent/extensions/session-hygiene/ +├── index.ts # Extension entry point +└── git.ts # Git helpers (status, commit, etc.) +``` + +### Extension Structure + +```typescript +// index.ts +export default function(pi: ExtensionAPI) { + // State + let dirtyCount = 0; + + // 1. Widget: show dirty count in footer + pi.on("session_start", updateWidget); + pi.on("tool_result", maybeUpdateWidget); // Only after bash/write/edit + + // 2. Command: /commit + pi.registerCommand("commit", { handler: commitFlow }); +} +``` + +### Data Flow + +``` +[tool_result event] + │ + ▼ + is bash/write/edit? + │ yes + ▼ + git status --porcelain + │ + ▼ + count changed files + │ + ▼ + ctx.ui.setWidget("hygiene", ["● 14 files"]) +``` + +``` +[/commit command] + │ + ▼ + git status --porcelain → list of changed files + │ + ▼ + extract session context: + - recent messages (user prompts, assistant responses) + - file touchpoints (which files were read/written/edited when) + │ + ▼ + LLM prompt: + "Here are the changed files and session context. + Group into logical commits. For each group: + - list files + - conventional commit message + If a file doesn't fit, ask the user. + If grouping is ambiguous, ask. + Otherwise, proceed and execute commits." 
+ │ + ▼ + LLM executes commits via tool calls (git add <files>, git commit -m "...") + │ + ▼ + update widget (now shows 0 or remaining) +``` + +### Commit Tool + +The `/commit` command injects context and lets the LLM drive. It needs a `git_commit` tool: + +```typescript +pi.registerTool({ + name: "git_commit", + description: "Stage specific files and commit with a message", + parameters: Type.Object({ + files: Type.Array(Type.String(), { description: "Files to stage (relative paths)" }), + message: Type.String({ description: "Commit message (conventional format)" }), + }), + async execute(toolCallId, params, onUpdate, ctx, signal) { + // git add + // git commit -m + // return success/failure + }, +}); +``` + +This lets the LLM make multiple commits in sequence, asking questions in between if needed. + +## Risks + +### Known Unknowns + +- **Widget placement**: `setWidget` defaults to above editor. Need to verify `belowEditor` placement looks right for a small status indicator. +- **LLM latency**: Drafting commit message adds a few seconds. Acceptable? Could show "Drafting..." in UI. +- **Model availability**: Need a model for commit message drafting. What if user doesn't have API key for it? + +### Failure Modes + +- **Not a git repo**: `git status` returns non-zero. Extension silently does nothing (no widget, `/commit` shows error). +- **Detached HEAD / merge conflict**: Unusual git states. `/commit` should detect and warn rather than corrupt state. +- **Empty commit**: All changes already staged and committed. `/commit` should detect "nothing to commit" and notify. + +### Blast Radius + +- **Minimal**: Extension only reads git state and runs `git add -A` + `git commit`. No force pushes, no rebase, no destructive operations. +- **Worst case**: User commits something they didn't want to. Recoverable via `git reset HEAD~1`.
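The status-counting step in the data flow above could be sketched as a pure helper plus a thin exec wrapper. This is a sketch under assumptions, not the final `git.ts`: the `node:child_process` usage and the `countDirty`/`dirtyFileCount` names are illustrative; only `git status --porcelain` itself comes from the flow above.

```typescript
// git.ts sketch — assumes a Node runtime; helper names are illustrative.
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const run = promisify(execFile);

// Pure: count non-empty lines of `git status --porcelain` output.
export function countDirty(porcelain: string): number {
  return porcelain.split("\n").filter((line) => line.trim() !== "").length;
}

// Effectful: -1 signals "not a repo / git failed", so the widget can hide
// (covers the "Not a git repo" failure mode above).
export async function dirtyFileCount(cwd: string): Promise<number> {
  try {
    const { stdout } = await run("git", ["status", "--porcelain"], { cwd });
    return countDirty(stdout);
  } catch {
    return -1;
  }
}
```

Keeping the line counting pure makes the widget logic testable without a repo fixture.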
+ +## Phases + +### Phase 1: Widget + git_commit Tool +- Footer widget showing dirty file count +- Updates after file-mutating tools +- `git_commit` tool registered (LLM can use it anytime) + +### Phase 2: /commit Command +- `/commit` command injects context and triggers LLM-driven grouping +- LLM proposes groups, asks questions if uncertain, executes commits +- Widget updates as commits land + +### Phase 3: Polish (Future) +- Stash support (`/stash`) +- Undo last commit (`/uncommit`) +- Integration with worklog skill (prompt to commit after worklog) diff --git a/docs/intent/2026-01-24-session-hygiene.md b/docs/intent/2026-01-24-session-hygiene.md new file mode 100644 index 0000000..4a221b8 --- /dev/null +++ b/docs/intent/2026-01-24-session-hygiene.md @@ -0,0 +1,38 @@ +# Intent: Session Hygiene Extension + +## Motivation + +Working in pi across long sessions, it's easy to lose track of what's changed. You finish a session with dozens of uncommitted files, unclear what goes with what, and the commit history becomes a mess of grab-bag commits. The problem isn't catastrophic — nothing is lost — but it erodes organization over time. + +## Need + +Ambient awareness of git state while working, so commits happen naturally at good moments rather than as panicked cleanup at session end. + +## Use-Cases + +- **Mid-session glance**: You're deep in a refactor, glance at the footer, see "14 files" — mental note that there's a chunk of work building up. You might commit now, or keep going. Either way, you're aware. + +- **Natural stopping point**: You finish a logical unit of work. The footer reminds you there's uncommitted work. You run `/commit`, get a suggested message based on what we discussed, and commit cleanly. + +- **Session end**: You're about to close pi. Footer shows dirty state. You either commit, stash, or consciously leave it — but you're not surprised by 48 files later. 
+ +## Success Criteria + +- Footer widget shows uncommitted file count for current repo at all times +- `/commit` command triggers guided flow with auto-drafted commit message from conversation context +- User never feels nagged, blocked, or guilty — just informed +- Commits end up logical and well-messaged because awareness came early + +## Constraints + +- Single repo only (the one we're in) +- Must work as a pi extension (TypeScript, pi extension API) +- No external dependencies beyond git + +## Anti-Goals + +- **No auto-commit**: Never commit without explicit user action +- **No blocking prompts**: Never interrupt flow with modal dialogs +- **No guilt mechanics**: No "you should commit" nudges, red warnings, or escalating alerts +- **No multi-repo tracking**: Don't watch repos outside current working directory +- **No push**: This is about local commits only diff --git a/docs/work/2026-01-24-session-hygiene.md b/docs/work/2026-01-24-session-hygiene.md new file mode 100644 index 0000000..d902de5 --- /dev/null +++ b/docs/work/2026-01-24-session-hygiene.md @@ -0,0 +1,49 @@ +# Work: Session Hygiene Extension + +## Intent +Link to: [docs/intent/2026-01-24-session-hygiene.md](../intent/2026-01-24-session-hygiene.md) + +## Approach +Link to: [docs/approach/2026-01-24-session-hygiene.md](../approach/2026-01-24-session-hygiene.md) + +## Checklist + +### Phase 1: Widget + git_commit Tool + +- [x] **W001**: Create extension directory structure + - Verification: `ls ~/.pi/agent/extensions/session-hygiene/index.ts` + +- [x] **W002**: Implement git status helper + - Verification: `pi -e ~/.pi/agent/extensions/session-hygiene -p "test" 2>&1 | head -5` (no syntax errors) + +- [ ] **W003**: Implement footer widget showing dirty file count + - Verification: Start pi in a dirty repo, observe widget shows file count + +- [ ] **W004**: Hook tool_result to update widget after bash/write/edit + - Verification: In pi, write a file, observe widget count increases + +- [ ] **W005**: 
Implement git_commit tool (stage files + commit) + - Verification: `pi -p "Use git_commit to commit README.md with message 'test: verify tool'"` in test repo + +### Phase 2: /commit Command + +- [ ] **W006**: Implement session context extraction (recent messages, file touchpoints) + - Verification: `/commit` in pi shows context being gathered (log or notify) + +- [ ] **W007**: Implement /commit command that injects context and triggers LLM + - Verification: `/commit` in dirty repo triggers LLM response with grouping proposal + +- [ ] **W008**: Verify full flow: /commit → LLM groups → git_commit calls → widget updates + - Verification: End-to-end test in a repo with 5+ changed files across different paths + +## Verification Evidence + +- (2026-01-24 23:xx) W001: `ls ~/.pi/agent/extensions/session-hygiene/index.ts` → exists +- (2026-01-24 23:xx) W002: jiti load fails on missing module (expected) — syntax valid + +## Notes + +- Extension location: `~/.pi/agent/extensions/session-hygiene/` +- Will use `belowEditor` widget placement — need to verify it looks right +- For /commit context injection, use `pi.sendUserMessage()` or `before_agent_start` message injection +- Model for grouping: use whatever model is currently active (no separate API key needed)
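The context-injection note above could be sketched as a pure prompt builder, testable without the pi runtime. The function name and the porcelain parsing are assumptions; the prompt wording mirrors the LLM prompt in the approach doc.

```typescript
// Sketch: build the /commit injection message from `git status --porcelain`
// output. Name and parsing are illustrative, not the final extension API.
export function buildCommitPrompt(porcelain: string): string {
  const files = porcelain
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => line.slice(3)); // drop the two status columns + space
  return [
    "Here are the changed files and session context.",
    ...files.map((f) => `- ${f}`),
    "",
    "Group into logical commits; call git_commit once per group with a",
    "conventional commit message. If a file fits no group, or grouping is",
    "ambiguous, ask the user. Otherwise, proceed and execute the commits.",
  ].join("\n");
}
```

The result would then be handed to `pi.sendUserMessage()` (or `before_agent_start` injection) per the note above.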