From f24b2bb51820af0776792367781a84233194db6a Mon Sep 17 00:00:00 2001 From: dan Date: Fri, 9 Jan 2026 11:03:37 -0800 Subject: [PATCH] feat: add emes plugin structure to orch skill - Add .claude-plugin/plugin.json with metadata - Copy SKILL.md to skills/orch.md for auto-discovery - Keep original SKILL.md for Nix backward compat - Add emes-conversion-guide.md documenting the pattern Part of skills-6x1 (emes plugin architecture epic) Co-Authored-By: Claude Opus 4.5 --- .beads/issues.jsonl | 5 + docs/emes-conversion-guide.md | 124 ++++++++++ skills/orch/.claude-plugin/plugin.json | 16 ++ skills/orch/skills/orch.md | 314 +++++++++++++++++++++++++ 4 files changed, 459 insertions(+) create mode 100644 docs/emes-conversion-guide.md create mode 100644 skills/orch/.claude-plugin/plugin.json create mode 100644 skills/orch/skills/orch.md diff --git a/.beads/issues.jsonl b/.beads/issues.jsonl index 899ac09..6b99f94 100644 --- a/.beads/issues.jsonl +++ b/.beads/issues.jsonl @@ -23,6 +23,7 @@ {"id":"skills-6e3","title":"Searchable Claude Code conversation history","description":"## Context\nClaude Code persists full conversations in `~/.claude/projects/\u003cproject\u003e/\u003cuuid\u003e.jsonl`. 
This is complete but not searchable - can't easily find \"that session where we solved X\".\n\n## Goal\nMake conversation history searchable without requiring manual worklogs.\n\n## Approach\n\n### Index structure\n```\n~/.claude/projects/\u003cproject\u003e/\n \u003cuuid\u003e.jsonl # raw conversation (existing)\n index.jsonl # session metadata + summaries (new)\n```\n\n### Index entry format\n```json\n{\n \"uuid\": \"f9a4c161-...\",\n \"date\": \"2025-12-17\",\n \"project\": \"/home/dan/proj/skills\",\n \"summary\": \"Explored Wayland desktop automation, AT-SPI investigation, vision model benchmark\",\n \"keywords\": [\"wayland\", \"niri\", \"at-spi\", \"automation\", \"seeing-problem\"],\n \"commits\": [\"906f2bc\", \"0b97155\"],\n \"duration_minutes\": 90,\n \"message_count\": 409\n}\n```\n\n### Features needed\n1. **Index builder** - Parse JSONL, extract/generate summary + keywords\n2. **Search CLI** - `claude-search \"AT-SPI wayland\"` → matching sessions\n3. **Auto-index hook** - Update index on session end or compaction\n\n## Questions\n- Generate summaries via AI or extract heuristically?\n- Index per-project or global?\n- How to handle very long sessions (multiple topics)?\n\n## Value\n- Find past solutions without remembering dates\n- Model reflection: include relevant past sessions in context\n- Replace manual worklogs with auto-generated metadata","status":"closed","priority":2,"issue_type":"feature","created_at":"2025-12-17T15:56:50.913766392-08:00","updated_at":"2025-12-29T18:35:56.530154004-05:00","closed_at":"2025-12-29T18:35:56.530154004-05:00","close_reason":"Prototype complete: bin/claude-search indexes 122 sessions, searches by keyword. 
Future: auto-index hook, full-text search, keyword extraction."} {"id":"skills-6gw","title":"Add artifact provenance to traces","description":"Current: files_created lists paths only.\nProblem: Can't detect regressions or validate outputs.\n\nAdd:\n- Content hash (sha256)\n- File size\n- For modifications: git_diff_summary (files changed, line counts)\n\nExample:\n outputs:\n artifacts:\n - path: docs/worklogs/...\n sha256: abc123...\n size: 1234\n action: created|modified\n\nEnables: diff traces, regression testing, validation.","status":"closed","priority":3,"issue_type":"task","created_at":"2025-12-23T19:49:48.654952533-05:00","updated_at":"2025-12-29T13:55:35.827778174-05:00","closed_at":"2025-12-29T13:55:35.827778174-05:00","close_reason":"Parked with ADR-001: skills-molecules integration deferred. Current simpler approach (skills as standalone) works well. Revisit when complex orchestration needed."} {"id":"skills-6jw","title":"spec-review: Add severity labeling to prompts and reviews","description":"Reviews produce flat lists mixing blockers with minor nits. Hard to make decisions.\n\nAdd to prompts:\n- Require severity labels: Blocker / High / Medium / Low\n- Sort output by severity\n- Include impact and likelihood for each issue","status":"closed","priority":2,"issue_type":"task","created_at":"2025-12-15T00:23:23.334156366-08:00","updated_at":"2025-12-15T13:00:32.678573181-08:00","closed_at":"2025-12-15T13:00:32.678573181-08:00"} +{"id":"skills-6x1","title":"Epic: emes plugin architecture alignment","description":"Convert skills to emes-style plugin architecture for portability across Claude Code, Gemini, and VPS deployment (ops-jrz1).\n\n**emes tools (evil-mind-evil-sword org):**\n- tissue: Git-native issue tracking (machine-first)\n- idle: Quality gate (blocks exit until reviewer approves)\n- jwz: Async messaging with identity/git context\n- marketplace: Plugin distribution registry\n\n**Conversion work:**\n1. 
Add .claude-plugin/plugin.json to each skill\n2. Restructure: SKILL.md → skills/\u003cname\u003e.md (auto-discovery)\n3. Add hooks/ where applicable (quality gates)\n4. Create marketplace.json registry\n5. Test with ops-jrz1 deployment\n\n**Key principles from emes:**\n- Pull context on-demand (not big upfront injections)\n- Mechanical enforcement via hooks (not prompts)\n- References over inline content\n- Machine-first interfaces (JSON output)\n\n**Candidates for conversion:**\n- orch (simple CLI wrapper)\n- worklog (scripts + templates)\n- code-review (has lenses, might want hooks)\n- ops-review (same pattern)","status":"open","priority":2,"issue_type":"epic","created_at":"2026-01-09T10:59:12.291560832-08:00","created_by":"dan","updated_at":"2026-01-09T10:59:12.291560832-08:00"} {"id":"skills-7bu","title":"Add atomic file operations to update scripts","description":"Files affected:\n- skills/update-opencode/scripts/update-nix-file.sh\n- .specify/scripts/bash/update-agent-context.sh\n\nIssues:\n- Uses sed -i which can corrupt on error\n- No rollback mechanism despite creating backups\n- Unsafe regex patterns with complex escaping\n\nFix:\n- Write to temp file, then atomic mv\n- Validate output before replacing original\n- Add rollback on failure\n\nSeverity: MEDIUM","status":"closed","priority":3,"issue_type":"task","created_at":"2025-12-24T02:51:02.334416215-05:00","updated_at":"2026-01-03T12:08:56.822659199-08:00","closed_at":"2026-01-03T12:08:56.822659199-08:00","close_reason":"Implemented atomic updates using temp files and traps in update-nix-file.sh, update-agent-context.sh, and deploy-skill.sh. 
Added validation before replacing original files."} {"id":"skills-7s0","title":"Compare STATIC_DATA.md with upstream","status":"closed","priority":2,"issue_type":"task","created_at":"2025-12-03T20:15:55.193704589-08:00","updated_at":"2025-12-03T20:19:29.659256809-08:00","closed_at":"2025-12-03T20:19:29.659256809-08:00","dependencies":[{"issue_id":"skills-7s0","depends_on_id":"skills-ebh","type":"discovered-from","created_at":"2025-12-03T20:15:55.195160705-08:00","created_by":"daemon","metadata":"{}"}]} {"id":"skills-7sh","title":"Set up bd-issue-tracking Claude Code skill from beads repo","description":"Install the beads Claude Code skill from https://github.com/steveyegge/beads/tree/main/examples/claude-code-skill\n\nThis skill teaches Claude how to effectively use beads for issue tracking across multi-session coding workflows. It provides strategic guidance on when/how to use beads, not just command syntax.\n\nFiles to install to ~/.claude/skills/bd-issue-tracking/:\n- SKILL.md - Core workflow patterns and decision criteria\n- BOUNDARIES.md - When to use beads vs markdown alternatives\n- CLI_REFERENCE.md - Complete command documentation\n- DEPENDENCIES.md - Relationship types and patterns\n- WORKFLOWS.md - Step-by-step procedures\n- ISSUE_CREATION.md - Quality guidelines\n- RESUMABILITY.md - Making work resumable across sessions\n- STATIC_DATA.md - Using beads as reference databases\n\nCan symlink or copy the files. 
Restart Claude Code after install.","status":"closed","priority":2,"issue_type":"task","created_at":"2025-12-03T17:53:43.254007992-08:00","updated_at":"2025-12-03T20:04:53.416579381-08:00","closed_at":"2025-12-03T20:04:53.416579381-08:00"} @@ -49,11 +50,14 @@ {"id":"skills-9cu.7","title":"Lens: supply-chain","description":"Create supply-chain.md lens for provenance:\n- Unpinned versions (latest tags)\n- Actions not pinned to SHA\n- Missing flake.lock/SRI hashes\n- Unsigned artifacts\n- Untrusted registries","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-01T16:55:49.317966318-05:00","created_by":"dan","updated_at":"2026-01-01T22:03:26.655269107-05:00","closed_at":"2026-01-01T22:03:26.655269107-05:00","close_reason":"Lens created with orch consensus: added Terraform/Tofu, build-time network access, GH Actions permissions, builtins.fetchTarball","dependencies":[{"issue_id":"skills-9cu.7","depends_on_id":"skills-9cu","type":"parent-child","created_at":"2026-01-01T16:55:49.319754113-05:00","created_by":"dan"},{"issue_id":"skills-9cu.7","depends_on_id":"skills-9cu.1","type":"blocks","created_at":"2026-01-01T16:55:49.322943568-05:00","created_by":"dan"}]} {"id":"skills-9cu.8","title":"Lens: observability","description":"Create observability.md lens for visibility:\n- Silent failures\n- Missing health checks\n- Incomplete metrics\n- Missing structured logging\n- No correlation IDs","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-01T16:55:49.562009474-05:00","created_by":"dan","updated_at":"2026-01-01T22:05:03.351508622-05:00","closed_at":"2026-01-01T22:05:03.351508622-05:00","close_reason":"Lens created with orch consensus: added resource visibility, heartbeats, version/build metadata, log 
rotation","dependencies":[{"issue_id":"skills-9cu.8","depends_on_id":"skills-9cu","type":"parent-child","created_at":"2026-01-01T16:55:49.564394694-05:00","created_by":"dan"},{"issue_id":"skills-9cu.8","depends_on_id":"skills-9cu.1","type":"blocks","created_at":"2026-01-01T16:55:49.571005731-05:00","created_by":"dan"}]} {"id":"skills-9cu.9","title":"Lens: nix-hygiene","description":"Create nix-hygiene.md lens (statix/deadnix-backed):\n- Dead code (unused bindings)\n- Anti-patterns (with lib abuse, IFD)\n- Module boundary violations\n- Overlay issues\n- Missing option types\n\nLinter integration: statix + deadnix JSON","status":"closed","priority":3,"issue_type":"task","created_at":"2026-01-01T16:56:00.623672452-05:00","created_by":"dan","updated_at":"2026-01-01T23:58:43.868830539-05:00","closed_at":"2026-01-01T23:58:43.868830539-05:00","close_reason":"Lens created with orch consensus: added lib.mkIf guards, mkDefault/mkForce, reproducibility/purity, build efficiency, expanded false positives","dependencies":[{"issue_id":"skills-9cu.9","depends_on_id":"skills-9cu","type":"parent-child","created_at":"2026-01-01T16:56:00.638729349-05:00","created_by":"dan"},{"issue_id":"skills-9cu.9","depends_on_id":"skills-9cu.1","type":"blocks","created_at":"2026-01-01T16:56:00.643063075-05:00","created_by":"dan"}]} +{"id":"skills-9jk","title":"Research: emes idle quality gate for code-review","description":"Evaluate whether code-review skill should use idle-style quality gate (block exit until review approved). 
Would enforce review completion mechanically.","status":"open","priority":3,"issue_type":"task","created_at":"2026-01-09T10:59:25.094378206-08:00","created_by":"dan","updated_at":"2026-01-09T10:59:25.094378206-08:00","dependencies":[{"issue_id":"skills-9jk","depends_on_id":"skills-6x1","type":"blocks","created_at":"2026-01-09T10:59:33.267948785-08:00","created_by":"dan"}]} {"id":"skills-a0x","title":"spec-review: Add traceability requirements across artifacts","description":"Prompts don't enforce spec → plan → tasks linkage. Drift can occur without detection.\n\nAdd:\n- Require trace matrix or linkage in reviews\n- Each plan item should reference spec requirement\n- Each task should reference plan item\n- Flag unmapped items and extra scope","status":"closed","priority":3,"issue_type":"task","created_at":"2025-12-15T00:23:25.270581198-08:00","updated_at":"2025-12-15T14:05:48.196356786-08:00","closed_at":"2025-12-15T14:05:48.196356786-08:00"} {"id":"skills-a23","title":"Update main README to list all 9 skills","description":"Main README.md 'Skills Included' section only lists worklog and update-spec-kit. 
Repo actually has 9 skills: template, worklog, update-spec-kit, screenshot-latest, niri-window-capture, tufte-press, update-opencode, web-research, web-search.","status":"closed","priority":2,"issue_type":"task","created_at":"2025-11-30T11:58:14.042397754-08:00","updated_at":"2025-12-28T22:08:02.074758486-05:00","closed_at":"2025-12-28T22:08:02.074758486-05:00","close_reason":"Updated README with table listing all 14 skills (5 deployed, 8 available, 1 development template)","dependencies":[{"issue_id":"skills-a23","depends_on_id":"skills-4yn","type":"blocks","created_at":"2025-11-30T12:01:30.306742184-08:00","created_by":"daemon","metadata":"{}"}]} {"id":"skills-al5","title":"Consider repo-setup-verification skill","description":"The dotfiles repo has a repo-setup-prompt.md verification checklist that could become a skill.\n\n**Source**: ~/proj/dotfiles/docs/repo-setup-prompt.md\n\n**What it does**:\n- Verifies .envrc has use_api_keys and skills loading\n- Checks .skills manifest exists with appropriate skills\n- Optionally checks beads setup\n- Verifies API keys are loaded\n\n**As a skill it could**:\n- Be invoked to audit any repo's agent setup\n- Offer to fix missing pieces\n- Provide consistent onboarding for new repos\n\n**Questions**:\n- Is this better as a skill vs a slash command?\n- Should it auto-fix or just report?\n- Does it belong in skills repo or dotfiles?","status":"closed","priority":2,"issue_type":"task","created_at":"2025-12-06T12:38:32.561337354-08:00","updated_at":"2025-12-28T22:22:57.639520516-05:00","closed_at":"2025-12-28T22:22:57.639520516-05:00","close_reason":"Decided: keep as prompt doc in dotfiles, not a skill. Claude can read it when asked. No wrapper benefit, and it's dotfiles-specific setup (not general skill). 
ai-tools-doctor handles version checking separately."} {"id":"skills-bcu","title":"Design doc-review skill","description":"# doc-review skill\n\nFight documentation drift with a non-interactive review process that generates patchfiles for human review.\n\n## Problem\n- No consistent documentation system across repos\n- Stale content accumulates\n- Structural inconsistencies (docs not optimized for agents)\n\n## Envisioned Workflow\n\n```bash\n# Phase 1: Generate patches (non-interactive, use spare credits, test models)\ndoc-review scan ~/proj/foo --model claude-sonnet --output /tmp/foo-patches/\n\n# Phase 2: Review patches (interactive session)\ncd ~/proj/foo\nclaude # human reviews patches, applies selectively\n```\n\n## Design Decisions Made\n\n- **Trigger**: Manual invocation (not CI). Use case includes burning extra LLM credits, testing models repeatably.\n- **Source of truth**: Style guide embedded in prompt template. Blessed defaults, overridable per-repo.\n- **Output**: Patchfiles for human review in interactive Claude session.\n- **Chunking**: Based on absolute size, not file count. 
Logical chunks easy for Claude to review.\n- **Scope detection**: Graph-based discovery starting from README.md or AGENTS.md, not glob-all-markdown.\n\n## Open Design Work\n\n### Agent-Friendly Doc Conventions (needs brainstorming)\nWhat makes docs agent-readable?\n- Explicit context (no \"as mentioned above\")\n- Clear section headers for navigation\n- Self-contained sections\n- Consistent terminology\n- Front-loaded summaries\n- ???\n\n### Prompt Content\nFull design round needed on:\n- What conventions to enforce\n- How to express them in prompt\n- Examples of \"good\" vs \"bad\"\n\n### Graph-Based Discovery\nHow does traversal work?\n- Parse links from README/AGENTS.md?\n- Follow relative markdown links?\n- Depth limit?\n\n## Skill Structure (tentative)\n```\nskills/doc-review/\n├── prompt.md # Core review instructions + style guide\n├── scan.sh # Orchestrates: find docs → invoke claude → emit patches\n└── README.md\n```\n\n## Out of Scope (for now)\n- Cross-repo standardization (broader than skills repo)\n- CI integration\n- Auto-apply without human review","status":"closed","priority":2,"issue_type":"feature","created_at":"2025-12-04T14:01:43.305653729-08:00","updated_at":"2025-12-04T16:44:03.468118288-08:00","closed_at":"2025-12-04T16:44:03.468118288-08:00","dependencies":[{"issue_id":"skills-bcu","depends_on_id":"skills-1ig","type":"blocks","created_at":"2025-12-04T14:02:17.144414636-08:00","created_by":"daemon","metadata":"{}"},{"issue_id":"skills-bcu","depends_on_id":"skills-53k","type":"blocks","created_at":"2025-12-04T14:02:17.164968463-08:00","created_by":"daemon","metadata":"{}"}]} {"id":"skills-be3","title":"Define trace security and redaction policy","description":"Wisps will leak secrets without explicit policy.\n\nRequired:\n- Default-deny for env vars (allowlist: PROJECT, USER, etc.)\n- Redaction rules for sensitive fields\n- No file contents by default\n- Classification field: internal|secret|public\n\nImplementation:\n- redact: 
[\"env.AWS_SECRET_ACCESS_KEY\", \"inputs.token\"]\n- Sanitization before writing to disk\n- Block elevation if classification=secret\n\nFrom consensus: both models flagged as medium-high severity.","status":"closed","priority":2,"issue_type":"task","created_at":"2025-12-23T19:49:31.041661947-05:00","updated_at":"2025-12-23T20:55:04.446363188-05:00","closed_at":"2025-12-23T20:55:04.446363188-05:00","close_reason":"ADRs revised with orch consensus feedback"} +{"id":"skills-bko","title":"Prototype: Convert orch skill to emes-style","description":"First conversion to validate the pattern. Add .claude-plugin/plugin.json, restructure to skills/ directory, test discovery.","status":"in_progress","priority":2,"issue_type":"task","created_at":"2026-01-09T10:59:24.812152648-08:00","created_by":"dan","updated_at":"2026-01-09T10:59:33.442271984-08:00","dependencies":[{"issue_id":"skills-bko","depends_on_id":"skills-6x1","type":"blocks","created_at":"2026-01-09T10:59:33.182232479-08:00","created_by":"dan"}]} +{"id":"skills-bo8","title":"Gemini skills access: ReadFile path restrictions block .claude/skills/","description":"Gemini agent couldn't read skill files from .claude/skills/orch/SKILL.md due to path restrictions. ReadFile tool restricts paths to workspace directories, so .claude/skills/ (symlinked from home-manager) is blocked. Agent had to fall back to shell cat command. Breaks skills portability across agents. 
Potential fixes: copy skills into repo, configure allowed paths, use MCP, or document workaround.","status":"open","priority":3,"issue_type":"bug","created_at":"2026-01-09T10:58:04.037329419-08:00","created_by":"dan","updated_at":"2026-01-09T10:58:17.108865761-08:00"} {"id":"skills-bvz","title":"spec-review: Add Definition of Ready checklists for each phase","description":"'Ready for /speckit.plan' and similar are underspecified.\n\nAdd concrete checklists:\n- Spec ready for planning: problem statement, goals, constraints, acceptance criteria, etc.\n- Plan ready for tasks: milestones, risks, dependencies, test strategy, etc.\n- Tasks ready for bd: each task has acceptance criteria, dependencies explicit, etc.","status":"closed","priority":2,"issue_type":"task","created_at":"2025-12-15T00:23:24.877531852-08:00","updated_at":"2025-12-15T14:05:26.880419097-08:00","closed_at":"2025-12-15T14:05:26.880419097-08:00"} {"id":"skills-bww","title":"Benchmark AT-SPI overhead and coverage","description":"## Goal\nMeasure AT-SPI's runtime overhead and coverage across apps.\n\n## Prerequisites\n- Enable `services.gnome.at-spi2-core.enable = true` in NixOS\n- Set `QT_LINUX_ACCESSIBILITY_ALWAYS_ON=1` for Qt apps\n- Rebuild and re-login\n\n## Overhead benchmarks\n1. **Startup time**: App launch with/without AT-SPI\n2. **Memory**: RSS delta with AT-SPI enabled\n3. **CPU**: Idle CPU with AT-SPI bus running\n4. **UI latency**: Input-to-paint latency (if measurable)\n\n## Coverage audit\nFor each app, document:\n- Does it expose accessibility tree?\n- How complete is the tree? 
(all elements vs partial)\n- Are coordinates accurate?\n- Are element types/roles correct?\n\n### Apps to test\n- [ ] Firefox\n- [ ] Ghostty terminal\n- [ ] Nautilus/file manager\n- [ ] VS Code / Electron app\n- [ ] A Qt app (if any installed)\n\n## Query benchmarks\n- Time to enumerate all elements in a window\n- Time to find element by role/name\n- Memory overhead of pyatspi queries\n\n## Depends on\n- skills-pdg (Enable AT-SPI for UI tree access)","status":"open","priority":2,"issue_type":"task","created_at":"2025-12-17T14:13:21.599259773-08:00","updated_at":"2025-12-17T14:13:21.599259773-08:00","dependencies":[{"issue_id":"skills-bww","depends_on_id":"skills-pdg","type":"blocks","created_at":"2025-12-17T14:13:41.633210539-08:00","created_by":"daemon","metadata":"{}"}]} {"id":"skills-cc0","title":"spec-review: Add anti-hallucination constraints to prompts","description":"Models may paraphrase and present as quotes, or invent requirements/risks not in the doc.\n\nAdd:\n- 'Quotes must be verbatim'\n- 'Do not assume technologies/constraints not stated'\n- 'If missing info, list as open questions rather than speculating'","status":"closed","priority":3,"issue_type":"task","created_at":"2025-12-15T00:23:26.045478292-08:00","updated_at":"2025-12-15T14:07:19.556888057-08:00","closed_at":"2025-12-15T14:07:19.556888057-08:00"} @@ -124,3 +128,4 @@ {"id":"skills-x33","title":"Add tests for branch name generation","description":"File: .specify/scripts/bash/create-new-feature.sh (lines 137-181)\n\nCritical logic with NO test coverage:\n- Word filtering with stop-words\n- Acronym detection\n- Unicode/special character handling\n- Max length boundary (244 bytes)\n- Empty/single-word descriptions\n\nRisk: HIGH - affects all branch creation\n\nFix:\n- Create test suite with edge cases\n- Test stop-word filtering accuracy\n- Test boundary conditions\n\nSeverity: 
HIGH","status":"closed","priority":2,"issue_type":"task","created_at":"2025-12-24T02:51:00.311664646-05:00","updated_at":"2026-01-02T00:53:35.147800477-05:00","closed_at":"2026-01-02T00:53:35.147800477-05:00","close_reason":"Created test suite with 27 tests covering stop words, acronyms, word limits, special chars, unicode, edge cases, and fallback logic"} {"id":"skills-ybq","title":"Reorganize lens directory structure","description":"Current structure puts ops lenses as subdirectory of code-review lenses:\n\n```\n~/.config/lenses/ \u003c- code-review lenses\n~/.config/lenses/ops/ \u003c- ops-review lenses\n```\n\nThis is asymmetric. Consider:\n\nOption A: Separate top-level directories\n```\n~/.config/lenses/code-review/\n~/.config/lenses/ops-review/\n```\n\nOption B: Keep flat but with prefixes\n```\n~/.config/lenses/code-*.md\n~/.config/lenses/ops-*.md\n```\n\nOption C: Per-skill lens directories\n```\n~/.claude/skills/code-review/lenses/\n~/.claude/skills/ops-review/lenses/\n```\n\nRequires updating:\n- modules/ai-skills.nix (deployment paths)\n- skills/code-review/SKILL.md (expected paths)\n- skills/ops-review/SKILL.md (expected paths)","status":"closed","priority":3,"issue_type":"task","created_at":"2026-01-01T21:57:06.726997606-05:00","created_by":"dan","updated_at":"2026-01-02T00:24:53.647409845-05:00","closed_at":"2026-01-02T00:24:53.647409845-05:00","close_reason":"Reorganized lens directories: code-review → ~/.config/lenses/code/, ops-review → ~/.config/lenses/ops/. Updated ai-skills.nix, SKILL.md, and README references."} {"id":"skills-yxv","title":"worklog: extract hardcoded path to variable","description":"SKILL.md repeats ~/.claude/skills/worklog/ path 4-5 times. Define SKILL_ROOT once, reference throughout. 
Found by bloat+smells lens review.","status":"closed","priority":3,"issue_type":"task","created_at":"2025-12-25T02:03:15.831699081-05:00","updated_at":"2025-12-27T10:05:51.532722628-05:00","closed_at":"2025-12-27T10:05:51.532722628-05:00","close_reason":"Closed"} +{"id":"skills-zp5","title":"Create skills marketplace.json registry","description":"Central registry of all skills for plugin discovery. Follow emes marketplace pattern.","status":"open","priority":3,"issue_type":"task","created_at":"2026-01-09T10:59:24.933190155-08:00","created_by":"dan","updated_at":"2026-01-09T10:59:24.933190155-08:00","dependencies":[{"issue_id":"skills-zp5","depends_on_id":"skills-6x1","type":"blocks","created_at":"2026-01-09T10:59:33.223533468-08:00","created_by":"dan"}]} diff --git a/docs/emes-conversion-guide.md b/docs/emes-conversion-guide.md new file mode 100644 index 0000000..8432dda --- /dev/null +++ b/docs/emes-conversion-guide.md @@ -0,0 +1,124 @@ +# emes Plugin Conversion Guide + +Converting skills to emes-style plugin architecture for portability. + +## Background + +The [emes org](https://github.com/evil-mind-evil-sword) builds modular AI agent tools: +- **tissue**: Git-native issue tracking (machine-first) +- **idle**: Quality gate (blocks exit until reviewer approves) +- **jwz**: Async messaging with identity/git context +- **marketplace**: Plugin distribution registry + +## Key Principles + +1. **Pull context on-demand** - Don't inject large prompts upfront +2. **Mechanical enforcement** - Use hooks, not prompt instructions +3. **References over inline** - Point to files, don't embed content +4. 
**Machine-first interfaces** - JSON output, non-interactive + +## Directory Structure + +**Before (current):** +``` +skills/my-skill/ +├── SKILL.md # Instructions with YAML frontmatter +├── README.md # Human docs +└── scripts/ # Supporting scripts (optional) +``` + +**After (emes-style):** +``` +skills/my-skill/ +├── .claude-plugin/ +│ └── plugin.json # Metadata, version, hooks +├── skills/ +│ └── my-skill.md # Auto-discovered skill (copy of SKILL.md) +├── hooks/ +│ └── hooks.json # Optional lifecycle hooks +├── SKILL.md # Keep for Nix deployment (backward compat) +├── README.md +└── scripts/ +``` + +## plugin.json Schema + +```json +{ + "name": "my-skill", + "description": "Brief description", + "version": "1.0.0", + "author": { + "name": "author-name" + }, + "license": "MIT", + "keywords": ["keyword1", "keyword2"], + "hooks": { + "SessionStart": [...], + "PostToolUse": [...] + } +} +``` + +## Conversion Checklist + +- [ ] Create `.claude-plugin/plugin.json` with metadata +- [ ] Copy `SKILL.md` to `skills/<name>.md` +- [ ] Add hooks if skill needs lifecycle events +- [ ] Keep original `SKILL.md` for Nix deployment +- [ ] Test with `/plugin install ./path/to/skill` +- [ ] Register in marketplace.json (optional) + +## Hook Events + +Available hook events: +- `SessionStart` - On session begin +- `SessionEnd` - On session end +- `PreToolUse` - Before tool execution +- `PostToolUse` - After tool execution +- `PreCompact` - Before context compaction +- `UserPromptSubmit` - On user message +- `Stop` - When agent stops +- `SubagentStop` - When subagent stops +- `Notification` - On notifications + +## Example: Quality Gate (idle pattern) + +For skills that need review before completion: + +```json +{ + "hooks": { + "Stop": [ + { + "matcher": "", + "hooks": [ + { + "type": "command", + "command": "check-review-status.sh" + } + ] + } + ] + } +} +``` + +## Testing + +```bash +# Install locally +/plugin install ./skills/my-skill + +# Or add to marketplace and install +/plugin 
marketplace add owner/repo +/plugin install my-skill@marketplace-name +``` + +## Dual Deployment + +We maintain both: +1. **Nix deployment** (system-level) - Uses `SKILL.md` at root +2. **Plugin discovery** (Claude Code) - Uses `skills/<name>.md` + +This allows skills to work in both environments. diff --git a/skills/orch/.claude-plugin/plugin.json b/skills/orch/.claude-plugin/plugin.json new file mode 100644 index 0000000..70322dd --- /dev/null +++ b/skills/orch/.claude-plugin/plugin.json @@ -0,0 +1,16 @@ +{ + "name": "orch", + "description": "Query multiple AI models for consensus decisions, second opinions, and devil's advocate analysis.", + "version": "1.0.0", + "author": { + "name": "dan" + }, + "license": "MIT", + "keywords": [ + "multi-model", + "consensus", + "ai-orchestration", + "second-opinion", + "devils-advocate" + ] +} diff --git a/skills/orch/skills/orch.md b/skills/orch/skills/orch.md new file mode 100644 index 0000000..c659817 --- /dev/null +++ b/skills/orch/skills/orch.md @@ -0,0 +1,314 @@ +--- +name: orch +description: Query multiple AI models for consensus decisions, second opinions, and devil's advocate analysis using the orch CLI. +--- + +# Orch - Multi-Model Consensus Skill + +Query multiple AI models simultaneously and aggregate their perspectives. Use this when you need: +- A second opinion on your reasoning +- Multiple perspectives on a decision +- Devil's advocate analysis +- Brainstorming from diverse viewpoints + +## When to Use + +Invoke this skill when: +- Making architectural or design decisions ("Should we use X or Y?") +- Reviewing your own proposed solution before presenting to user +- The user asks for multiple AI perspectives +- You want to stress-test an approach with opposing viewpoints +- Complex tradeoffs where different perspectives would help + +## Invocation + +The `orch` CLI is available globally: + +```bash +orch [args...] +``` + +## Commands + +### orch consensus + +Query multiple models for their verdict on a question. 
+ +```bash +orch consensus "PROMPT" MODEL1 MODEL2 [MODEL3...] +``` + +**Model Aliases** (use these): + +| Alias | Model | Notes | +|-------|-------|-------| +| `flash` | gemini-3-flash-preview | Fast, free | +| `gemini` | gemini-3-pro-preview | Strong reasoning, free | +| `gpt` / `gpt5` | gpt-5.2 | Strong reasoning | +| `gpt4` | gpt-4o | Legacy | +| `claude` / `sonnet` | claude-sonnet-4.5 | Balanced (via OpenRouter) | +| `haiku` | claude-haiku-4.5 | Fast, cheap | +| `opus` | claude-opus-4.5 | Strongest, expensive | +| `deepseek` | deepseek-v3.2 | Good value | +| `r1` | deepseek-r1-0528 | Reasoning model, expensive | +| `qwen` | qwen3-235b-a22b | Good value | +| `qwen-fast` | qwen3-8b | Very fast/cheap | +| `glm` | glm-4.7 | Reasoning capable | +| `sonar` | perplexity/sonar | Web search built-in | +| `sonar-pro` | perplexity/sonar-pro | Better web search | + +Use `orch models` to see all available models with pricing and status. + +## Model Selection + +**Quick sanity check**: Use `flash qwen-fast` for fast, cheap validation. Good for "am I missing something obvious?" checks. + +**Standard consensus**: Use `flash gemini deepseek` for balanced perspectives across providers. Default for most decisions. + +**Deep analysis**: Include `r1` or `gpt` when stakes are high or reasoning is complex. These models think longer but cost more. Use `--allow-expensive` for r1/opus. + +**Diverse viewpoints**: Mix providers (Google + DeepSeek + OpenAI + Anthropic) rather than multiple models from one provider. Different training leads to genuinely different perspectives. + +**Cost-conscious**: `flash` and `qwen-fast` are 10-100x cheaper than premium models. Start cheap, escalate if needed. 
+
+**Options**:
+- `--mode vote` (default) - Models give Support/Oppose/Neutral verdict
+- `--mode brainstorm` - Generate ideas without judgment
+- `--mode critique` - Find flaws and weaknesses
+- `--mode open` - Freeform responses, no structured output
+- `--temperature 0.1` - Lower = more focused (default 0.1)
+- `--file PATH` - Include file as context (can use multiple times)
+- `--websearch` - Enable web search (Gemini models only)
+- `--serial` - Run models in sequence instead of parallel
+- `--strategy` - Serial strategy: neutral (default), refine, debate, brainstorm
+- `--synthesize MODEL` - Aggregate all responses into summary using MODEL
+- `--allow-expensive` - Allow expensive/slow models (opus, r1)
+- `--timeout SECS` - Timeout per model (default 300)
+
+**Stances** (devil's advocate):
+Append `:for`, `:against`, or `:neutral` to bias a model's perspective:
+```bash
+orch consensus "Should we rewrite in Rust?" gpt:for claude:against gemini:neutral
+```
+
+**Stdin piping**:
+```bash
+cat code.py | orch consensus "Is this implementation correct?" flash gemini
+```
+
+### orch chat
+
+Single-model conversation for deeper exploration:
+```bash
+orch chat "MESSAGE" --model gemini
+```
+
+Options:
+- `--model MODEL` - Model to use (default: gemini)
+- `--session ID` - Continue an existing session
+- `--format json` - Return structured output with session_id
+- `--file PATH` - Attach file
+- `--websearch` / `--no-websearch` - Toggle search (default: on)
+- `--allow-expensive` - Allow expensive models
+
+Use chat instead of consensus when:
+- You need iterative refinement through follow-up questions
+- The problem requires deeper exploration than a single query
+- You want to build on previous responses
+
+### orch models
+
+List and inspect available models:
+```bash
+orch models                  # List all models with status
+orch models resolve ALIAS    # Show details for a specific alias
+```
+
+### orch sessions
+
+Manage conversation sessions:
+```bash
+orch sessions list           # List all sessions
+orch sessions show ID        # Show session details
+orch sessions clean 7d       # Delete sessions older than 7 days
+orch sessions export ID      # Export session as JSON
+```
+
+## Usage Patterns
+
+### Quick Second Opinion
+When you've reasoned through something and want validation:
+```bash
+orch consensus "I think we should use SQLite for this because [reasons]. Is this sound?" flash gemini
+```
+
+### Architecture Decision
+When facing a tradeoff:
+```bash
+orch consensus "Microservices vs monolith for a 3-person team building an e-commerce site?" flash gemini gpt --mode vote
+```
+
+### Code Review
+Include the code as context:
+```bash
+orch consensus "Is this error handling approach correct and complete?" flash gemini --file src/handler.py
+```
+
+### Devil's Advocate
+Get opposing viewpoints deliberately:
+```bash
+orch consensus "Should we adopt Kubernetes?" gpt:for claude:against flash:neutral
+```
+
+### Brainstorm
+Generate diverse ideas:
+```bash
+orch consensus "How could we improve the CI/CD pipeline?" flash gemini deepseek --mode brainstorm
+```
+
+### Critique Your Work
+Find weaknesses before presenting:
+```bash
+orch consensus "What are the flaws in this API design?" flash gemini --file api-spec.yaml --mode critique
+```
+
+### Synthesize Responses
+Get a unified summary from multiple perspectives:
+```bash
+orch consensus "Evaluate this architecture" flash gemini gpt --synthesize gemini
+```
+
+### Use Reasoning Models
+For complex analysis requiring deep thinking:
+```bash
+orch consensus "Analyze the security implications" r1 gemini --allow-expensive
+```
+
+## Conversational Patterns
+
+### Session-Based Multi-Turn
+
+Start a conversation and continue it with follow-ups:
+
+```bash
+# Initial query - capture session ID
+RESPONSE=$(orch chat "Analyze this error log" --model gemini --format json --file error.log)
+SESSION=$(echo "$RESPONSE" | jq -r '.session_id')
+
+# Follow up with context preserved
+orch chat "What could cause this in a containerized environment?" --model gemini --session "$SESSION"
+
+# Dig deeper
+orch chat "How would I debug this?" --model gemini --session "$SESSION"
+```
+
+### Session Inspection
+
+Review what happened in a conversation:
+
+```bash
+# List recent sessions
+orch sessions list
+
+# Show full conversation
+orch sessions show "$SESSION"
+
+# Show just last 2 exchanges
+orch sessions show "$SESSION" --last 2 --format text
+
+# Export for archival
+orch sessions export "$SESSION" > conversation.json
+```
+
+### Cross-Model Dialogue
+
+Get one model's analysis, then ask another to respond:
+
+```bash
+# Get Claude's take
+CLAUDE=$(orch chat "Review this API design" --model claude --format json --file api.yaml)
+CLAUDE_SAYS=$(echo "$CLAUDE" | jq -r '.content')
+
+# Ask Gemini to critique Claude's review
+orch chat "Claude reviewed an API and said:
+
+$CLAUDE_SAYS
+
+Do you agree? What did Claude miss?" --model gemini
+```
+
+### Iterative Refinement
+
+Build up a solution through conversation:
+
+```bash
+# Start with requirements
+SESSION=$(orch chat "Design a caching strategy for this service" --model gemini --format json --file service.py | jq -r '.session_id')
+
+# Add constraints
+orch chat "Add constraint: must work in multi-region deployment" --model gemini --session "$SESSION"
+
+# Request implementation
+orch chat "Show me the implementation" --model gemini --session "$SESSION"
+```
+
+### When to Use Conversations vs Consensus
+
+| Scenario | Use | Why |
+|----------|-----|-----|
+| Quick decision validation | `consensus` | Parallel opinions, fast |
+| Deep problem exploration | `chat` with sessions | Build context iteratively |
+| Multiple perspectives needed | `consensus` | Different viewpoints simultaneously |
+| Follow-up questions likely | `chat` with sessions | Preserve conversation state |
+| Stress-testing an idea | `consensus` with stances | Devil's advocate pattern |
+| Explaining your reasoning | `chat` | Interactive dialogue |
+| Complex multi-step analysis | `chat` then `consensus` | Explore, then validate |
+
+### Combined Pattern: Explore Then Validate
+
+Use chat to develop an idea, then consensus to validate:
+
+```bash
+# Explore with one model
+SESSION=$(orch chat "Help me design error handling for this CLI" --model gemini --format json | jq -r '.session_id')
+orch chat "What about retry logic?" --model gemini --session "$SESSION"
+DESIGN=$(orch chat "Summarize the design we arrived at" --model gemini --session "$SESSION" --format json | jq -r '.content')
+
+# Validate with consensus
+orch consensus "Is this error handling design sound?
+
+$DESIGN" flash claude deepseek --mode critique
+```
+
+## Output Format
+
+Vote mode returns structured verdicts:
+```
+┌─────────────────────────────────────────────────────────────┐
+│ CONSENSUS: MIXED                                            │
+│ SUPPORT: 2 OPPOSE: 1 NEUTRAL: 0                             │
+└─────────────────────────────────────────────────────────────┘
+
+[flash] gemini-3-flash-preview - SUPPORT
+Reasoning: ...
+
+[gemini] gemini-3-pro-preview - SUPPORT
+Reasoning: ...
+
+[claude] claude-sonnet-4.5 - OPPOSE
+Reasoning: ...
+```
+
+## Guidelines
+
+1. **Use for genuine uncertainty** - Don't use orch for trivial decisions or to avoid thinking
+2. **Provide context** - Better prompts get better consensus; use `--file` when relevant
+3. **Choose models wisely** - flash/qwen-fast for quick checks, r1/opus for complex reasoning
+4. **Consider stances** - Devil's advocate is powerful for stress-testing ideas
+5. **Parse the reasoning** - The verdict matters less than understanding the reasoning
+6. **Mind the cost** - opus and r1 require `--allow-expensive`; use cheaper models for iteration
+
+## Requirements
+
+- `orch` CLI installed (via home-manager or system packages)
+- API keys configured: GEMINI_API_KEY, OPENAI_API_KEY, OPENROUTER_KEY
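
The requirements above can be checked before the first `orch` call with a short preflight script. This is a minimal sketch under stated assumptions: the function name `missing_orch_requirements` is hypothetical, and it assumes "configured" means the three keys are set as non-empty environment variables, which the list above does not spell out.

```shell
# Hypothetical preflight check: print each missing requirement, one per line.
# Assumes the API keys are supplied as environment variables.
# The body runs in a subshell so the loop variables do not leak.
missing_orch_requirements() (
  command -v orch >/dev/null 2>&1 || echo "orch (CLI not on PATH)"
  for var in GEMINI_API_KEY OPENAI_API_KEY OPENROUTER_KEY; do
    # Indirect expansion via eval keeps this portable to plain sh.
    eval "value=\${$var:-}"
    [ -n "$value" ] || echo "$var"
  done
)

missing=$(missing_orch_requirements)
if [ -n "$missing" ]; then
  printf 'Missing requirements:\n%s\n' "$missing" >&2
else
  echo "orch requirements look satisfied"
fi
```

The check only reports what is absent; it does not validate that a key is accepted by its provider.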