diff --git a/.beads/issues.jsonl b/.beads/issues.jsonl index 9ca3621..5da5f21 100644 --- a/.beads/issues.jsonl +++ b/.beads/issues.jsonl @@ -27,7 +27,7 @@ {"id":"skills-7sh","title":"Set up bd-issue-tracking Claude Code skill from beads repo","description":"Install the beads Claude Code skill from https://github.com/steveyegge/beads/tree/main/examples/claude-code-skill\n\nThis skill teaches Claude how to effectively use beads for issue tracking across multi-session coding workflows. It provides strategic guidance on when/how to use beads, not just command syntax.\n\nFiles to install to ~/.claude/skills/bd-issue-tracking/:\n- SKILL.md - Core workflow patterns and decision criteria\n- BOUNDARIES.md - When to use beads vs markdown alternatives\n- CLI_REFERENCE.md - Complete command documentation\n- DEPENDENCIES.md - Relationship types and patterns\n- WORKFLOWS.md - Step-by-step procedures\n- ISSUE_CREATION.md - Quality guidelines\n- RESUMABILITY.md - Making work resumable across sessions\n- STATIC_DATA.md - Using beads as reference databases\n\nCan symlink or copy the files. Restart Claude Code after install.","status":"closed","priority":2,"issue_type":"task","created_at":"2025-12-03T17:53:43.254007992-08:00","updated_at":"2025-12-03T20:04:53.416579381-08:00","closed_at":"2025-12-03T20:04:53.416579381-08:00"} {"id":"skills-8cc","title":"Remove dead code: unused ARGS variable","description":"File: .specify/scripts/bash/create-new-feature.sh\n\nLine 8: ARGS=() declared but never used\nLine 251: export SPECIFY_FEATURE - unclear if used downstream\n\nFix:\n- Remove unused ARGS declaration\n- Verify SPECIFY_FEATURE is used or remove\n\nSeverity: LOW","status":"open","priority":4,"issue_type":"task","created_at":"2025-12-24T02:50:59.332192076-05:00","updated_at":"2025-12-24T02:50:59.332192076-05:00"} {"id":"skills-8d4","title":"Compare CLI_REFERENCE.md with upstream","status":"closed","priority":2,"issue_type":"task","created_at":"2025-12-03T20:15:53.268324087-08:00","updated_at":"2025-12-03T20:17:26.552616779-08:00","closed_at":"2025-12-03T20:17:26.552616779-08:00","dependencies":[{"issue_id":"skills-8d4","depends_on_id":"skills-ebh","type":"discovered-from","created_at":"2025-12-03T20:15:53.27265681-08:00","created_by":"daemon","metadata":"{}"}]} -{"id":"skills-8d9","title":"Add conversational patterns to orch skill","description":"## Context\nThe orch skill currently documents consensus and single-shot chat, but doesn't\nteach agents how to use orch for multi-turn conversations with external AIs.\n\n## Goal\nAdd documentation and patterns for agent-driven conversations where the calling\nagent (Claude Code) orchestrates multi-turn dialogues using orch primitives.\n\n## Patterns to document\n\n### Session-based multi-turn\n```bash\n# Initial query\nRESPONSE=$(orch chat \"Analyze this\" --model claude --format json)\nSESSION=$(echo \"$RESPONSE\" | jq -r .session_id)\n\n# Continue conversation\norch chat \"Elaborate on X\" --model claude --session $SESSION\n\n# Inspect state\norch sessions info $SESSION\norch sessions show $SESSION --last 2 --format text\n```\n\n### Cross-model dialogue\n```bash\n# Get one model's take\nCLAUDE=$(orch chat \"Review this\" --model claude --format json)\nCLAUDE_SAYS=$(echo \"$CLAUDE\" | jq -r '.responses[0].content')\n\n# Ask another model to respond\norch chat \"Claude said: $CLAUDE_SAYS\n\nWhat's your perspective?\" --model gemini\n```\n\n### When to use conversations vs consensus\n- Consensus: quick parallel opinions on a decision\n- Conversation: deeper exploration, follow-up questions, iterative refinement\n\n## Files\n- skills/orch/SKILL.md\n\n## Related\n- orch-c3r: Design: Session introspection for agent-driven conversations (in orch repo)","status":"open","priority":2,"issue_type":"feature","created_at":"2025-12-18T19:57:28.201494288-08:00","updated_at":"2025-12-18T19:57:28.201494288-08:00"} +{"id":"skills-8d9","title":"Add conversational patterns to orch skill","description":"## Context\nThe orch skill currently documents consensus and single-shot chat, but doesn't\nteach agents how to use orch for multi-turn conversations with external AIs.\n\n## Goal\nAdd documentation and patterns for agent-driven conversations where the calling\nagent (Claude Code) orchestrates multi-turn dialogues using orch primitives.\n\n## Patterns to document\n\n### Session-based multi-turn\n```bash\n# Initial query\nRESPONSE=$(orch chat \"Analyze this\" --model claude --format json)\nSESSION=$(echo \"$RESPONSE\" | jq -r .session_id)\n\n# Continue conversation\norch chat \"Elaborate on X\" --model claude --session $SESSION\n\n# Inspect state\norch sessions info $SESSION\norch sessions show $SESSION --last 2 --format text\n```\n\n### Cross-model dialogue\n```bash\n# Get one model's take\nCLAUDE=$(orch chat \"Review this\" --model claude --format json)\nCLAUDE_SAYS=$(echo \"$CLAUDE\" | jq -r '.responses[0].content')\n\n# Ask another model to respond\norch chat \"Claude said: $CLAUDE_SAYS\n\nWhat's your perspective?\" --model gemini\n```\n\n### When to use conversations vs consensus\n- Consensus: quick parallel opinions on a decision\n- Conversation: deeper exploration, follow-up questions, iterative refinement\n\n## Files\n- skills/orch/SKILL.md\n\n## Related\n- orch-c3r: Design: Session introspection for agent-driven conversations (in orch repo)","status":"in_progress","priority":2,"issue_type":"feature","created_at":"2025-12-18T19:57:28.201494288-08:00","updated_at":"2025-12-29T15:33:28.962245864-05:00"} {"id":"skills-8v0","title":"Consolidate skill list definitions (flake.nix + ai-skills.nix)","description":"Skill list duplicated in:\n- flake.nix (lines 15-27)\n- modules/ai-skills.nix (lines 8-18)\n\nIssues:\n- Manual sync required when adding skills\n- No validation that referenced skills exist\n\nFix:\n- Single source of truth for skill list\n- Consider generating one from the other\n\nSeverity: MEDIUM","status":"open","priority":3,"issue_type":"task","created_at":"2025-12-24T02:51:14.432158871-05:00","updated_at":"2025-12-24T02:51:14.432158871-05:00"} {"id":"skills-8y6","title":"Define skill versioning strategy","description":"Git SHA alone is insufficient. Need tuple approach:\n\n- skill_source_rev: git SHA (if available)\n- skill_content_hash: hash of SKILL.md + scripts\n- runtime_ref: flake.lock hash or Nix store path\n\nQuestions to resolve:\n- Do Protos pin to versions (stable but maintenance) or float on latest (risky)?\n- How to handle breaking changes in skills?\n- Record in wisp trace vs proto definition?\n\nFrom consensus: both models flagged versioning instability as high severity.","status":"closed","priority":2,"issue_type":"task","created_at":"2025-12-23T19:49:30.839064445-05:00","updated_at":"2025-12-23T20:55:04.439779336-05:00","closed_at":"2025-12-23T20:55:04.439779336-05:00","close_reason":"ADRs revised with orch consensus feedback"} {"id":"skills-9af","title":"spec-review: Add spike/research task handling","description":"Tasks like 'Investigate X' can linger without clear outcomes.\n\nAdd to REVIEW_TASKS:\n- Flag research/spike tasks\n- Require timebox and concrete outputs (decision record, prototype, risks)\n- Pattern for handling unknowns","status":"closed","priority":3,"issue_type":"task","created_at":"2025-12-15T00:23:26.887719136-08:00","updated_at":"2025-12-15T14:08:13.441095034-08:00","closed_at":"2025-12-15T14:08:13.441095034-08:00"} @@ -50,7 +50,7 @@ {"id":"skills-e8h","title":"Investigate waybar + niri integration improvements","description":"Look into waybar configuration and niri compositor integration.\n\nPotential areas:\n- Waybar modules for niri workspaces\n- Status indicators\n- Integration with existing niri-window-capture skill\n- Custom scripts in pkgs/waybar-scripts\n\nRelated: dotfiles has home/waybar.nix (196 lines) and pkgs/waybar-scripts/","status":"closed","priority":2,"issue_type":"task","created_at":"2025-12-28T20:11:23.115445797-05:00","created_by":"dan","updated_at":"2025-12-28T20:37:16.465731945-05:00","closed_at":"2025-12-28T20:37:16.465731945-05:00","close_reason":"Moved to dotfiles repo - waybar config lives there"} {"id":"skills-e96","title":"skill: semantic-grep using LSP","description":"Use workspace/symbol, documentSymbol, and references instead of ripgrep.\n\nExample: 'Find all places where we handle User objects but only where we modify the email field directly'\n- LSP references finds all User usages\n- Filter by AST analysis for .email assignments\n- Return hit list for bead or further processing\n\nBetter than regex for Go interfaces, Rust traits, TS types.","status":"open","priority":3,"issue_type":"task","created_at":"2025-12-24T02:29:57.119983837-05:00","updated_at":"2025-12-24T02:29:57.119983837-05:00","dependencies":[{"issue_id":"skills-e96","depends_on_id":"skills-gga","type":"blocks","created_at":"2025-12-24T02:30:06.632906383-05:00","created_by":"daemon"}]} {"id":"skills-ebh","title":"Compare bd-issue-tracking skill files with upstream","description":"Fetch upstream beads skill files and compare with our condensed versions to identify differences","status":"closed","priority":2,"issue_type":"task","created_at":"2025-12-03T20:14:07.886535859-08:00","updated_at":"2025-12-03T20:19:37.579815337-08:00","closed_at":"2025-12-03T20:19:37.579815337-08:00"} -{"id":"skills-ebl","title":"Benchmark vision model UI understanding","description":"## Goal\nMeasure how well vision models can answer UI questions from screenshots.\n\n## Test cases\n1. **Element location**: \"Where is the Save button?\" → coordinates\n2. **Element identification**: \"What buttons are visible?\" → list\n3. **State detection**: \"Is the checkbox checked?\" → boolean\n4. **Text extraction**: \"What does the error message say?\" → text\n5. **Layout understanding**: \"What's in the sidebar?\" → structure\n\n## Metrics\n- Accuracy: Does the answer match ground truth?\n- Precision: How close are coordinates to actual element centers?\n- Latency: Time from query to response\n- Cost: Tokens consumed per query\n\n## Prompt engineering questions\n- Does adding a grid overlay help coordinate precision?\n- What prompt format gives most actionable coordinates?\n- Can we get bounding boxes vs point coordinates?\n\n## Comparison baseline\n- Manual annotation of test screenshots\n- AT-SPI data (once enabled) as ground truth\n\n## Depends on\n- Test screenshots from real apps\n- Ground truth annotations","status":"in_progress","priority":2,"issue_type":"task","created_at":"2025-12-17T14:13:10.038933798-08:00","updated_at":"2025-12-29T15:07:46.686104229-05:00"} +{"id":"skills-ebl","title":"Benchmark vision model UI understanding","description":"## Goal\nMeasure how well vision models can answer UI questions from screenshots.\n\n## Test cases\n1. **Element location**: \"Where is the Save button?\" → coordinates\n2. **Element identification**: \"What buttons are visible?\" → list\n3. **State detection**: \"Is the checkbox checked?\" → boolean\n4. **Text extraction**: \"What does the error message say?\" → text\n5. **Layout understanding**: \"What's in the sidebar?\" → structure\n\n## Metrics\n- Accuracy: Does the answer match ground truth?\n- Precision: How close are coordinates to actual element centers?\n- Latency: Time from query to response\n- Cost: Tokens consumed per query\n\n## Prompt engineering questions\n- Does adding a grid overlay help coordinate precision?\n- What prompt format gives most actionable coordinates?\n- Can we get bounding boxes vs point coordinates?\n\n## Comparison baseline\n- Manual annotation of test screenshots\n- AT-SPI data (once enabled) as ground truth\n\n## Depends on\n- Test screenshots from real apps\n- Ground truth annotations","status":"closed","priority":2,"issue_type":"task","created_at":"2025-12-17T14:13:10.038933798-08:00","updated_at":"2025-12-29T15:26:19.822655148-05:00","closed_at":"2025-12-29T15:26:19.822655148-05:00","close_reason":"Benchmark complete. Vision models excellent for semantic understanding, approximate for coordinates. Recommend hybrid AT-SPI + vision. See docs/research/vision-ui-benchmark-2025-12-29.md"} {"id":"skills-f2p","title":"Skills + Molecules Integration","description":"Integrate skills with beads molecules system.\n\nDesign work tracked in dotfiles (dotfiles-jjb).\n\nComponents:\n- Checklist support (lightweight skills)\n- Audit integration (bd audit for skill execution)\n- Skill frontmatter for triggers/tracking\n- Proto packaging alongside skills\n\nSee: ~/proj/dotfiles ADR work","status":"closed","priority":2,"issue_type":"epic","created_at":"2025-12-23T17:58:55.999438985-05:00","updated_at":"2025-12-23T19:22:38.577280129-05:00","closed_at":"2025-12-23T19:22:38.577280129-05:00","close_reason":"Superseded by skills-4u0 (migrated from dotfiles)","dependencies":[{"issue_id":"skills-f2p","depends_on_id":"skills-vpy","type":"blocks","created_at":"2025-12-23T17:59:17.976956454-05:00","created_by":"daemon"},{"issue_id":"skills-f2p","depends_on_id":"skills-u3d","type":"blocks","created_at":"2025-12-23T17:59:18.015216054-05:00","created_by":"daemon"}]} {"id":"skills-fo3","title":"Compare WORKFLOWS.md with upstream","status":"closed","priority":2,"issue_type":"task","created_at":"2025-12-03T20:15:54.283175561-08:00","updated_at":"2025-12-03T20:19:28.897037199-08:00","closed_at":"2025-12-03T20:19:28.897037199-08:00","dependencies":[{"issue_id":"skills-fo3","depends_on_id":"skills-ebh","type":"discovered-from","created_at":"2025-12-03T20:15:54.286009672-08:00","created_by":"daemon","metadata":"{}"}]} {"id":"skills-fvc","title":"Code Review: {{target}}","description":"Multi-lens code review workflow for {{target}}.\n\n## Philosophy\nThe LLM stays in the loop at every step - this is agent-assisted review, not automated parsing. The agent applies judgment about what's worth filing, how to prioritize, and what context to include.\n\n## Variables\n- target: File or directory to review\n\n## Workflow\n1. Explore codebase to find candidates (if target is directory)\n2. Run lenses via orch consensus for multi-model perspective\n3. Analyze findings - LLM synthesizes across lenses and models\n4. File issues with judgment - group related, set priorities, add context\n5. Summarize for digest\n\n## Lenses Available\n- bloat: size, complexity, SRP violations\n- smells: readability, naming, control flow\n- dead-code: unused, unreachable, obsolete\n- redundancy: duplication, YAGNI, parallel systems","status":"closed","priority":2,"issue_type":"epic","created_at":"2025-12-25T10:10:57.652098447-05:00","updated_at":"2025-12-26T23:22:41.408582818-05:00","closed_at":"2025-12-26T23:22:41.408582818-05:00","close_reason":"Replaced by /code-review skill","labels":["template"]} diff --git a/skills/orch/SKILL.md b/skills/orch/SKILL.md index a499ff3..c659817 100644 --- a/skills/orch/SKILL.md +++ b/skills/orch/SKILL.md @@ -98,18 +98,24 @@ cat code.py | orch consensus "Is this implementation correct?" flash gemini ### orch chat -Single-model conversation (when you don't need consensus): +Single-model conversation for deeper exploration: ```bash orch chat "MESSAGE" --model gemini ``` Options: - `--model MODEL` - Model to use (default: gemini) -- `--session ID` - Continue a session +- `--session ID` - Continue an existing session +- `--format json` - Return structured output with session_id - `--file PATH` - Attach file - `--websearch` / `--no-websearch` - Toggle search (default: on) - `--allow-expensive` - Allow expensive models +Use chat instead of consensus when: +- You need iterative refinement through follow-up questions +- The problem requires deeper exploration than a single query +- You want to build on previous responses + ### orch models List and inspect available models: @@ -178,6 +184,102 @@ For complex analysis requiring deep thinking: orch consensus "Analyze the security implications" r1 gemini --allow-expensive ``` +## Conversational Patterns + +### Session-Based Multi-Turn + +Start a conversation and continue it with follow-ups: + +```bash +# Initial query - capture session ID +RESPONSE=$(orch chat "Analyze this error log" --model gemini --format json --file error.log) +SESSION=$(echo "$RESPONSE" | jq -r '.session_id') + +# Follow up with context preserved +orch chat "What could cause this in a containerized environment?" --model gemini --session "$SESSION" + +# Dig deeper +orch chat "How would I debug this?" --model gemini --session "$SESSION" +``` + +### Session Inspection + +Review what happened in a conversation: + +```bash +# List recent sessions +orch sessions list + +# Show full conversation +orch sessions show "$SESSION" + +# Show just last 2 exchanges +orch sessions show "$SESSION" --last 2 --format text + +# Export for archival +orch sessions export "$SESSION" > conversation.json +``` + +### Cross-Model Dialogue + +Get one model's analysis, then ask another to respond: + +```bash +# Get Claude's take +CLAUDE=$(orch chat "Review this API design" --model claude --format json --file api.yaml) +CLAUDE_SAYS=$(echo "$CLAUDE" | jq -r '.content') + +# Ask Gemini to critique Claude's review +orch chat "Claude reviewed an API and said: + +$CLAUDE_SAYS + +Do you agree? What did Claude miss?" --model gemini +``` + +### Iterative Refinement + +Build up a solution through conversation: + +```bash +# Start with requirements +SESSION=$(orch chat "Design a caching strategy for this service" --model gemini --format json --file service.py | jq -r '.session_id') + +# Add constraints +orch chat "Add constraint: must work in multi-region deployment" --model gemini --session "$SESSION" + +# Request implementation +orch chat "Show me the implementation" --model gemini --session "$SESSION" +``` + +### When to Use Conversations vs Consensus + +| Scenario | Use | Why | +|----------|-----|-----| +| Quick decision validation | `consensus` | Parallel opinions, fast | +| Deep problem exploration | `chat` with sessions | Build context iteratively | +| Multiple perspectives needed | `consensus` | Different viewpoints simultaneously | +| Follow-up questions likely | `chat` with sessions | Preserve conversation state | +| Stress-testing an idea | `consensus` with stances | Devil's advocate pattern | +| Explaining your reasoning | `chat` | Interactive dialogue | +| Complex multi-step analysis | `chat` then `consensus` | Explore, then validate | + +### Combined Pattern: Explore Then Validate + +Use chat to develop an idea, then consensus to validate: + +```bash +# Explore with one model +SESSION=$(orch chat "Help me design error handling for this CLI" --model gemini --format json | jq -r '.session_id') +orch chat "What about retry logic?" --model gemini --session "$SESSION" +DESIGN=$(orch chat "Summarize the design we arrived at" --model gemini --session "$SESSION" --format json | jq -r '.content') + +# Validate with consensus +orch consensus "Is this error handling design sound? + +$DESIGN" flash claude deepseek --mode critique +``` + ## Output Format Vote mode returns structured verdicts: