# ADR-001: Skills and Molecules Integration

## Status

Parked (2025-12-28)

**Rationale:** Current simpler approach is working well:

- Skills as standalone entrypoints (not molecule steps)
- Agent judgment from description/SKILL.md sufficient for invocation
- Molecules/protos not actively used for workflow orchestration

Revisit when:

- Complex multi-agent orchestration becomes needed
- Steve Yegge's orchestration work provides new patterns
- Programmatic skill invocation has clear use cases

## Context

We have two complementary systems for agent-assisted work:

1. **Skills** (this repo): Procedural knowledge deployed via Nix/direnv. Skills define HOW to do things - scripts, prompts, and workflows that agents can invoke.
2. **Molecules** (beads 0.35+): Work tracking templates in beads. Molecules define WHAT work needs to be done - DAGs of issues that can be instantiated, tracked, and completed.

These systems evolved independently but have natural integration points. The question is: how should they connect?

### Current State

**Skills system:**

- Skills are directories under `~/.claude/skills/` (deployed via Nix)
- Each skill has a `SKILL.md` with frontmatter + prompt/instructions
- Skills are invoked by agents via `/skill-name` or automatically based on triggers
- No execution tracking beyond what the agent logs

**Molecules system (beads 0.35):**

- **Proto**: Template epic with `template` label, uses `{{var}}` placeholders
- **Mol**: Instantiated work from a proto (permanent, git-synced)
- **Wisp**: Ephemeral mol for operational work (gitignored, `.beads-wisp/`)
- **Hook**: Agent's attachment point for assigned work
- **Pin**: Assign mol to agent's hook

Key molecule commands:

```
bd mol spawn       # Create mol from proto
bd pour            # Spawn persistent mol
bd wisp create     # Spawn ephemeral mol
bd pin --for me    # Assign to self
bd mol squash      # Compress mol → digest
bd mol distill     # Extract proto from ad-hoc epic
```

### Problem Statement

1. Skills have no execution history - we can't replay, debug, or learn from past runs
2. Molecules track work but don't know which skills were used to complete them
3. Successful ad-hoc work patterns can't be easily promoted to reusable skills
4. No connection between "what was done" (mol) and "how it was done" (skill)

## Decision

Link skills and molecules via three mechanisms:

### 1. Skill References in Molecules

Add a `skill:` field to molecule nodes that references skills used during execution:

```yaml
# In a proto template
- title: "Generate worklog for {{session}}"
  skill: worklog
  description: "Document the work session"
```

When an agent works on a mol step that has a `skill:` reference, it knows which skill to invoke.

### 2. Wisp Execution Traces

Use wisps to capture skill execution traces. When a skill runs within a molecule context:

```yaml
# Wisp execution trace format
skill_ref: worklog
skill_version: "abc123"  # git SHA of skill
inputs:
  context: "session context..."
  env:
    PROJECT: "skills"
tool_calls:
  - cmd: "extract-metrics.sh"
    args: ["--session", "2025-12-23"]
    exit_code: 0
    duration_ms: 1234
checkpoints:
  - step: "metrics_extracted"
    summary: "Found 5 commits, 12 file changes"
    timestamp: "2025-12-23T19:30:00Z"
outputs:
  files_created:
    - "docs/worklogs/2025-12-23-session.org"
```

This enables:

- Replay: Re-run a skill with the same inputs
- Diff: Compare two executions of the same skill
- Debug: Understand what happened when something fails
- Regression testing: Detect when skill behavior changes

### 3. Elevation Pipeline

When a molecule completes successfully, offer to "elevate" it to a skill:

```
bd mol squash    # Compress execution history
bd elevate       # Analyze and generate skill draft
```

The elevation pipeline:

1. Analyze squashed trace for generalizable patterns
2. Extract variable inputs (things that changed between runs)
3. Generate SKILL.md draft with:
   - Frontmatter from mol metadata
   - Steps derived from trace checkpoints
   - Scripts extracted from tool_calls
4. Human approval gate before deployment

### Phase Transitions (Chemistry Metaphor)

```
Proto (solid) → pour → Mol (liquid) → squash → Digest (solid)
      ↓
Wisp (vapor) ← create ← Proto
      ↓
   execute → Trace
      ↓
   elevate → Skill draft
```

- **Solid**: Static templates (protos, digests, skills)
- **Liquid**: Active work being tracked (mols)
- **Vapor**: Ephemeral execution (wisps, traces)

## Consequences

### Positive

- **Traceability**: Know exactly how work was completed
- **Reusability**: Successful patterns become skills automatically
- **Debugging**: Execution traces make failures understandable
- **Learning**: System improves as more work is tracked

### Negative

- **Overhead**: Capturing traces adds complexity
- **Storage**: Wisp traces need cleanup strategy
- **Coupling**: Skills and beads become interdependent

### Neutral

- Skills remain usable without molecules (standalone invocation)
- Molecules remain usable without skills (manual work)
- Integration is opt-in per-proto via `skill:` field

## Implementation Plan

1. **Phase 1** (this ADR): Document the design
2. **Phase 2**: Define wisp execution trace format (skills-jeb)
3. **Phase 3**: Prototype elevation pipeline (skills-3em)
4. **Phase 4**: Test on worklog skill (skills-rex)

## Anti-Patterns to Avoid

1. **Over-instrumentation**: Don't trace every shell command. Focus on meaningful checkpoints.
2. **Forced coupling**: Don't require molecules to use skills or vice versa.
3. **Premature elevation**: Don't auto-generate skills from single executions. Wait for patterns.
4. **Trace bloat**: Wisps are ephemeral for a reason. Squash or burn, don't accumulate.

## Open Questions

1. How granular should skill_version be? Git SHA? Flake hash? Both?
2. Should traces capture stdout/stderr or just exit codes?
3. What's the minimum number of similar executions before suggesting elevation?
4. How do we handle skills that span multiple mol steps?

## References

- beads 0.35 molecule commands: `bd mol --help`, `bd wisp --help`, `bd pour --help`
- Skills repo: `~/proj/skills/`
- Existing skills: worklog, orch, niri-window-capture, spec-review, etc.
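
## Appendix: Variable Input Extraction (Sketch)

Step 2 of the elevation pipeline ("extract variable inputs") can be illustrated with a small sketch. It assumes traces have already been parsed into Python dicts shaped like the wisp execution trace format in this ADR. The helper names `flatten` and `variable_inputs` are hypothetical and not part of beads or the skills repo, and the rule "an input is variable if its value differs between runs" is one possible heuristic, not a settled design.

```python
from typing import Any


def flatten(value: Any, prefix: str = "") -> dict[str, Any]:
    """Flatten nested dicts into dotted keys: {"env": {"PROJECT": "x"}} -> {"env.PROJECT": "x"}."""
    if not isinstance(value, dict):
        return {prefix: value}
    flat: dict[str, Any] = {}
    for key, child in value.items():
        path = f"{prefix}.{key}" if prefix else key
        flat.update(flatten(child, path))
    return flat


def variable_inputs(traces: list[dict[str, Any]]) -> list[str]:
    """Return the dotted input keys whose values differ between runs (hypothetical heuristic)."""
    flat_inputs = [flatten(trace.get("inputs", {})) for trace in traces]
    all_keys = sorted({key for flat in flat_inputs for key in flat})
    missing = object()  # sentinel: a key absent from one run also counts as a difference
    varying = []
    for key in all_keys:
        values = [flat.get(key, missing) for flat in flat_inputs]
        if any(v != values[0] for v in values[1:]):
            varying.append(key)
    return varying


# Two illustrative traces of the same skill; only the session context changed.
traces = [
    {"skill_ref": "worklog", "inputs": {"context": "session A", "env": {"PROJECT": "skills"}}},
    {"skill_ref": "worklog", "inputs": {"context": "session B", "env": {"PROJECT": "skills"}}},
]
print(variable_inputs(traces))
```

Keys that stay constant across runs (here `env.PROJECT`) would be candidates for hard-coding in the generated SKILL.md draft, while varying keys would become skill parameters. With a single trace nothing varies, which is consistent with the "premature elevation" anti-pattern: one execution gives no signal about what should be parameterized.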