# ADR-002: Skill Manifest Format

## Status

Draft (Revised)

## Context

Skills currently have minimal frontmatter (name, description). For molecule integration, we need skills to declare their interface so:

- Beads can validate proto→skill bindings before spawning
- Agents know what inputs to provide
- Traces can record what was expected vs actual
- Errors are clear when requirements aren't met

## Decision

Extend SKILL.md frontmatter with optional manifest fields. All new fields are optional for backward compatibility.

### Proposed Schema

```yaml
---
manifest_version: "1.0"
name: worklog
description: Create org-mode worklogs documenting work sessions.

# Version info for reproducibility
version: 1.0.0

# What the skill needs to run
inputs:
  required:
    - name: session_date
      description: Date of the session (YYYY-MM-DD)
      schema:
        type: string
        pattern: "^\\d{4}-\\d{2}-\\d{2}$"
  optional:
    - name: topic
      description: Brief topic descriptor for filename
      schema:
        type: string
        default: "session"
    - name: output_dir
      description: Directory for worklog output (relative to repo root)
      schema:
        type: string
        default: "docs/worklogs"
    - name: api_key
      description: API key for external service
      sensitive: true
      schema:
        type: string

# Environment requirements
env:
  required: []
  optional:
    - name: PROJECT
      description: Project name for context
    - name: API_TOKEN
      description: Authentication token
      sensitive: true

# Tools/commands that must be available
preconditions:
  commands:
    - cmd: git
      min_version: "2.40"
    - cmd: date
  files:
    - path: scripts/extract-metrics.sh
      base: skill_root
      description: Metrics extraction script
    - path: templates/worklog-template.org
      base: skill_root
      description: Worklog template

# What the skill produces
outputs:
  files:
    - pattern: "{{output_dir}}/{{session_date}}-{{topic}}.org"
      base: repo_root
      description: The generated worklog file
  artifacts: []

# Execution characteristics (hints for scheduling, not enforcement)
execution:
  idempotent: false   # Safe to re-run?
  destructive: false  # Modifies existing files?
  network: false      # Requires network access?
  interactive: false  # Requires user input during execution?
  timeout: 30         # Maximum execution time in seconds

# For security classification
sensitive: false  # Handles sensitive data?
---
```

### Field Definitions

#### `manifest_version`

Schema version for the manifest format itself. Allows parsers to handle breaking changes.

- Current version: `"1.0"`
- Semantic versioning: major version changes indicate breaking schema changes

#### `inputs`

Declares what the skill needs from the caller. Uses JSON Schema subset for type definitions.

- `required`: Must be provided or skill fails
- `optional`: Has defaults, can be overridden
- Each input has: `name`, `description`, `schema` (JSON Schema), optional `sensitive` flag
- `sensitive: true`: Mark inputs for redaction in traces

**JSON Schema Subset Supported:**

- `type`: string, number, integer, boolean, array, object
- `pattern`: regex for string validation
- `minimum`, `maximum`: numeric bounds
- `items`: array item schema
- `properties`: object property schemas
- `default`: default value
- `enum`: allowed values

#### `env`

Environment variables the skill reads.

- `required`: Skill fails if not set
- `optional`: Used if available, not fatal if missing
- `sensitive: true`: Mark for redaction

#### `preconditions`

What must exist before skill can run.

- `commands`: CLI tools that must be in PATH
  - `cmd`: command name (required)
  - `min_version`: minimum version (optional, e.g., "2.40")
  - `max_version`: maximum version (optional)
- `files`: Files that must exist
  - `path`: relative path (required)
  - `base`: path base - `skill_root`, `repo_root`, or `cwd` (default: `skill_root`)
  - `description`: what this file is for (optional)

**Path Resolution:**

- `skill_root`: Directory containing SKILL.md
- `repo_root`: Git repository root
- `cwd`: Current working directory
- Absolute paths (starting with `/` or `~`) are forbidden

#### `outputs`

What the skill produces.

- `files`: File patterns with `{{var}}` substitution
  - `pattern`: file path pattern
  - `base`: path base (default: `repo_root`)
  - `description`: what this output is for
- `artifacts`: Other outputs (stdout, state changes, etc.)

#### `execution`

Characteristics for scheduling and safety. **These are hints, not enforcement.**

- `idempotent`: Can run multiple times safely
- `destructive`: Modifies or deletes existing data
- `network`: Requires internet access
- `interactive`: Needs human input during run
- `timeout`: Maximum execution time in seconds

The runtime may use these hints to:

- Schedule non-destructive skills in parallel
- Warn before running destructive operations
- Set execution timeouts

These flags do NOT enforce sandboxing or isolation.

#### `sensitive`

Whether skill handles sensitive data.

- `false` (default): Standard tracing, eligible for elevation
- `true`: Aggressive redaction, elevation blocked without review

### Validation Rules

When beads spawns a molecule with `skill:` reference:

1. Check skill exists
2. Verify `manifest_version` is supported
3. Validate all required inputs are mapped
4. Validate input schemas using JSON Schema
5. Check preconditions (commands with versions, files)
6. Check required env vars are set
7. Warn if optional inputs have no mapping
8. Apply redaction rules for `sensitive` fields

### Backward Compatibility

- All new fields are optional
- Skills with only `name`/`description` continue to work
- Missing manifest = no validation (current behavior)
- Missing `manifest_version` = assume legacy format

## Consequences

### Positive

- Clear contracts between molecules and skills
- Better error messages (fail before execution, not during)
- JSON Schema provides standard, well-understood type system
- Sensitive data handling built into manifest
- Version constraints prevent runtime errors
- Explicit path resolution prevents portability issues

### Negative

- More frontmatter to maintain
- Risk of manifest drift from actual behavior
- JSON Schema subset may not cover all validation needs

### Neutral

- Existing skills unaffected until they opt-in
- Execution flags are hints, not enforcement

## Examples

### Minimal (current, still valid)

```yaml
---
name: simple-skill
description: Does something simple
---
```

### With inputs only

```yaml
---
manifest_version: "1.0"
name: greeter
description: Greets the user
inputs:
  required:
    - name: username
      description: Name of the user to greet
      schema:
        type: string
---
```

### With sensitive data

```yaml
---
manifest_version: "1.0"
name: deploy
description: Deploy application to production
inputs:
  required:
    - name: deploy_key
      description: SSH deployment key
      sensitive: true
      schema:
        type: string
env:
  required:
    - name: DEPLOY_TOKEN
      description: Authentication token
      sensitive: true
execution:
  network: true
  destructive: true
  timeout: 300
sensitive: true
---
```

### With version constraints

```yaml
---
manifest_version: "1.0"
name: analyze-git
description: Analyze git repository history
preconditions:
  commands:
    - cmd: git
      min_version: "2.40"
    - cmd: python3
      min_version: "3.11"
  files:
    - path: scripts/analyze.py
      base: skill_root
inputs:
  optional:
    - name: since
      description: Analyze commits since this date
      schema:
        type: string
        pattern: "^\\d{4}-\\d{2}-\\d{2}$"
        default: "2025-01-01"
outputs:
  files:
    - pattern: "reports/git-analysis-{{since}}.md"
      base: repo_root
execution:
  idempotent: true
  timeout: 60
---
```

## Open Questions

1. Should `outputs` be validated after execution?
2. How to handle skills that produce variable outputs?
3. Should we allow custom JSON Schema extensions?
4. Where does skill version come from - git tag, manual, or derived?
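
## Appendix: Validation Sketch

To make the validation rules concrete, here is a minimal sketch of steps 2, 3, 4, 7, and 8 in Python. It is an illustration under stated assumptions, not the beads implementation: the names `validate_binding`, `check_value`, `redact`, `SUPPORTED_MANIFEST_VERSIONS`, and `EXAMPLE_MANIFEST` are hypothetical, the manifest is assumed to be already parsed from frontmatter into a plain dict, and only `type`, `pattern`, and `enum` from the schema subset are checked. Patterns are evaluated with Python's `re` module, which differs in places from the regex dialect JSON Schema recommends.

```python
import re

# Hypothetical: the manifest versions this sketch understands.
SUPPORTED_MANIFEST_VERSIONS = {"1.0"}

# JSON Schema type names mapped to Python types (subset only).
_TYPES = {"string": str, "number": (int, float), "integer": int,
          "boolean": bool, "array": list, "object": dict}


def check_value(value, schema):
    """Return a list of problems for one value (type, pattern, enum only)."""
    problems = []
    expected = schema.get("type")
    if expected is not None:
        # bool is a subclass of int in Python, so reject it for numeric types.
        if expected in ("number", "integer") and isinstance(value, bool):
            problems.append(f"expected {expected}, got boolean")
        elif not isinstance(value, _TYPES[expected]):
            problems.append(f"expected {expected}, got {type(value).__name__}")
    if "pattern" in schema and isinstance(value, str):
        # JSON Schema patterns are unanchored, hence search rather than match.
        if re.search(schema["pattern"], value) is None:
            problems.append(f"does not match pattern {schema['pattern']}")
    if "enum" in schema and value not in schema["enum"]:
        problems.append(f"not one of {schema['enum']}")
    return problems


def validate_binding(manifest, provided):
    """Return (errors, warnings) for a proposed input mapping."""
    errors, warnings = [], []
    version = manifest.get("manifest_version")
    if version is None:
        return errors, warnings  # legacy format: no validation
    if version not in SUPPORTED_MANIFEST_VERSIONS:
        return [f"unsupported manifest_version: {version}"], warnings
    inputs = manifest.get("inputs", {})
    for group in ("required", "optional"):
        for spec in inputs.get(group, []):
            name = spec["name"]
            if name not in provided:
                if group == "required":
                    errors.append(f"missing required input: {name}")
                else:
                    warnings.append(f"optional input not mapped: {name}")
                continue
            for problem in check_value(provided[name], spec.get("schema", {})):
                errors.append(f"input {name}: {problem}")
    return errors, warnings


def redact(manifest, provided):
    """Replace values of inputs marked `sensitive: true` before tracing."""
    inputs = manifest.get("inputs", {})
    sensitive = {spec["name"]
                 for group in ("required", "optional")
                 for spec in inputs.get(group, [])
                 if spec.get("sensitive")}
    return {k: ("[REDACTED]" if k in sensitive else v)
            for k, v in provided.items()}


# A trimmed version of the worklog manifest, as a parsed dict.
EXAMPLE_MANIFEST = {
    "manifest_version": "1.0",
    "name": "worklog",
    "inputs": {
        "required": [{"name": "session_date",
                      "schema": {"type": "string",
                                 "pattern": r"^\d{4}-\d{2}-\d{2}$"}}],
        "optional": [{"name": "topic", "schema": {"type": "string"}},
                     {"name": "api_key", "sensitive": True,
                      "schema": {"type": "string"}}],
    },
}

if __name__ == "__main__":
    mapping = {"session_date": "2025-13", "api_key": "s3cret"}
    print(validate_binding(EXAMPLE_MANIFEST, mapping))
    print(redact(EXAMPLE_MANIFEST, mapping))
```

Returning errors and warnings as separate lists mirrors the split in the rules between failures (steps 2 to 6) and the advisory check in step 7. Precondition and environment checks (steps 5 and 6) are left out because they depend on the host system.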