skills/docs/adr/002-skill-manifest-format.md

# ADR-002: Skill Manifest Format

## Status

Draft (Revised)

## Context

Skills currently have minimal frontmatter (`name`, `description`). For molecule integration, skills need to declare their interface so that:
- Beads can validate proto→skill bindings before spawning
- Agents know what inputs to provide
- Traces can record what was expected vs actual
- Errors are clear when requirements aren't met

## Decision

Extend SKILL.md frontmatter with optional manifest fields. All new fields are optional for backward compatibility.

### Proposed Schema

```yaml
---
manifest_version: "1.0"
name: worklog
description: Create org-mode worklogs documenting work sessions.

# Version info for reproducibility
version: 1.0.0

# What the skill needs to run
inputs:
  required:
    - name: session_date
      description: Date of the session (YYYY-MM-DD)
      schema:
        type: string
        pattern: "^\\d{4}-\\d{2}-\\d{2}$"
  optional:
    - name: topic
      description: Brief topic descriptor for filename
      schema:
        type: string
        default: "session"
    - name: output_dir
      description: Directory for worklog output (relative to repo root)
      schema:
        type: string
        default: "docs/worklogs"
    - name: api_key
      description: API key for external service
      sensitive: true
      schema:
        type: string

# Environment requirements
env:
  required: []
  optional:
    - name: PROJECT
      description: Project name for context
    - name: API_TOKEN
      description: Authentication token
      sensitive: true

# Tools/commands that must be available
preconditions:
  commands:
    - cmd: git
      min_version: "2.40"
    - cmd: date
  files:
    - path: scripts/extract-metrics.sh
      base: skill_root
      description: Metrics extraction script
    - path: templates/worklog-template.org
      base: skill_root
      description: Worklog template

# What the skill produces
outputs:
  files:
    - pattern: "{{output_dir}}/{{session_date}}-{{topic}}.org"
      base: repo_root
      description: The generated worklog file
  artifacts: []

# Execution characteristics (hints for scheduling, not enforcement)
execution:
  idempotent: false          # Safe to re-run?
  destructive: false         # Modifies existing files?
  network: false             # Requires network access?
  interactive: false         # Requires user input during execution?
  timeout: 30                # Maximum execution time in seconds

# For security classification
sensitive: false             # Handles sensitive data?
---
```

### Field Definitions

#### `manifest_version`
Schema version for the manifest format itself. Allows parsers to handle breaking changes.

- Current version: `"1.0"`
- Follows semantic versioning conventions: a major version change indicates a breaking schema change

#### `inputs`
Declares what the skill needs from the caller. Uses a subset of JSON Schema for type definitions.

- `required`: Must be provided, or the skill fails
- `optional`: May be omitted; the `default` from the schema is used when one is declared
- Each input has `name`, `description`, `schema` (JSON Schema), and an optional `sensitive` flag
- `sensitive: true`: Marks the input for redaction in traces
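
As a rough illustration of the required/optional split, the sketch below resolves caller-provided values against an `inputs` block that has already been parsed into a dict. The function name `resolve_inputs` and the choice to omit optional inputs that have neither a value nor a default are assumptions, not part of this ADR.

```python
def resolve_inputs(inputs_spec: dict, provided: dict) -> dict:
    """Merge caller-provided values with manifest defaults (hypothetical helper)."""
    resolved = {}
    missing = []

    # Required inputs must come from the caller; collect every gap before failing.
    for spec in inputs_spec.get("required", []):
        name = spec["name"]
        if name in provided:
            resolved[name] = provided[name]
        else:
            missing.append(name)
    if missing:
        raise ValueError("missing required inputs: " + ", ".join(missing))

    # Optional inputs fall back to schema.default when one is declared.
    for spec in inputs_spec.get("optional", []):
        name = spec["name"]
        schema = spec.get("schema", {})
        if name in provided:
            resolved[name] = provided[name]
        elif "default" in schema:
            resolved[name] = schema["default"]
        # Assumption: no value and no default means the input is simply absent.

    return resolved
```

Reporting all missing inputs at once, rather than stopping at the first, keeps the error message useful when a binding omits several fields.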

**JSON Schema Subset Supported:**
- `type`: string, number, integer, boolean, array, object
- `pattern`: regex for string validation
- `minimum`, `maximum`: numeric bounds
- `items`: array item schema
- `properties`: object property schemas
- `default`: default value
- `enum`: allowed values
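
To give a sense of how small a validator for this subset could be, here is a minimal sketch using only the standard library. It is not a full JSON Schema implementation: the function name `validate`, the error message wording, and the `$`-style paths are illustrative assumptions, and Python's `re` dialect differs in places from the ECMA 262 dialect that JSON Schema recommends for `pattern`.

```python
import re

# Mapping from the supported JSON Schema type names to Python types.
_TYPES = {
    "string": str,
    "number": (int, float),
    "integer": int,
    "boolean": bool,
    "array": list,
    "object": dict,
}


def validate(value, schema, path="$"):
    """Return a list of error strings; an empty list means the value is valid."""
    expected = schema.get("type")
    if expected is not None:
        ok = isinstance(value, _TYPES[expected])
        # bool is a subclass of int in Python, but JSON Schema keeps them distinct.
        if expected in ("number", "integer") and isinstance(value, bool):
            ok = False
        if not ok:
            return [f"{path}: expected {expected}"]

    errors = []
    if "enum" in schema and value not in schema["enum"]:
        errors.append(f"{path}: not one of {schema['enum']}")
    if isinstance(value, str) and "pattern" in schema:
        # JSON Schema patterns are unanchored, so search() rather than fullmatch().
        if re.search(schema["pattern"], value) is None:
            errors.append(f"{path}: does not match {schema['pattern']}")
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        if "minimum" in schema and value < schema["minimum"]:
            errors.append(f"{path}: below minimum {schema['minimum']}")
        if "maximum" in schema and value > schema["maximum"]:
            errors.append(f"{path}: above maximum {schema['maximum']}")
    if isinstance(value, list) and "items" in schema:
        for index, item in enumerate(value):
            errors.extend(validate(item, schema["items"], f"{path}[{index}]"))
    if isinstance(value, dict):
        for key, sub_schema in schema.get("properties", {}).items():
            if key in value:
                errors.extend(validate(value[key], sub_schema, f"{path}.{key}"))
    return errors
```

`default` is deliberately not handled here, since it affects input resolution rather than validation.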

#### `env`
Environment variables the skill reads.

- `required`: Skill fails if not set
- `optional`: Used if available, not fatal if missing
- `sensitive: true`: Mark for redaction
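
A check for required variables might look like the sketch below. The helper name `missing_env` is hypothetical, and treating a variable that is set to an empty string as "set" is an assumption this ADR does not settle.

```python
import os


def missing_env(env_spec: dict, environ=None) -> list:
    """Return names of required variables that are not set (hypothetical helper)."""
    # Accept an explicit mapping so the check can be tested without touching os.environ.
    environ = os.environ if environ is None else environ
    return [
        var["name"]
        for var in env_spec.get("required", [])
        if var["name"] not in environ
    ]
```

Optional variables are not inspected at all, which matches "not fatal if missing".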

#### `preconditions`
What must exist before the skill can run.

- `commands`: CLI tools that must be in PATH
  - `cmd`: command name (required)
  - `min_version`: minimum version (optional, e.g., "2.40")
  - `max_version`: maximum version (optional)
- `files`: Files that must exist
  - `path`: relative path (required)
  - `base`: path base, one of `skill_root`, `repo_root`, or `cwd` (default: `skill_root`)
  - `description`: what this file is for (optional)
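
Version bounds need numeric comparison: as plain strings, `"2.9"` sorts after `"2.40"`, which would let Git 2.9 satisfy `min_version: "2.40"`. The sketch below compares dotted versions as integer tuples. The helper names are hypothetical, versions are assumed to be purely numeric, and a bound such as `2.40` is assumed to cover any `2.40.x`; the ADR does not define either point, nor how the installed version is obtained.

```python
import shutil


def command_on_path(cmd: str) -> bool:
    """True if the command can be found in PATH."""
    return shutil.which(cmd) is not None


def parse_version(text: str) -> tuple:
    """Turn '2.40.1' into (2, 40, 1). Assumes dotted numeric components only."""
    return tuple(int(part) for part in text.split("."))


def version_in_range(found: str, min_version=None, max_version=None) -> bool:
    """Check a discovered version against optional manifest bounds."""
    version = parse_version(found)
    if min_version is not None:
        lower = parse_version(min_version)
        # Compare only as many components as the bound specifies.
        if version[:len(lower)] < lower:
            return False
    if max_version is not None:
        upper = parse_version(max_version)
        if version[:len(upper)] > upper:
            return False
    return True
```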

**Path Resolution:**
- `skill_root`: Directory containing SKILL.md
- `repo_root`: Git repository root
- `cwd`: Current working directory
- Absolute paths (starting with `/`) and home-relative paths (starting with `~`) are forbidden
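
The resolution rules can be sketched as a single function. The name `resolve_path` and the `roots` mapping (which the runtime would have to supply) are assumptions. This ADR does not say whether `..` segments are allowed, so the sketch does not check for them.

```python
from pathlib import PurePosixPath

VALID_BASES = ("skill_root", "repo_root", "cwd")


def resolve_path(path: str, roots: dict, base: str = "skill_root") -> PurePosixPath:
    """Resolve a manifest path against one of the three bases (hypothetical helper)."""
    # Paths starting with / or ~ are forbidden by the manifest format.
    if path.startswith("/") or path.startswith("~"):
        raise ValueError(f"forbidden path: {path}")
    if base not in VALID_BASES:
        raise ValueError(f"unknown base: {base}")
    return PurePosixPath(roots[base]) / path


# Example roots; the directories are made up for illustration.
roots = {
    "skill_root": "/repo/skills/worklog",
    "repo_root": "/repo",
    "cwd": "/repo/docs",
}
script = resolve_path("scripts/extract-metrics.sh", roots)
```

The default of `skill_root` matches `preconditions.files`; a caller resolving `outputs.files` would pass `base="repo_root"` when the manifest omits `base`.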

#### `outputs`
What the skill produces.

- `files`: File patterns with `{{var}}` substitution
  - `pattern`: file path pattern
  - `base`: path base (default: `repo_root`)
  - `description`: what this output is for
- `artifacts`: Other outputs (stdout, state changes, etc.)
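
The `{{var}}` substitution could be implemented with one regular expression, as in this sketch. The function name `render_pattern` is hypothetical, and raising an error for an unknown variable (rather than leaving the placeholder in place) is an assumption.

```python
import re

# Matches {{name}} where name is letters, digits, or underscores.
_VAR = re.compile(r"\{\{(\w+)\}\}")


def render_pattern(pattern: str, values: dict) -> str:
    """Replace each {{var}} in an output pattern with its resolved input value."""
    def substitute(match):
        name = match.group(1)
        if name not in values:
            raise KeyError("no value for variable " + name)
        return str(values[name])

    return _VAR.sub(substitute, pattern)


path = render_pattern(
    "{{output_dir}}/{{session_date}}-{{topic}}.org",
    {"output_dir": "docs/worklogs", "session_date": "2025-01-15", "topic": "session"},
)
```

Here `path` is `docs/worklogs/2025-01-15-session.org`, which would then be resolved against the declared `base`.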

#### `execution`
Characteristics for scheduling and safety. **These are hints, not enforcement.**

- `idempotent`: Can run multiple times safely
- `destructive`: Modifies or deletes existing data
- `network`: Requires network access
- `interactive`: Needs human input during run
- `timeout`: Maximum execution time in seconds

The runtime may use these hints to:
- Schedule non-destructive skills in parallel
- Warn before running destructive operations
- Set execution timeouts

These flags do NOT enforce sandboxing or isolation.
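
One way a runtime might consume these hints is sketched below: destructive skills are kept out of the parallel batch, and each skill carries a timeout. The function name `plan`, the `DEFAULT_TIMEOUT` value, and the treatment of a missing `destructive` flag as `false` are all assumptions; a more cautious runtime might treat a missing flag as destructive.

```python
DEFAULT_TIMEOUT = 60  # hypothetical fallback, in seconds


def plan(skills: list) -> tuple:
    """Split skills into a parallel batch and a serial batch based on hints."""
    parallel, serial = [], []
    for skill in skills:
        hints = skill.get("execution", {})
        entry = (skill["name"], hints.get("timeout", DEFAULT_TIMEOUT))
        # Assumption: a missing flag means non-destructive.
        if hints.get("destructive", False):
            serial.append(entry)
        else:
            parallel.append(entry)
    return parallel, serial
```

Nothing here prevents a skill marked `destructive: false` from modifying files; the hints only inform scheduling.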

#### `sensitive`
Whether the skill handles sensitive data.

- `false` (default): Standard tracing, eligible for elevation
- `true`: Aggressive redaction, elevation blocked without review

### Validation Rules

When Beads spawns a molecule with a `skill:` reference:

1. Check skill exists
2. Verify `manifest_version` is supported
3. Validate all required inputs are mapped
4. Validate input schemas using JSON Schema
5. Check preconditions (commands with versions, files)
6. Check required env vars are set
7. Warn if optional inputs have no mapping
8. Apply redaction rules for `sensitive` fields
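
For step 8, a redaction pass over trace data might look like the following sketch. It gathers the names marked `sensitive: true` under both `inputs` and `env`, then masks matching keys. The function names and the `[REDACTED]` placeholder are assumptions; this ADR does not specify the trace format.

```python
def sensitive_names(manifest: dict) -> set:
    """Collect names of inputs and env vars marked sensitive: true."""
    names = set()
    for section in ("inputs", "env"):
        block = manifest.get(section, {})
        for group in ("required", "optional"):
            for spec in block.get(group, []):
                if spec.get("sensitive", False):
                    names.add(spec["name"])
    return names


def redact(values: dict, names: set, mask: str = "[REDACTED]") -> dict:
    """Return a copy of values with sensitive entries masked."""
    return {key: (mask if key in names else value) for key, value in values.items()}
```

Returning a copy leaves the real values available to the skill while only the trace sees the masked version.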

### Backward Compatibility

- All new fields are optional
- Skills with only `name`/`description` continue to work
- Missing manifest = no validation (current behavior)
- Missing `manifest_version` = assume legacy format
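
These rules suggest a small dispatch step before validation. In the sketch below, the function name `manifest_mode`, the return values, and the decision to reject unsupported major versions with an error are assumptions made for illustration.

```python
SUPPORTED_MAJOR = 1  # major version of the manifest format this parser understands


def manifest_mode(frontmatter: dict) -> str:
    """Decide how to treat parsed SKILL.md frontmatter (hypothetical helper)."""
    version = frontmatter.get("manifest_version")
    if version is None:
        # Legacy skill: name/description only, no validation.
        return "legacy"
    # str() guards against an unquoted YAML value being parsed as a float.
    major = int(str(version).split(".")[0])
    if major != SUPPORTED_MAJOR:
        raise ValueError(f"unsupported manifest_version: {version}")
    return "validate"
```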

## Consequences

### Positive

- Clear contracts between molecules and skills
- Better error messages (fail before execution, not during)
- JSON Schema provides standard, well-understood type system
- Sensitive data handling built into manifest
- Version constraints catch incompatible tool versions before execution
- Explicit path bases remove ambiguity about where a relative path is resolved from

### Negative

- More frontmatter to maintain
- Risk of manifest drift from actual behavior
- JSON Schema subset may not cover all validation needs

### Neutral

- Existing skills are unaffected until they opt in
- Execution flags are hints, not enforcement

## Examples

### Minimal (current, still valid)
```yaml
---
name: simple-skill
description: Does something simple
---
```

### With inputs only
```yaml
---
manifest_version: "1.0"
name: greeter
description: Greets the user
inputs:
  required:
    - name: username
      description: Name of the user to greet
      schema:
        type: string
---
```

### With sensitive data
```yaml
---
manifest_version: "1.0"
name: deploy
description: Deploy application to production
inputs:
  required:
    - name: deploy_key
      description: SSH deployment key
      sensitive: true
      schema:
        type: string
env:
  required:
    - name: DEPLOY_TOKEN
      description: Authentication token
      sensitive: true
execution:
  network: true
  destructive: true
  timeout: 300
sensitive: true
---
```

### With version constraints
```yaml
---
manifest_version: "1.0"
name: analyze-git
description: Analyze git repository history
preconditions:
  commands:
    - cmd: git
      min_version: "2.40"
    - cmd: python3
      min_version: "3.11"
  files:
    - path: scripts/analyze.py
      base: skill_root
inputs:
  optional:
    - name: since
      description: Analyze commits since this date
      schema:
        type: string
        pattern: "^\\d{4}-\\d{2}-\\d{2}$"
        default: "2025-01-01"
outputs:
  files:
    - pattern: "reports/git-analysis-{{since}}.md"
      base: repo_root
execution:
  idempotent: true
  timeout: 60
---
```

## Open Questions

1. Should `outputs` be validated after execution?
2. How to handle skills that produce variable outputs?
3. Should we allow custom JSON Schema extensions?
4. Where does the skill `version` come from: a git tag, a manual entry, or a derived value?