skills/docs/adr/003-skill-versioning-strategy.md
dan c1f644e6a6 ADRs: add skill manifest, versioning, and trace security designs
- ADR-002: Skill manifest format with JSON Schema, path bases, preconditions
- ADR-003: Versioning with Nix store paths, lockfiles, interface contracts
- ADR-004: Trace security with HMAC redaction, entropy detection, trace modes

Refined based on orch consensus feedback from GPT and Gemini.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-23 20:55:18 -05:00


ADR-003: Skill Versioning Strategy

Status

Draft (Revised)

Context

Skills are deployed via Nix/direnv, which means:

  • The "installed" version is a build artifact, not just source code
  • A Git SHA may not exist, or may not match the deployed content
  • Skills can reference external scripts/binaries
  • Protos and molecules need stable references

A single version identifier is insufficient. We need to answer:

  1. How do we identify what version of a skill ran?
  2. How do protos reference skills (pin vs float)?
  3. How do we handle breaking changes?

Decision

Version Tuple

Every skill execution records a version tuple:

skill_version:
  # Primary identity - Nix store path (immutable; hash derived from the build inputs)
  nix_store_path: "/nix/store/abc123-worklog-1.0.0"

  # Source identity (where it came from)
  source_ref: "git+file:///home/dan/proj/skills#worklog"
  source_rev: "abc123def"  # git SHA, null if not in git

  # Content identity (what was actually deployed)
  content_hash: "sha256:789xyz..."  # hash of skill content per algorithm below

  # Semantic version from manifest (optional)
  version: "1.0.0"

  # Deployment metadata
  deployed_at: "2025-12-23T10:00:00Z"

Identity Selection by Context

| Context | Primary Identity | Rationale |
|---|---|---|
| Nix-deployed skills | nix_store_path | Immutable; hash derived from the build inputs by Nix |
| Development/local | content_hash | No Nix path available |
| Trace replay | nix_store_path or content_hash | Exact reproducibility |
| Proto pinning | content_hash or version | Portable across machines |
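The selection rules in the table can be sketched as a small lookup. This is a minimal illustration, not the project's implementation: the `SkillVersion` class, the `primary_identity` function, and the context labels ("nix", "local", "replay", "pin") are hypothetical names, and for proto pinning the sketch always picks content_hash even though the table also allows version.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SkillVersion:
    """Version tuple for one skill execution (field names taken from the ADR)."""
    content_hash: str
    nix_store_path: Optional[str] = None  # None outside Nix-deployed environments
    source_ref: Optional[str] = None
    source_rev: Optional[str] = None      # None if the skill is not in git
    version: Optional[str] = None         # optional semantic version
    deployed_at: Optional[str] = None


def primary_identity(v: SkillVersion, context: str) -> tuple[str, str]:
    """Return (field_name, value) for the identity to use in `context`.

    The context labels are hypothetical names for the four table rows.
    """
    # Nix deployments and replays prefer the store path when one was recorded.
    if context in ("nix", "replay") and v.nix_store_path:
        return ("nix_store_path", v.nix_store_path)
    if context == "nix":
        raise ValueError("Nix-deployed skill without a nix_store_path")
    # Everything else falls back to the portable content hash.
    if context in ("local", "replay", "pin"):
        return ("content_hash", v.content_hash)
    raise ValueError(f"unknown context: {context}")
```

A replay on a machine without Nix therefore falls back to content_hash, which matches the "nix_store_path or content_hash" row.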

Computing content_hash

Hash computation must be deterministic and portable:

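#!/usr/bin/env bash
# skill-content-hash.sh <skill-dir>
set -euo pipefail

SKILL_DIR="${1:-.}"
SKILL_DIR="$(cd "$SKILL_DIR" && pwd)"

# Byte-wise collation, so the sort order does not depend on the user's locale
export LC_ALL=C

# Default exclusions, always applied
EXCLUDES=(
    ! -path './.git/*'
    ! -path './.skillignore'
    ! -name '*.pyc'
    ! -name '.DS_Store'
    ! -path '*/__pycache__/*'
)

# Add one exclusion per .skillignore pattern:
#   "dir/"  excludes that subtree, "a/b" matches the relative path,
#   anything else matches the file name
if [[ -f "$SKILL_DIR/.skillignore" ]]; then
    while IFS= read -r pat || [[ -n "$pat" ]]; do
        [[ -z "$pat" || "$pat" == \#* ]] && continue
        case "$pat" in
            */)  EXCLUDES+=( ! -path "./${pat}*" ) ;;
            */*) EXCLUDES+=( ! -path "./${pat}" ) ;;
            *)   EXCLUDES+=( ! -name "$pat" ) ;;
        esac
    done < "$SKILL_DIR/.skillignore"
fi

# Find files, sort the relative paths, then hash each path followed by its content
(
    cd "$SKILL_DIR"
    find . -type f "${EXCLUDES[@]}" -print0 |
        sort -z |
        while IFS= read -r -d '' f; do
            printf '%s\n' "$f"
            cat -- "$f"
        done |
        sha256sum |
        cut -d' ' -f1
)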

Critical requirements:

  • Use relative paths (not absolute) for portability
  • Include filename in hash stream (not just content)
  • Sort files deterministically before hashing
  • Exclude non-functional files via .skillignore
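The first three requirements can also be illustrated outside the shell. The following is a minimal Python sketch under stated assumptions: the function name `content_hash` is hypothetical, exclusions are omitted for brevity, and the digest is not guaranteed to equal the one produced by the shell script (the path prefix and symlink handling differ).

```python
import hashlib
from pathlib import Path


def content_hash(skill_dir: str) -> str:
    """Hash relative paths and file bytes in sorted order; returns 'sha256:<hex>'."""
    root = Path(skill_dir).resolve()
    files = [p for p in root.rglob("*") if p.is_file()]
    # Relative POSIX-style paths keep the digest independent of where the skill lives.
    rel_paths = sorted(p.relative_to(root).as_posix() for p in files)
    digest = hashlib.sha256()
    for rel in rel_paths:
        digest.update(rel.encode("utf-8") + b"\n")  # file name is part of the stream
        digest.update((root / rel).read_bytes())    # followed by the file content
    return "sha256:" + digest.hexdigest()
```

Two copies of the same skill in different directories yield the same value, while renaming a file changes it even if no file content changed.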

.skillignore Format

Skills can exclude files from the content hash with a .skillignore file, similar to .gitignore:

# .skillignore - files excluded from content_hash
README.md
CHANGELOG.md
docs/
tests/
*.test.js

This allows documentation changes without invalidating version pins.
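The ADR does not pin down the exact matching rules, so the sketch below assumes a small subset of .gitignore semantics: a trailing slash excludes a top-level directory, a pattern containing a slash is matched against the relative path, and any other pattern is matched against the file name. The function names `parse_skillignore` and `is_ignored` are hypothetical.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def parse_skillignore(text: str) -> list[str]:
    """Return the non-empty, non-comment patterns from a .skillignore file."""
    lines = (line.strip() for line in text.splitlines())
    return [line for line in lines if line and not line.startswith("#")]


def is_ignored(rel_path: str, patterns: list[str]) -> bool:
    """Check a relative POSIX path against the assumed .skillignore rules."""
    name = PurePosixPath(rel_path).name
    for pat in patterns:
        if pat.endswith("/"):
            # "docs/" excludes everything under the top-level docs directory
            if rel_path.startswith(pat):
                return True
        elif "/" in pat:
            # patterns with a slash are matched against the whole relative path
            if fnmatchcase(rel_path, pat):
                return True
        elif fnmatchcase(name, pat):
            # bare patterns such as "*.test.js" match the file name anywhere
            return True
    return False
```

With the example file above, `docs/guide/intro.md` and `src/app.test.js` are excluded from the hash, while `SKILL.md` is not.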

Proto Reference Modes

1. Float (default, development)

skill: worklog

Uses whatever version is currently deployed. Simple but unstable.

2. Pin to content hash (CI/automation)

skill:
  id: worklog
  content_hash: "sha256:789xyz..."

Fails if the deployed skill's content hash doesn't match the pin. Most stable for automation.

3. Pin to minimum version (published templates)

skill:
  id: worklog
  min_version: "1.0.0"

Requires the skill manifest to declare a version field that follows semantic versioning.
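The three reference modes could be checked against a deployed skill roughly as follows. This is a sketch under assumptions: `check_reference` and `parse_semver` are hypothetical names, min_version is treated as a plain "greater than or equal" comparison with no upper bound on the major version, and pre-release tags are not handled.

```python
from typing import Optional, Union


def parse_semver(version: str) -> tuple[int, int, int]:
    """Parse 'MAJOR.MINOR.PATCH' into integers (pre-release tags are not handled)."""
    major, minor, patch = (int(part) for part in version.split("."))
    return (major, minor, patch)


def check_reference(ref: Union[str, dict], deployed: dict) -> Optional[str]:
    """Return None if `deployed` satisfies `ref`, else a human-readable reason."""
    if isinstance(ref, str):
        # Float mode: a bare skill name accepts whatever is deployed.
        return None
    if "content_hash" in ref:
        if ref["content_hash"] != deployed.get("content_hash"):
            return f"{ref['id']}: content_hash mismatch"
        return None
    if "min_version" in ref:
        have = deployed.get("version")
        if have is None:
            return f"{ref['id']}: manifest declares no version"
        # Compare numerically so that 1.10.0 is newer than 1.2.0.
        if parse_semver(have) < parse_semver(ref["min_version"]):
            return f"{ref['id']}: {have} is older than {ref['min_version']}"
        return None
    return None
```

Comparing parsed tuples rather than strings matters here: as strings, "1.10.0" sorts before "1.2.0".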

Lockfile Workflow

For reproducible proto execution, use a lockfile named after the proto (for example, my-proto.lock):

# my-proto.lock
# Auto-generated - do not edit manually
# Regenerate with: bd proto lock my-proto

generated_at: "2025-12-23T10:00:00Z"
beads_version: "0.35.0"

skills:
  worklog:
    content_hash: "sha256:789xyz..."
    nix_store_path: "/nix/store/abc123-worklog-1.0.0"
    version: "1.0.0"
    source_rev: "abc123def"

  deploy:
    content_hash: "sha256:456abc..."
    nix_store_path: "/nix/store/def456-deploy-2.1.0"
    version: "2.1.0"
    source_rev: "def456ghi"

Workflow:

# Development: float freely
bd mol spawn my-proto

# CI/production: lock versions
bd proto lock my-proto        # Generate/update lockfile
bd mol spawn my-proto --locked  # Fail if versions don't match lock

The lockfile should be committed to version control so that executions are reproducible.
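The check behind `--locked` might look like the sketch below. It assumes the `skills` section of the lockfile has already been loaded into a dict, and it compares only content_hash, since that is the portable identity; both choices, and the name `lock_mismatches`, are assumptions of this sketch rather than the behavior of bd.

```python
def lock_mismatches(locked: dict, deployed: dict) -> list[str]:
    """Compare the `skills` section of a lockfile against the deployed skills.

    Both arguments map a skill id to a dict with at least a "content_hash" key.
    Returns one message per problem; an empty list means the lock is satisfied.
    """
    problems = []
    for skill_id, entry in sorted(locked.items()):
        actual = deployed.get(skill_id)
        if actual is None:
            problems.append(f"{skill_id}: locked but not deployed")
        elif actual.get("content_hash") != entry["content_hash"]:
            problems.append(
                f"{skill_id}: expected {entry['content_hash']}, "
                f"found {actual.get('content_hash')}"
            )
    return problems
```

Collecting every mismatch before failing lets a CI run report all drifted skills at once instead of stopping at the first.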

Breaking Change Handling

Interface Contracts

For semantic versioning to be meaningful, skills should declare their interface contract:

# In SKILL.md manifest
interface:
  inputs:
    - session_date    # Required inputs are part of contract
    - topic           # Optional inputs with defaults
  outputs:
    - pattern: "docs/worklogs/*.org"
  env:
    - PROJECT         # Required env vars

Breaking changes (bump major version):

  • Renamed/removed required inputs
  • Changed required input types
  • Changed output patterns
  • Added new required inputs without defaults
  • Removed required env vars

Non-breaking changes (bump minor/patch):

  • Added optional inputs with defaults
  • Documentation changes
  • Bug fixes
  • Performance improvements
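The two lists above can be turned into a mechanical classification. The sketch below is illustrative only: it assumes each interface has been normalized into a dict of sets with the hypothetical keys "required", "optional", "outputs", and "env", it does not model input type changes, and `classify_change` is not an existing command or API.

```python
def classify_change(old: dict, new: dict) -> str:
    """Return "major", "minor", or "patch" for a change from `old` to `new`.

    Each interface is a dict of sets with the hypothetical keys:
      "required" and "optional" (input names), "outputs" (output patterns),
      "env" (required environment variables).
    """
    # A required input that is gone entirely (not merely made optional)
    removed_inputs = old["required"] - (new["required"] | new["optional"])
    # A required input that callers did not have to pass before
    added_required = new["required"] - old["required"]
    removed_env = old["env"] - new["env"]
    outputs_changed = old["outputs"] != new["outputs"]
    if removed_inputs or added_required or removed_env or outputs_changed:
        return "major"
    if new["optional"] - old["optional"]:
        return "minor"
    return "patch"
```

A renamed required input shows up as one removal plus one addition, so it is classified as major without special handling.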

Version Validation

# When spawning a proto with pinned skill
bd mol spawn my-proto --var x=1
# → Validates skill content_hash or version matches pin
# → Fails early if mismatch

# Check for breaking changes
bd skill check-compat worklog@1.0.0 worklog@2.0.0
# → Reports interface differences

Path Sanitization

Traces should sanitize paths to avoid leaking local structure:

# Before sanitization
skill_version:
  source_ref: "git+file:///home/dan/proj/skills#worklog"
  nix_store_path: "/nix/store/abc123-worklog-1.0.0"

# After sanitization (for sharing/elevation)
skill_version:
  source_ref: "git+file://LOCAL/skills#worklog"
  nix_store_path: "/nix/store/abc123-worklog-1.0.0"  # Already safe

Sanitization patterns:

  • /home/<username>/ → LOCAL/
  • /Users/<username>/ → LOCAL/
  • Nix store paths contain no user-specific directories and need no sanitization
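One way to apply these patterns is a single regular-expression substitution that replaces a /home/<username>/ or /Users/<username>/ prefix with LOCAL/. The sketch below applies the patterns literally; the name `sanitize_path` is hypothetical. Note that a literal application keeps the path segments after the username, so the `proj` directory from the example above survives, unlike in the sanitized example shown there.

```python
import re

# Matches /home/<username>/ and /Users/<username>/ anywhere in the string.
_HOME_PREFIX = re.compile(r"/(?:home|Users)/[^/]+/")


def sanitize_path(value: str) -> str:
    """Replace a user home-directory prefix with LOCAL/; other text is unchanged."""
    return _HOME_PREFIX.sub("LOCAL/", value)
```

Nix store paths pass through unchanged because they contain neither prefix.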

Recording in Traces

Wisp traces always record the full version tuple:

execution:
  skill_version:
    nix_store_path: "/nix/store/abc123-worklog-1.0.0"
    source_ref: "git+file://LOCAL/skills#worklog"  # Sanitized
    source_rev: "abc123def"
    content_hash: "sha256:789xyz..."
    version: "1.0.0"

This enables:

  • Replay with exact version
  • Diff between executions
  • Debugging "it worked before" issues
  • Portable sharing (sanitized paths)
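As an illustration of the "diff between executions" use case, two recorded tuples can be compared field by field. This is a minimal sketch that treats each tuple as a plain dict; `diff_versions` is a hypothetical helper, not part of the trace tooling.

```python
def diff_versions(a: dict, b: dict) -> dict:
    """Return {field: (a_value, b_value)} for every field that differs."""
    fields = sorted(set(a) | set(b))
    return {f: (a.get(f), b.get(f)) for f in fields if a.get(f) != b.get(f)}
```

An empty result means both executions ran the same skill build; a changed content_hash with an unchanged version points at an edit that was deployed without a version bump.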

Recommendations

| Use Case | Mode | Identity | Why |
|---|---|---|---|
| Active development | Float | N/A | Iterate quickly |
| Local testing | Float or pin | content_hash | Reproducible locally |
| Shared proto | Pin + lock | content_hash | Portable across machines |
| Published template | Pin to version | min_version | Semantic compatibility |
| CI/automation | Locked | content_hash | Exact reproducibility |

Consequences

Positive

  • Full traceability of what ran
  • Reproducible executions via lockfile
  • Clear failure on version mismatch
  • Supports gradual adoption (float first, pin later)
  • Portable hashing (relative paths)
  • Interface contracts enable meaningful SemVer

Negative

  • Content hash computation adds overhead
  • Pinned protos need updates when skills change
  • More fields to manage
  • Lockfile adds another file to maintain

Neutral

  • Float mode preserves current behavior
  • Version tuple is metadata, not enforcement
  • Nix store path available only in Nix-deployed environments

Implementation Checklist

  • Implement deterministic content hash script
  • Add .skillignore support to hash computation
  • Add nix_store_path capture for Nix-deployed skills
  • Implement bd proto lock command
  • Implement bd mol spawn --locked validation
  • Add path sanitization to trace writer
  • Add interface contract validation
  • Implement bd skill check-compat command

Open Questions

  1. Should lockfile include transitive dependencies (skills that call other skills)?
  2. How to handle skills that shell out to system binaries (git, curl)? Version those too?
  3. Cache content_hash or compute on every invocation?
  4. Should we support nix flake references directly? (e.g., github:user/skills#worklog)