dan/skills

dan c1f644e6a6 ADRs: add skill manifest, versioning, and trace security designs

- ADR-002: Skill manifest format with JSON Schema, path bases, preconditions
- ADR-003: Versioning with Nix store paths, lockfiles, interface contracts
- ADR-004: Trace security with HMAC redaction, entropy detection, trace modes

Refined based on orch consensus feedback from GPT and Gemini.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

2025-12-23 20:55:18 -05:00

8.1 KiB


ADR-003: Skill Versioning Strategy

Status

Draft (Revised)

Context

Skills are deployed via Nix/direnv, which means:

The "installed" version is a build artifact, not just source code
A Git SHA may not exist, or may not match the deployed content
Skills can reference external scripts/binaries
Protos and molecules need stable references

A single version identifier is insufficient. We need to answer:

How do we identify what version of a skill ran?
How do protos reference skills (pin vs float)?
How do we handle breaking changes?

Decision

Version Tuple

Every skill execution records a version tuple:

skill_version:
  # Primary identity - Nix store path (immutable; hash derived from build inputs)
  nix_store_path: "/nix/store/abc123-worklog-1.0.0"

  # Source identity (where it came from)
  source_ref: "git+file:///home/dan/proj/skills#worklog"
  source_rev: "abc123def"  # git SHA, null if not in git

  # Content identity (what was actually deployed)
  content_hash: "sha256:789xyz..."  # hash of skill content per algorithm below

  # Semantic version from manifest (optional)
  version: "1.0.0"

  # Deployment metadata
  deployed_at: "2025-12-23T10:00:00Z"

Identity Selection by Context

| Context | Primary Identity | Rationale |
|---|---|---|
| Nix-deployed skills | `nix_store_path` | Immutable; path hash is derived by Nix from the build inputs |
| Development/local | `content_hash` | No Nix path available |
| Trace replay | `nix_store_path` or `content_hash` | Exact reproducibility |
| Proto pinning | `content_hash` or `version` | Portable across machines |
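
The selection rule for deployed versus local skills can be sketched as a small helper. This is an illustration only: the function name `primary_identity` and the dict-shaped version tuple are assumptions, not part of the ADR.

```python
from typing import Mapping, Optional, Tuple


def primary_identity(version_tuple: Mapping[str, Optional[str]]) -> Tuple[str, str]:
    """Pick the identity used for exact matching (hypothetical helper).

    Prefers nix_store_path when the skill was deployed through Nix and
    falls back to content_hash for development/local skills.
    """
    nix_path = version_tuple.get("nix_store_path")
    if nix_path:
        return ("nix_store_path", nix_path)
    content_hash = version_tuple.get("content_hash")
    if content_hash:
        return ("content_hash", content_hash)
    raise ValueError("version tuple has neither nix_store_path nor content_hash")


deployed = {
    "nix_store_path": "/nix/store/abc123-worklog-1.0.0",
    "content_hash": "sha256:789xyz...",
}
local = {"nix_store_path": None, "content_hash": "sha256:789xyz..."}

print(primary_identity(deployed))
print(primary_identity(local))
```

Raising on an empty tuple, rather than returning a default, keeps a trace from silently recording an unidentifiable skill.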

Computing `content_hash`

Hash computation must be deterministic and portable:

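#!/usr/bin/env bash
# skill-content-hash.sh <skill-dir>
# Prints "sha256:<hex>" for the hashed files of a skill directory.
set -euo pipefail
export LC_ALL=C  # byte-wise sort order regardless of the caller's locale

SKILL_DIR="$(cd "${1:-.}" && pwd)"
cd "$SKILL_DIR"

# Default exclusions, expressed as find(1) tests.
excludes=(
    ! -path './.git/*'
    ! -path './.skillignore'
    ! -path '*/__pycache__/*'
    ! -name '*.pyc'
    ! -name '.DS_Store'
)

# Add one exclusion per .skillignore pattern (blank lines and comments skipped).
if [[ -f .skillignore ]]; then
    while IFS= read -r pattern || [[ -n "$pattern" ]]; do
        case "$pattern" in
            ''|'#'*) continue ;;
            */)  excludes+=( ! -path "*/${pattern}*" ) ;;  # directory, any depth
            */*) excludes+=( ! -path "./${pattern}" ) ;;   # path relative to skill root
            *)   excludes+=( ! -name "$pattern" ) ;;       # file name, any depth
        esac
    done < .skillignore
fi

# Stream "<relative path>\n<file bytes>" per file in sorted order, then hash.
find . -type f "${excludes[@]}" -print0 |
    sort -z |
    while IFS= read -r -d '' file; do
        printf '%s\n' "$file"
        cat -- "$file"
    done |
    sha256sum |
    awk '{ print "sha256:" $1 }'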

Critical requirements:

Use relative paths (not absolute) for portability
Include filename in hash stream (not just content)
Sort files deterministically before hashing
Exclude non-functional files via .skillignore

`.skillignore` Format

Skills can exclude files from the content hash with a .skillignore file, similar to .gitignore:

# .skillignore - files excluded from content_hash
README.md
CHANGELOG.md
docs/
tests/
*.test.js

This allows documentation changes without invalidating version pins.
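
How the patterns might be interpreted can be sketched as follows. The ADR only says the format is "like .gitignore", so the matching rules here are an assumed, simplified subset (no negation, no anchoring), and the names `parse_skillignore` and `is_ignored` are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import List


def parse_skillignore(text: str) -> List[str]:
    """Return the patterns in a .skillignore file, skipping blanks and comments."""
    patterns = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            patterns.append(line)
    return patterns


def is_ignored(rel_path: str, patterns: List[str]) -> bool:
    """Decide whether a skill-relative POSIX path is excluded from the hash."""
    parts = rel_path.split("/")
    for pattern in patterns:
        if pattern.endswith("/"):
            # Directory pattern: excluded if any parent directory has that name.
            if pattern.rstrip("/") in parts[:-1]:
                return True
        elif fnmatchcase(rel_path, pattern) or fnmatchcase(parts[-1], pattern):
            # File pattern: matched against the whole path or just the file name.
            return True
    return False


EXAMPLE = """\
# .skillignore - files excluded from content_hash
README.md
CHANGELOG.md
docs/
tests/
*.test.js
"""

patterns = parse_skillignore(EXAMPLE)
for path in ["README.md", "docs/guide.md", "src/util.test.js", "SKILL.md"]:
    print(path, is_ignored(path, patterns))
```

`fnmatchcase` is used instead of `fnmatch` so matching stays case-sensitive on every platform, which matters if the hash is meant to be portable.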

Proto Reference Modes

1. Float (default, development)

skill: worklog

Uses whatever version is currently deployed. Simple but unstable.

2. Pin to content hash (CI/automation)

skill:
  id: worklog
  content_hash: "sha256:789xyz..."

Fails if the deployed skill's content hash doesn't match the pin. Most stable for automation.

3. Pin to minimum version (published templates)

skill:
  id: worklog
  min_version: "1.0.0"

Requires the skill manifest to declare a version field that follows semantic versioning.
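
The three modes can be checked with one function. The sketch below assumes the proto's `skill:` value and the deployed skill's metadata are already parsed into Python values; `check_reference`, `parse_version`, and the shape of `deployed` are hypothetical. The version parser deliberately ignores pre-release and build suffixes, which a full SemVer comparison would need to order.

```python
from typing import Mapping, Tuple, Union

SkillRef = Union[str, Mapping[str, str]]


def parse_version(text: str) -> Tuple[int, int, int]:
    """Parse MAJOR.MINOR.PATCH; pre-release/build suffixes are ignored here."""
    core = text.split("+", 1)[0].split("-", 1)[0]
    major, minor, patch = (int(part) for part in core.split("."))
    return (major, minor, patch)


def check_reference(ref: SkillRef, deployed: Mapping[str, str]) -> Tuple[bool, str]:
    """Return (ok, mode) for a proto's skill reference against a deployed skill."""
    if isinstance(ref, str):
        # Mode 1: float. Any deployed version of the named skill is accepted.
        return (ref == deployed["id"], "float")
    if ref["id"] != deployed["id"]:
        return (False, "skill id mismatch")
    if "content_hash" in ref:
        # Mode 2: exact pin on content.
        ok = ref["content_hash"] == deployed.get("content_hash")
        return (ok, "content_hash pin")
    if "min_version" in ref:
        # Mode 3: the manifest must declare a version at or above the minimum.
        declared = deployed.get("version")
        if declared is None:
            return (False, "deployed skill declares no version")
        ok = parse_version(declared) >= parse_version(ref["min_version"])
        return (ok, "min_version pin")
    return (True, "float")


deployed = {"id": "worklog", "content_hash": "sha256:789xyz...", "version": "1.2.0"}
print(check_reference("worklog", deployed))
print(check_reference({"id": "worklog", "min_version": "1.10.0"}, deployed))
```

Comparing versions as integer tuples rather than strings is what makes `1.10.0` sort above `1.2.0`.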

Lockfile Workflow

For reproducible proto execution, use a per-proto lockfile (here, my-proto.lock):

# my-proto.lock
# Auto-generated - do not edit manually
# Regenerate with: bd proto lock my-proto

generated_at: "2025-12-23T10:00:00Z"
beads_version: "0.35.0"

skills:
  worklog:
    content_hash: "sha256:789xyz..."
    nix_store_path: "/nix/store/abc123-worklog-1.0.0"
    version: "1.0.0"
    source_rev: "abc123def"

  deploy:
    content_hash: "sha256:456abc..."
    nix_store_path: "/nix/store/def456-deploy-2.1.0"
    version: "2.1.0"
    source_rev: "def456ghi"

Workflow:

# Development: float freely
bd mol spawn my-proto

# CI/production: lock versions
bd proto lock my-proto        # Generate/update lockfile
bd mol spawn my-proto --locked  # Fail if versions don't match lock

The lockfile should be committed to version control so that builds are reproducible.
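
A minimal sketch of the check that `bd mol spawn --locked` might perform, assuming the lockfile's `skills:` section and the currently deployed skills have already been parsed into dictionaries. The function name `verify_lock` and the choice to compare only `content_hash` are assumptions; the ADR does not say which fields the comparison uses.

```python
from typing import List, Mapping


def verify_lock(
    locked: Mapping[str, Mapping[str, str]],
    deployed: Mapping[str, Mapping[str, str]],
) -> List[str]:
    """Return one message per locked skill that is missing or has drifted."""
    problems = []
    for name in sorted(locked):
        actual = deployed.get(name)
        if actual is None:
            problems.append(f"{name}: locked but not deployed")
        elif actual.get("content_hash") != locked[name]["content_hash"]:
            problems.append(
                f"{name}: deployed {actual.get('content_hash')} "
                f"!= locked {locked[name]['content_hash']}"
            )
    return problems


lock_skills = {
    "worklog": {"content_hash": "sha256:789xyz...", "version": "1.0.0"},
    "deploy": {"content_hash": "sha256:456abc...", "version": "2.1.0"},
}
deployed_skills = {
    "worklog": {"content_hash": "sha256:789xyz...", "version": "1.0.0"},
    "deploy": {"content_hash": "sha256:changed", "version": "2.1.1"},
}

for message in verify_lock(lock_skills, deployed_skills):
    print(message)
```

An empty result means every locked skill matches; a non-empty one would correspond to the "fail if versions don't match lock" case.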

Breaking Change Handling

Interface Contracts

For semantic versioning to be meaningful, skills should declare their interface contract:

# In SKILL.md manifest
interface:
  inputs:
    - session_date    # Required inputs are part of contract
    - topic           # Optional inputs with defaults
  outputs:
    - pattern: "docs/worklogs/*.org"
  env:
    - PROJECT         # Required env vars

Breaking changes (bump major version):

Renamed/removed required inputs
Changed required input types
Changed output patterns
Added new required inputs without defaults
Removed required env vars

Non-breaking changes (bump minor/patch):

Added optional inputs with defaults
Documentation changes
Bug fixes
Performance improvements
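
The two lists can be turned into a mechanical check. The sketch below is an assumption-heavy illustration: it models an interface as a dict in which each input carries explicit `required` and `type` keys and outputs are plain pattern strings, which is more structure than the manifest excerpt shows. The names `breaking_changes` and `suggested_bump` are hypothetical, and splitting non-breaking changes into minor (new optional input) versus patch follows the usual SemVer convention rather than anything stated in this ADR.

```python
from typing import Any, List, Mapping

Interface = Mapping[str, Any]


def breaking_changes(old: Interface, new: Interface) -> List[str]:
    """List the reasons a change from old to new requires a major bump."""
    reasons: List[str] = []
    old_inputs, new_inputs = old.get("inputs", {}), new.get("inputs", {})
    for name, spec in old_inputs.items():
        if not spec.get("required", False):
            continue
        if name not in new_inputs:
            reasons.append(f"required input renamed or removed: {name}")
        elif new_inputs[name].get("type") != spec.get("type"):
            reasons.append(f"required input type changed: {name}")
    for name, spec in new_inputs.items():
        if name not in old_inputs and spec.get("required", False):
            reasons.append(f"new required input without default: {name}")
    if sorted(old.get("outputs", [])) != sorted(new.get("outputs", [])):
        reasons.append("output patterns changed")
    for var in old.get("env", []):
        if var not in new.get("env", []):
            reasons.append(f"required env var removed: {var}")
    return reasons


def suggested_bump(old: Interface, new: Interface) -> str:
    """Return 'major', 'minor', or 'patch' for the interface change."""
    if breaking_changes(old, new):
        return "major"
    added = [n for n in new.get("inputs", {}) if n not in old.get("inputs", {})]
    return "minor" if added else "patch"


v1 = {
    "inputs": {
        "session_date": {"required": True, "type": "date"},
        "topic": {"required": False, "type": "string"},
    },
    "outputs": ["docs/worklogs/*.org"],
    "env": ["PROJECT"],
}
v2 = {
    "inputs": {
        "date": {"required": True, "type": "date"},
        "topic": {"required": False, "type": "string"},
    },
    "outputs": ["docs/worklogs/*.org"],
    "env": ["PROJECT"],
}

print(suggested_bump(v1, v2))
print(breaking_changes(v1, v2))
```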

Version Validation

# When spawning a proto with pinned skill
bd mol spawn my-proto --var x=1
# → Validates skill content_hash or version matches pin
# → Fails early if mismatch

# Check for breaking changes
bd skill check-compat worklog@1.0.0 worklog@2.0.0
# → Reports interface differences

Path Sanitization

Traces should sanitize paths to avoid leaking local structure:

# Before sanitization
skill_version:
  source_ref: "git+file:///home/dan/proj/skills#worklog"
  nix_store_path: "/nix/store/abc123-worklog-1.0.0"

# After sanitization (for sharing/elevation)
skill_version:
  source_ref: "git+file://LOCAL/proj/skills#worklog"
  nix_store_path: "/nix/store/abc123-worklog-1.0.0"  # Already safe

Sanitization patterns:

/home/<username>/ → LOCAL/
/Users/<username>/ → LOCAL/
Nix store paths contain only a hash and a package name, no user-specific directories, so they are left unchanged
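
The two prefix rules can be expressed as a single regular expression. This is a sketch rather than the trace writer's actual code; `sanitize_path` and `sanitize_version_tuple` are hypothetical names. Applied literally, the rules replace only the home-directory prefix and leave the rest of the path in place.

```python
import re
from typing import Any, Dict, Mapping

# Matches /home/<username>/ and /Users/<username>/ wherever they occur.
_HOME_PREFIX = re.compile(r"/(?:home|Users)/[^/]+/")


def sanitize_path(value: str) -> str:
    """Replace a user home-directory prefix with the LOCAL/ placeholder."""
    return _HOME_PREFIX.sub("LOCAL/", value)


def sanitize_version_tuple(version_tuple: Mapping[str, Any]) -> Dict[str, Any]:
    """Sanitize every string field; non-string values (e.g. None) pass through."""
    return {
        key: sanitize_path(value) if isinstance(value, str) else value
        for key, value in version_tuple.items()
    }


raw = {
    "source_ref": "git+file:///home/dan/proj/skills#worklog",
    "nix_store_path": "/nix/store/abc123-worklog-1.0.0",
    "source_rev": None,
}
print(sanitize_version_tuple(raw))
```

Because the pattern is not anchored, it would also rewrite a `/home/<name>/` segment in the middle of a longer path; a stricter implementation might want to anchor it to the start of the path component.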

Recording in Traces

Wisp traces always record the full version tuple:

execution:
  skill_version:
    nix_store_path: "/nix/store/abc123-worklog-1.0.0"
    source_ref: "git+file://LOCAL/proj/skills#worklog"  # Sanitized
    source_rev: "abc123def"
    content_hash: "sha256:789xyz..."
    version: "1.0.0"

This enables:

Replay with exact version
Diff between executions
Debugging "it worked before" issues
Portable sharing (sanitized paths)
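
As an illustration of the "diff between executions" use, two recorded tuples can be compared field by field. The helper name `diff_versions` is hypothetical, and the field list is taken from the trace example above.

```python
from typing import Dict, Mapping, Optional, Tuple

TUPLE_FIELDS = ("nix_store_path", "source_ref", "source_rev", "content_hash", "version")


def diff_versions(
    before: Mapping[str, Optional[str]],
    after: Mapping[str, Optional[str]],
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return {field: (before, after)} for every field that differs."""
    changed = {}
    for field in TUPLE_FIELDS:
        old, new = before.get(field), after.get(field)
        if old != new:
            changed[field] = (old, new)
    return changed


monday = {
    "nix_store_path": "/nix/store/abc123-worklog-1.0.0",
    "source_rev": "abc123def",
    "content_hash": "sha256:789xyz...",
    "version": "1.0.0",
}
tuesday = dict(monday, source_rev="fff000aaa", content_hash="sha256:changed")

print(diff_versions(monday, tuesday))
```

A result that shows a changed `content_hash` with an unchanged `version` is the typical signature of an "it worked before" regression: the skill changed without a version bump.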

Recommendations

| Use Case | Mode | Identity | Why |
|---|---|---|---|
| Active development | Float | N/A | Iterate quickly |
| Local testing | Float or pin | `content_hash` | Reproducible locally |
| Shared proto | Pin + lock | `content_hash` | Portable across machines |
| Published template | Pin to version | `min_version` | Semantic compatibility |
| CI/automation | Locked | `content_hash` | Exact reproducibility |

Consequences

Positive

Full traceability of what ran
Reproducible executions via lockfile
Clear failure on version mismatch
Supports gradual adoption (float first, pin later)
Portable hashing (relative paths)
Interface contracts enable meaningful SemVer

Negative

Content hash computation adds overhead
Pinned protos need updates when skills change
More fields to manage
Lockfile adds another file to maintain

Neutral

Float mode preserves current behavior
Version tuple is metadata, not enforcement
Nix store path available only in Nix-deployed environments

Implementation Checklist

Implement deterministic content hash script
Add .skillignore support to hash computation
Add nix_store_path capture for Nix-deployed skills
Implement bd proto lock command
Implement bd mol spawn --locked validation
Add path sanitization to trace writer
Add interface contract validation
Implement bd skill check-compat command

Open Questions

Should lockfile include transitive dependencies (skills that call other skills)?
How should we handle skills that shell out to system binaries (git, curl)? Should those be versioned too?
Should content_hash be cached, or computed on every invocation?
Should we support nix flake references directly? (e.g., github:user/skills#worklog)
