skills/docs/adr/003-skill-versioning-strategy.md
dan c1f644e6a6 ADRs: add skill manifest, versioning, and trace security designs
- ADR-002: Skill manifest format with JSON Schema, path bases, preconditions
- ADR-003: Versioning with Nix store paths, lockfiles, interface contracts
- ADR-004: Trace security with HMAC redaction, entropy detection, trace modes

Refined based on orch consensus feedback from GPT and Gemini.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-23 20:55:18 -05:00

# ADR-003: Skill Versioning Strategy
## Status
Draft (Revised)
## Context
Skills are deployed via Nix/direnv, which means:
- The "installed" version is a build artifact, not just source code
- A Git SHA may not exist, or may not match the deployed content
- Skills can reference external scripts/binaries
- Protos and molecules need stable references
A single version identifier is insufficient. We need to answer:
1. How do we identify what version of a skill ran?
2. How do protos reference skills (pin vs float)?
3. How do we handle breaking changes?
## Decision
### Version Tuple
Every skill execution records a version tuple:
```yaml
skill_version:
  # Primary identity - Nix store path (immutable, content-addressed)
  nix_store_path: "/nix/store/abc123-worklog-1.0.0"
  # Source identity (where it came from)
  source_ref: "git+file:///home/dan/proj/skills#worklog"
  source_rev: "abc123def" # git SHA, null if not in git
  # Content identity (what was actually deployed)
  content_hash: "sha256:789xyz..." # hash of skill content per algorithm below
  # Semantic version from manifest (optional)
  version: "1.0.0"
  # Deployment metadata
  deployed_at: "2025-12-23T10:00:00Z"
```
#### Identity Selection by Context
| Context | Primary Identity | Rationale |
|---------|------------------|-----------|
| Nix-deployed skills | `nix_store_path` | Immutable, content-addressed by Nix |
| Development/local | `content_hash` | No Nix path available |
| Trace replay | `nix_store_path` or `content_hash` | Exact reproducibility |
| Proto pinning | `content_hash` or `version` | Portable across machines |
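The first two rows of the table amount to a fallback rule: prefer the Nix store path and use the content hash when no store path exists. A minimal sketch of that rule follows; the function name `primary_identity` and its `field=value` output format are illustrative assumptions, not part of any existing tool.

```bash
#!/usr/bin/env bash
# Hypothetical helper: pick the primary identity from a version tuple.
# Arguments: <nix_store_path> <content_hash>; pass "" for a missing field.
primary_identity() {
  local nix_store_path="$1" content_hash="$2"
  if [[ -n "$nix_store_path" ]]; then
    # Nix-deployed skill: the store path is immutable and content-addressed
    printf 'nix_store_path=%s\n' "$nix_store_path"
  elif [[ -n "$content_hash" ]]; then
    # Development/local: no store path, fall back to the content hash
    printf 'content_hash=%s\n' "$content_hash"
  else
    echo "no usable identity in version tuple" >&2
    return 1
  fi
}

primary_identity "/nix/store/abc123-worklog-1.0.0" "sha256:aaaa"
primary_identity "" "sha256:aaaa"
```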
### Computing `content_hash`
Hash computation must be deterministic and portable:
```bash
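#!/usr/bin/env bash
# skill-content-hash.sh <skill-dir>
set -euo pipefail
export LC_ALL=C  # byte-wise sort order, independent of the caller's locale

SKILL_DIR="${1:-.}"
SKILL_DIR="$(cd "$SKILL_DIR" && pwd)"
cd "$SKILL_DIR"

# Default exclusions, always applied
excludes=(
  ! -path './.git/*'
  ! -path './.skillignore'
  ! -name '*.pyc'
  ! -name '.DS_Store'
  ! -path '*/__pycache__/*'
)

# Add one exclusion per .skillignore pattern. Matching is simplified, not
# full .gitignore semantics: "dir/" excludes that directory at any depth,
# any other pattern is matched against the file's base name.
if [[ -f .skillignore ]]; then
  while IFS= read -r pattern || [[ -n "$pattern" ]]; do
    [[ -z "$pattern" || "$pattern" == \#* ]] && continue
    if [[ "$pattern" == */ ]]; then
      excludes+=( ! -path "*/${pattern}*" )
    else
      excludes+=( ! -name "$pattern" )
    fi
  done < .skillignore
fi

# Hash a stream of "<relative path>\n<content>" records in sorted path order
find . -type f "${excludes[@]}" -print0 |
  sort -z |
  while IFS= read -r -d '' file; do
    printf '%s\n' "$file"
    cat -- "$file"
  done |
  sha256sum |
  cut -d' ' -f1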
```
**Critical requirements:**
- Use relative paths (not absolute) for portability
- Include filename in hash stream (not just content)
- Sort files deterministically before hashing
- Exclude non-functional files via `.skillignore`
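To illustrate why relative paths and filenames both belong in the hash stream, here is a stripped-down sketch. `hash_dir` is a hypothetical name, exclusions are left out for brevity, and the temporary directories stand in for two machines with different checkout locations.

```bash
#!/usr/bin/env bash
# Minimal hash over "<relative path>\n<content>" records, sorted by path.
# hash_dir is a hypothetical name; exclusion handling is omitted here.
hash_dir() {
  (
    cd "$1" || exit 1
    find . -type f -print0 |
      LC_ALL=C sort -z |
      while IFS= read -r -d '' file; do
        printf '%s\n' "$file"
        cat -- "$file"
      done |
      sha256sum |
      cut -d' ' -f1
  )
}

# Same files under two different absolute paths
dir_a="$(mktemp -d)"
dir_b="$(mktemp -d)"
printf 'echo hello\n' > "$dir_a/run.sh"
printf 'echo hello\n' > "$dir_b/run.sh"

if [[ "$(hash_dir "$dir_a")" == "$(hash_dir "$dir_b")" ]]; then
  echo "same content, different location: hashes match"
fi

# Renaming a file changes the hash even though the bytes are identical
mv "$dir_b/run.sh" "$dir_b/start.sh"
if [[ "$(hash_dir "$dir_a")" != "$(hash_dir "$dir_b")" ]]; then
  echo "renamed file: hashes differ"
fi
```

Had the stream used absolute paths, the first comparison would fail on every machine with a different checkout location. Had it omitted filenames, the rename would go undetected.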
#### `.skillignore` Format
Skills can exclude files from the content hash with a `.skillignore` file, similar to `.gitignore`:
```
# .skillignore - files excluded from content_hash
README.md
CHANGELOG.md
docs/
tests/
*.test.js
```
This allows documentation changes without invalidating version pins.
### Proto Reference Modes
#### 1. Float (default, development)
```yaml
skill: worklog
```
Uses whatever version is currently deployed. Simple but unstable.
#### 2. Pin to content hash (CI/automation)
```yaml
skill:
  id: worklog
  content_hash: "sha256:789xyz..."
```
Fails if deployed skill doesn't match. Most stable for automation.
#### 3. Pin to minimum version (published templates)
```yaml
skill:
  id: worklog
  min_version: "1.0.0"
```
Requires the skill manifest to declare a `version` field that follows semantic versioning.
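A minimum-version check could be sketched as below. The function name `version_at_least` is hypothetical, and the sketch assumes plain numeric `MAJOR.MINOR.PATCH` strings with no pre-release or build suffixes. Whether a newer major version should also be rejected is not specified here, so the sketch only enforces the lower bound.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeed when version $1 >= minimum $2.
# Assumes numeric MAJOR.MINOR.PATCH only (no "-rc1" style suffixes).
version_at_least() {
  local -a have want
  IFS=. read -r -a have <<< "$1"
  IFS=. read -r -a want <<< "$2"
  local i h w
  for i in 0 1 2; do
    h="${have[i]:-0}"
    w="${want[i]:-0}"
    # 10# forces base 10 so a component like "08" is not parsed as octal
    (( 10#$h > 10#$w )) && return 0
    (( 10#$h < 10#$w )) && return 1
  done
  return 0  # all three components are equal
}

if version_at_least "1.10.0" "1.9.0"; then
  echo "1.10.0 satisfies min_version 1.9.0"
fi
```

Comparing component by component avoids the string-comparison trap where `1.10.0` sorts before `1.9.0`.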
### Lockfile Workflow
For reproducible proto execution, use a per-proto lockfile named `<proto-name>.lock`:
```yaml
# my-proto.lock
# Auto-generated - do not edit manually
# Regenerate with: bd proto lock my-proto
generated_at: "2025-12-23T10:00:00Z"
beads_version: "0.35.0"
skills:
  worklog:
    content_hash: "sha256:789xyz..."
    nix_store_path: "/nix/store/abc123-worklog-1.0.0"
    version: "1.0.0"
    source_rev: "abc123def"
  deploy:
    content_hash: "sha256:456abc..."
    nix_store_path: "/nix/store/def456-deploy-2.1.0"
    version: "2.1.0"
    source_rev: "def456ghi"
```
**Workflow:**
```bash
# Development: float freely
bd mol spawn my-proto
# CI/production: lock versions
bd proto lock my-proto # Generate/update lockfile
bd mol spawn my-proto --locked # Fail if versions don't match lock
```
The lockfile should be committed to version control so that builds are reproducible.
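The check behind `--locked` could look roughly like the following sketch. It is not the `bd` implementation: `locked_hash` and `verify_locked` are hypothetical names, the hashes are placeholders, and the `awk` parsing assumes the lockfile nests each skill under `skills:` with two-space indentation. A real implementation would use a YAML parser.

```bash
#!/usr/bin/env bash
# Print the content_hash recorded for skill $2 in lockfile $1.
# Assumes "  <skill>:" followed by "    content_hash: ..." (two-space indent).
locked_hash() {
  awk -v skill="$2" '
    /^  [^ ].*:$/ { current = $1; sub(/:$/, "", current); next }
    /^    content_hash:/ && current == skill { gsub(/"/, "", $2); print $2 }
  ' "$1"
}

# Succeed only when the deployed hash equals the locked hash.
verify_locked() {
  local lockfile="$1" skill="$2" deployed="$3" locked
  locked="$(locked_hash "$lockfile" "$skill")"
  if [[ -z "$locked" ]]; then
    echo "skill '$skill' is not in $lockfile" >&2
    return 1
  fi
  if [[ "$locked" != "$deployed" ]]; then
    echo "skill '$skill': locked $locked, deployed $deployed" >&2
    return 1
  fi
}

# Sample lockfile with placeholder hashes
lockfile="$(mktemp)"
cat > "$lockfile" <<'EOF'
skills:
  worklog:
    content_hash: "sha256:aaaa"
    version: "1.0.0"
  deploy:
    content_hash: "sha256:bbbb"
    version: "2.1.0"
EOF

verify_locked "$lockfile" worklog "sha256:aaaa" && echo "worklog matches lock"
verify_locked "$lockfile" deploy "sha256:aaaa" || echo "deploy does not match lock"
```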
### Breaking Change Handling
#### Interface Contracts
For semantic versioning to be meaningful, skills should declare their interface contract:
```yaml
# In SKILL.md manifest
interface:
  inputs:
    - session_date # Required inputs are part of contract
    - topic # Optional inputs with defaults
  outputs:
    - pattern: "docs/worklogs/*.org"
  env:
    - PROJECT # Required env vars
```
**Breaking changes** (bump major version):
- Renamed/removed required inputs
- Changed required input types
- Changed output patterns
- Added new required inputs without defaults
- Added or renamed required env vars
**Non-breaking changes** (bump minor/patch):
- Added optional inputs with defaults
- Documentation changes
- Bug fixes
- Performance improvements
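As an illustration of how the required-input rules above translate into a check, the sketch below compares two sets of required input names. It covers only the rules about renamed, removed, and newly added required inputs. The function name `required_inputs_bump` and its output strings are assumptions for this example.

```bash
#!/usr/bin/env bash
# Hypothetical helper: compare required inputs of two interface versions.
# Arguments are space-separated names; prints "major" or "minor-or-patch".
required_inputs_bump() {
  local -a old new
  read -r -a old <<< "$1"
  read -r -a new <<< "$2"
  local name
  for name in "${old[@]}"; do
    # A required input that disappeared was removed or renamed
    [[ " $2 " == *" $name "* ]] || { echo "major"; return 0; }
  done
  for name in "${new[@]}"; do
    # A required input that did not exist before breaks existing callers
    [[ " $1 " == *" $name "* ]] || { echo "major"; return 0; }
  done
  echo "minor-or-patch"
}

required_inputs_bump "session_date" "session_date"
required_inputs_bump "session_date" "date"
required_inputs_bump "session_date" "session_date project"
```

Any difference in the set of required inputs maps to a major bump. Optional inputs with defaults never enter the comparison, which matches the non-breaking list.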
#### Version Validation
```bash
# When spawning a proto with pinned skill
bd mol spawn my-proto --var x=1
# → Validates skill content_hash or version matches pin
# → Fails early if mismatch
# Check for breaking changes
bd skill check-compat worklog@1.0.0 worklog@2.0.0
# → Reports interface differences
```
### Path Sanitization
Traces should sanitize paths to avoid leaking local structure:
```yaml
# Before sanitization
skill_version:
  source_ref: "git+file:///home/dan/proj/skills#worklog"
  nix_store_path: "/nix/store/abc123-worklog-1.0.0"
# After sanitization (for sharing/elevation)
skill_version:
  source_ref: "git+file://LOCAL/proj/skills#worklog"
  nix_store_path: "/nix/store/abc123-worklog-1.0.0" # Already safe
```
Sanitization patterns:
- `/home/<username>/` → `LOCAL/`
- `/Users/<username>/` → `LOCAL/`
- Nix store paths are already content-addressed and safe
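The two home-directory patterns can be expressed as a single `sed` substitution. This is a sketch under the assumption that the trace writer can filter fields through a stream; `sanitize_path` is a hypothetical name.

```bash
#!/usr/bin/env bash
# Hypothetical filter: replace /home/<username>/ and /Users/<username>/
# with LOCAL/ on every line read from stdin. Other paths pass through.
sanitize_path() {
  sed -E 's#/(home|Users)/[^/]+/#LOCAL/#g'
}

sanitize_path <<< "git+file:///home/dan/proj/skills#worklog"
# prints git+file://LOCAL/proj/skills#worklog

sanitize_path <<< "/nix/store/abc123-worklog-1.0.0"
# prints /nix/store/abc123-worklog-1.0.0 (unchanged)
```

Only the username segment is replaced, so directory names below the home directory, such as `proj/` here, still appear in the sanitized trace.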
### Recording in Traces
Wisp traces always record the full version tuple:
```yaml
execution:
  skill_version:
    nix_store_path: "/nix/store/abc123-worklog-1.0.0"
    source_ref: "git+file://LOCAL/proj/skills#worklog" # Sanitized
    source_rev: "abc123def"
    content_hash: "sha256:789xyz..."
    version: "1.0.0"
```
This enables:
- Replay with exact version
- Diff between executions
- Debugging "it worked before" issues
- Portable sharing (sanitized paths)
### Recommendations
| Use Case | Mode | Identity | Why |
|----------|------|----------|-----|
| Active development | Float | N/A | Iterate quickly |
| Local testing | Float or pin | `content_hash` | Reproducible locally |
| Shared proto | Pin + lock | `content_hash` | Portable across machines |
| Published template | Pin to version | `min_version` | Semantic compatibility |
| CI/automation | Locked | `content_hash` | Exact reproducibility |
## Consequences
### Positive
- Full traceability of what ran
- Reproducible executions via lockfile
- Clear failure on version mismatch
- Supports gradual adoption (float first, pin later)
- Portable hashing (relative paths)
- Interface contracts enable meaningful SemVer
### Negative
- Content hash computation adds overhead
- Pinned protos need updates when skills change
- More fields to manage
- Lockfile adds another file to maintain
### Neutral
- Float mode preserves current behavior
- Version tuple is metadata, not enforcement
- Nix store path available only in Nix-deployed environments
## Implementation Checklist
- [ ] Implement deterministic content hash script
- [ ] Add `.skillignore` support to hash computation
- [ ] Add `nix_store_path` capture for Nix-deployed skills
- [ ] Implement `bd proto lock` command
- [ ] Implement `bd mol spawn --locked` validation
- [ ] Add path sanitization to trace writer
- [ ] Add interface contract validation
- [ ] Implement `bd skill check-compat` command
## Open Questions
1. Should lockfile include transitive dependencies (skills that call other skills)?
2. How to handle skills that shell out to system binaries (git, curl)? Version those too?
3. Should `content_hash` be cached or computed on every invocation?
4. Should we support nix flake references directly? (e.g., `github:user/skills#worklog`)