dan/skills

dan c1f644e6a6 ADRs: add skill manifest, versioning, and trace security designs

- ADR-002: Skill manifest format with JSON Schema, path bases, preconditions
- ADR-003: Versioning with Nix store paths, lockfiles, interface contracts
- ADR-004: Trace security with HMAC redaction, entropy detection, trace modes

Refined based on orch consensus feedback from GPT and Gemini.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

2025-12-23 20:55:18 -05:00

9.6 KiB

ADR-004: Trace Security and Redaction Policy

Status

Draft (Revised)

Context

Skill execution traces (wisps) capture:

Environment variables
Input arguments
Tool calls with arguments
File paths and contents
Stdout/stderr

This data often contains secrets:

API keys (AWS, GitHub, OpenAI)
Tokens and passwords
PII (usernames, emails, paths)
Proprietary data

Wisps are gitignored but still risky:

Local machine compromise
Accidental sharing
Squashing into public digests
Elevation to published skills

Decision

Default-Deny Policy

Traces capture minimal information by default. Sensitive data requires explicit opt-in.

HMAC-Based Redaction

Instead of plain [REDACTED], use HMAC hashing to enable correlation without revealing values:

# Format: [REDACTED:hmac:<first-8-chars-of-hmac>]
inputs:
  api_token: "[REDACTED:hmac:a1b2c3d4]"
  other_token: "[REDACTED:hmac:a1b2c3d4]"  # Same value = same hash
  different_token: "[REDACTED:hmac:e5f6a7b8]"  # Different value = different hash

Benefits:

Can detect whether the same secret was used across executions within a session
Can correlate inputs without knowing values
HMAC key is per-session and never stored in the trace, so tags from different sessions are not comparable

Implementation:

import hmac
import hashlib

def redact_sensitive(value: str, session_key: bytes) -> str:
    """Redact value with HMAC for correlation."""
    h = hmac.new(session_key, value.encode(), hashlib.sha256)
    return f"[REDACTED:hmac:{h.hexdigest()[:8]}]"
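As a minimal sketch of the per-session behaviour, the example below repeats `redact_sensitive` and pairs it with a hypothetical `new_session_key` helper. The helper name and the 32-byte key length are assumptions, not part of this ADR.

```python
import hashlib
import hmac
import secrets


def new_session_key() -> bytes:
    """Hypothetical helper: a random per-session key that is never written to the trace."""
    return secrets.token_bytes(32)


def redact_sensitive(value: str, session_key: bytes) -> str:
    """Same function as in the ADR, repeated so this example runs on its own."""
    h = hmac.new(session_key, value.encode(), hashlib.sha256)
    return f"[REDACTED:hmac:{h.hexdigest()[:8]}]"


key_a = new_session_key()
key_b = new_session_key()

tag_1 = redact_sensitive("s3cr3t-value", key_a)
tag_2 = redact_sensitive("s3cr3t-value", key_a)
tag_3 = redact_sensitive("s3cr3t-value", key_b)

print(tag_1 == tag_2)  # same key and same value give the same tag
print(tag_1 == tag_3)  # a different session key gives an unrelated tag
```

The tag keeps 8 hex characters (32 bits) of the digest, so two different values can occasionally share a tag. Matching tags suggest the same value was used but do not prove it.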

Environment Variables

Default: Only capture from allowlist.

trace_env_allowlist:
  - USER
  - HOME
  - PROJECT
  - PWD
  - SHELL
  - TERM
  - LANG
  - TZ

Never capture (hardcoded denylist with glob patterns):

trace_env_denylist:
  - "*_KEY"
  - "*_SECRET"
  - "*_TOKEN"
  - "*_PASSWORD"
  - "*_CREDENTIAL*"
  - "AWS_*"
  - "GITHUB_TOKEN"
  - "OPENAI_API_KEY"
  - "ANTHROPIC_API_KEY"
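One way the two lists could combine is sketched below. It assumes the denylist always takes precedence over the allowlist and that patterns are shell-style globs matched case-sensitively. The function name `filter_env` is illustrative.

```python
from fnmatch import fnmatchcase

ENV_ALLOWLIST = {"USER", "HOME", "PROJECT", "PWD", "SHELL", "TERM", "LANG", "TZ"}
ENV_DENYLIST = [
    "*_KEY", "*_SECRET", "*_TOKEN", "*_PASSWORD", "*_CREDENTIAL*",
    "AWS_*", "GITHUB_TOKEN", "OPENAI_API_KEY", "ANTHROPIC_API_KEY",
]


def filter_env(env: dict[str, str]) -> dict[str, str]:
    """Keep only allowlisted variables whose names match no denylist pattern."""
    captured = {}
    for name, value in env.items():
        if any(fnmatchcase(name, pattern) for pattern in ENV_DENYLIST):
            continue  # assumption: the denylist always wins over the allowlist
        if name in ENV_ALLOWLIST:
            captured[name] = value
    return captured


sample = {"USER": "dan", "AWS_REGION": "us-east-1", "GITHUB_TOKEN": "ghp_x", "EDITOR": "vim"}
print(filter_env(sample))
```

Here `EDITOR` is dropped because it is not allowlisted, while `AWS_REGION` and `GITHUB_TOKEN` are dropped by the denylist.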

Input Arguments

Default: Inputs are NOT captured unless explicitly marked safe.

Skills must opt-in to capture inputs in manifest:

inputs:
  required:
    - name: api_token
      type: string
      sensitive: true    # Default, will be HMAC-redacted
    - name: project_name
      type: string
      sensitive: false   # Explicitly safe to capture

Rationale: Safer to miss debugging data than leak secrets. Most inputs can be reconstructed from context.

Trace output:

inputs:
  api_token: "[REDACTED:hmac:a1b2c3d4]"
  project_name: "my-project"  # Only captured because sensitive: false
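The opt-in rule can be sketched as follows. The function name `capture_inputs` and the decision to drop inputs that the manifest does not declare are assumptions. Only the `sensitive` flag and its default of `true` come from this ADR.

```python
import hashlib
import hmac


def capture_inputs(manifest_inputs: list[dict], values: dict[str, str], session_key: bytes) -> dict[str, str]:
    """Record inputs for a trace; anything not marked sensitive: false is HMAC-redacted."""
    captured = {}
    for spec in manifest_inputs:
        name = spec["name"]
        if name not in values:
            continue
        value = values[name]
        if spec.get("sensitive", True):  # a missing flag means sensitive
            digest = hmac.new(session_key, value.encode(), hashlib.sha256).hexdigest()
            captured[name] = f"[REDACTED:hmac:{digest[:8]}]"
        else:
            captured[name] = value
    return captured  # assumption: inputs not declared in the manifest are dropped


manifest = [
    {"name": "api_token", "type": "string"},  # no flag, so treated as sensitive
    {"name": "project_name", "type": "string", "sensitive": False},
]
trace_inputs = capture_inputs(
    manifest,
    {"api_token": "tok-12345", "project_name": "my-project", "undeclared": "x"},
    b"example-session-key",
)
print(trace_inputs)
```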

Tool Calls

Default: Capture command name and exit code. Parse arguments structurally.

Structured Argument Parsing

Instead of regex on raw strings, parse arguments properly:

tool_calls:
  - cmd: "curl"
    parsed_args:
      url: "https://api.example.com/endpoint"
      headers:
        - name: "Authorization"
          value: "[REDACTED:hmac:b2c3d4e5]"
        - name: "Content-Type"
          value: "application/json"
      method: "POST"
    exit_code: 0
    duration_ms: 1234

For commands we don't have parsers for, fall back to pattern redaction:

tool_calls:
  - cmd: "unknown-tool"
    raw_args: "--token [REDACTED:hmac:c3d4e5f6] --output file.txt"
    exit_code: 0

Known Command Parsers

Implement argument parsers for common commands:

curl: Parse -H, --header, -u, --user, -d, --data
git: Parse credentials in URLs, -c config values
aws: Parse --profile, environment-based auth
docker: Parse -e, --env, registry auth

Fallback Redaction Patterns

For unparsed commands, apply pattern matching:

Bearer [^\s]+        → Bearer [REDACTED:hmac:...]
token=[^\s&]+        → token=[REDACTED:hmac:...]
password=[^\s&]+     → password=[REDACTED:hmac:...]
-p [^\s]+            → -p [REDACTED:hmac:...]
--password[= ][^\s]+ → --password [REDACTED:hmac:...]
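The patterns above could be applied roughly as follows. This is a sketch: the function name `redact_fallback` is illustrative, and unlike the last pattern in the table it keeps the original separator after `--password` rather than normalizing it to a space. The guard against re-redacting an existing tag matters because `password=` also matches inside `--password=`.

```python
import hashlib
import hmac
import re

# Each pattern keeps its prefix in group 1 and the secret in group 2.
FALLBACK_PATTERNS = [
    re.compile(r"(Bearer )([^\s]+)"),
    re.compile(r"(token=)([^\s&]+)"),
    re.compile(r"(password=)([^\s&]+)"),
    re.compile(r"(-p )([^\s]+)"),
    re.compile(r"(--password[= ])([^\s]+)"),
]


def redact_fallback(raw_args: str, key: bytes) -> str:
    """Replace secrets matched by the fallback patterns with HMAC tags."""

    def replace(match: re.Match) -> str:
        prefix, secret = match.group(1), match.group(2)
        if secret.startswith("[REDACTED:"):
            return match.group(0)  # already handled by an earlier pattern
        digest = hmac.new(key, secret.encode(), hashlib.sha256).hexdigest()
        return f"{prefix}[REDACTED:hmac:{digest[:8]}]"

    for pattern in FALLBACK_PATTERNS:
        raw_args = pattern.sub(replace, raw_args)
    return raw_args


raw = "--url https://api.example.com?token=abc123&user=dan -p hunter2 --output file.txt"
print(redact_fallback(raw, b"example-session-key"))
```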

Entropy Detection

Catch secrets that slip through pattern matching:

import math
from collections import Counter

def entropy(s: str) -> float:
    """Calculate Shannon entropy of string."""
    if not s:
        return 0
    counts = Counter(s)
    probs = [c / len(s) for c in counts.values()]
    return -sum(p * math.log2(p) for p in probs)

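def looks_like_secret(value: str) -> bool:
    """Heuristic: high entropy + sufficient length = probably secret."""
    if len(value) < 16:
        return False
    # Entropy is bounded by log2 of the number of distinct characters, so a
    # 4.5 threshold needs at least 23 distinct characters. Hex-only values
    # (at most 4.0 bits) never reach it.
    return entropy(value) > 4.5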

Apply entropy detection to:

Unrecognized command arguments
Environment variable values (before allowlist check)
Input values marked sensitive: false (as safety check)
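To make the threshold concrete, the sketch below applies the heuristic to whitespace-separated tokens and shows the bound on entropy for a given length. The helper names `max_entropy` and `flag_tokens` are illustrative and not part of this ADR.

```python
import math
from collections import Counter


def entropy(s: str) -> float:
    """Shannon entropy in bits per character."""
    if not s:
        return 0.0
    counts = Counter(s)
    return -sum((c / len(s)) * math.log2(c / len(s)) for c in counts.values())


def max_entropy(length: int) -> float:
    """Upper bound on entropy for a string of this length (all characters distinct)."""
    return math.log2(length) if length > 0 else 0.0


def flag_tokens(raw_args: str, threshold: float = 4.5, min_length: int = 16) -> list[str]:
    """Return the tokens that the heuristic would treat as secrets."""
    return [
        tok for tok in raw_args.split()
        if len(tok) >= min_length and entropy(tok) > threshold
    ]


print(max_entropy(16))  # a 16-character value cannot exceed this many bits
print(flag_tokens("--key abcdefghijklmnopqrstuvwxyz012345 --output-directory-name"))
print(flag_tokens("0123456789abcdef0123456789abcdef01234567"))  # hex-only token
```

The hex-only token is not flagged at the default threshold, because a 16-symbol alphabet caps entropy at 4.0 bits. Pattern matching or a lower `entropy_threshold` would be needed to catch hex-encoded secrets.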

Stdin

Never capture stdin. Sensitive data is often piped:

Passwords via echo $PASS | command
API responses with tokens
File contents with secrets

tool_calls:
  - cmd: "some-command"
    stdin: "[NOT_CAPTURED]"  # Always this value
    exit_code: 0

File Contents

Default: Never capture file contents.

Only capture:

File path (sanitized per ADR-003)
File size
Content hash (sha256)
Action (created/modified/deleted)

outputs:
  artifacts:
    - path: "docs/worklogs/2025-12-23.org"
      size: 2048
      sha256: "abc123..."
      action: created
      # content: NOT CAPTURED
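A minimal sketch of recording metadata without contents is shown below. The function name `describe_artifact` is illustrative, path sanitization is left out, and the sketch covers only created or modified files, since a deleted file can no longer be hashed.

```python
import hashlib
import tempfile
from pathlib import Path


def describe_artifact(path: Path, action: str) -> dict:
    """Build a trace entry from file metadata only; contents are hashed, never stored."""
    sha = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):  # stream so large files stay out of memory
            sha.update(chunk)
    return {
        "path": str(path),  # path sanitization is out of scope for this sketch
        "size": path.stat().st_size,
        "sha256": sha.hexdigest(),
        "action": action,
    }


with tempfile.TemporaryDirectory() as tmp:
    sample_file = Path(tmp) / "worklog.org"
    sample_file.write_bytes(b"hello")
    entry = describe_artifact(sample_file, "created")

print(entry["size"], entry["action"])
```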

Stdout/Stderr

Default: Not captured.

Opt-in: Skill can enable with automatic redaction:

execution:
  capture_output: true
  output_max_lines: 100

Before storage, output is run through:

Pattern redaction
Entropy detection
HMAC replacement
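The three stages could be chained as in the sketch below. It uses a single `Bearer` pattern for brevity and assumes that `output_max_lines` keeps the first N lines. The function name `sanitize_output` is illustrative.

```python
import hashlib
import hmac
import math
import re
from collections import Counter

BEARER = re.compile(r"(Bearer )(\S+)")  # one fallback pattern, for brevity


def _tag(value: str, key: bytes) -> str:
    digest = hmac.new(key, value.encode(), hashlib.sha256).hexdigest()
    return f"[REDACTED:hmac:{digest[:8]}]"


def _entropy(s: str) -> float:
    counts = Counter(s)
    return -sum((c / len(s)) * math.log2(c / len(s)) for c in counts.values()) if s else 0.0


def _is_secret(tok: str, threshold: float) -> bool:
    return len(tok) >= 16 and not tok.startswith("[REDACTED:") and _entropy(tok) > threshold


def sanitize_output(text: str, key: bytes, max_lines: int = 100, threshold: float = 4.5) -> str:
    """Redact captured stdout/stderr before it is stored."""
    cleaned = []
    for line in text.splitlines()[:max_lines]:  # assumption: keep the first max_lines lines
        # 1. Pattern redaction (the HMAC tag is applied in the same step)
        line = BEARER.sub(lambda m: m.group(1) + _tag(m.group(2), key), line)
        # 2. Entropy detection and 3. HMAC replacement on the remaining tokens
        tokens = [_tag(tok, key) if _is_secret(tok, threshold) else tok for tok in line.split(" ")]
        cleaned.append(" ".join(tokens))
    return "\n".join(cleaned)


captured = "Authorization: Bearer abc123\nkey abcdefghijklmnopqrstuvwxyz012345\nthird line\n"
stored = sanitize_output(captured, b"example-session-key", max_lines=2)
print(stored)
```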

Trace Modes

Support different capture levels for different contexts:

| Mode | Use Case | Capture Level |
|------|----------|---------------|
| `local` | Debugging on your machine | More permissive, still redacts secrets |
| `export` | Sharing with others | Aggressive redaction, path sanitization |
| `elevation` | Promoting to skill | Maximum redaction, human review required |

# In trace metadata
trace:
  mode: local
  redaction_version: "1.0"
  session_key_id: "abc123"  # For HMAC correlation within session

Mode transitions:

# Local trace (default)
bd wisp show <id>

# Export for sharing
bd wisp export <id> --mode=export > trace.yaml

# Prepare for elevation
bd wisp export <id> --mode=elevation > trace.yaml
# → Requires manual review before elevation proceeds
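One way to represent the modes internally is a policy table, sketched below. The `ModePolicy` fields and the specific values per mode are assumptions drawn loosely from the table above. Replacing the home directory prefix with `~` is just one possible form of path sanitization.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModePolicy:
    sanitize_paths: bool   # rewrite absolute paths before storage or export
    capture_output: bool   # allow stdout/stderr capture when the skill opts in
    requires_review: bool  # a human must approve before the trace is used


# Assumed mapping; the ADR only describes the modes qualitatively.
MODE_POLICIES = {
    "local": ModePolicy(sanitize_paths=False, capture_output=True, requires_review=False),
    "export": ModePolicy(sanitize_paths=True, capture_output=False, requires_review=False),
    "elevation": ModePolicy(sanitize_paths=True, capture_output=False, requires_review=True),
}


def sanitize_path(path: str, home: str, mode: str) -> str:
    """Replace the home directory prefix with ~ when the mode asks for path sanitization."""
    prefix = home.rstrip("/") + "/"
    if MODE_POLICIES[mode].sanitize_paths and path.startswith(prefix):
        return "~/" + path[len(prefix):]
    return path


print(sanitize_path("/home/dan/proj/a.txt", "/home/dan", "local"))
print(sanitize_path("/home/dan/proj/a.txt", "/home/dan", "export"))
```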

Classification Levels

Skills declare classification in manifest:

| Level | Description | Trace Policy |
|-------|-------------|--------------|
| `public` | Safe to share externally | Standard redaction, can elevate |
| `internal` | Normal internal use | Standard redaction, elevation requires review |
| `secret` | Contains sensitive data | Maximum redaction, elevation blocked |

# In SKILL.md frontmatter
classification: internal

Behavior by classification:

public: Standard tracing, eligible for elevation
internal: Standard tracing, elevation requires --force and review
secret:
- All inputs treated as sensitive
- Env vars: Only allowlist
- Tool args: Maximum redaction + entropy detection
- Elevation: Blocked entirely

Elevation Gate

When elevating a molecule to a skill:

Check skill classification
If secret: Block with error
If internal: Warn, require --force, show redaction summary
If public: Proceed with standard review

$ bd elevate mol-123
Error: Molecule used skill with classification=secret
       Elevation is blocked for secret skills

$ bd elevate mol-456
Warning: Molecule used internal skill.
         Redacted fields: api_token, auth_header, 3 env vars
         Review trace for sensitive data.
         Use --force to proceed.

$ bd elevate mol-456 --force
Elevated to skill draft: skills/new-skill/
Please review before publishing.
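The gate logic above can be sketched as a small decision function. The names `GateResult` and `elevation_gate` are illustrative. Treating an unrecognized classification as blocked is an assumption that follows the default-deny policy.

```python
from dataclasses import dataclass

KNOWN_LEVELS = {"public", "internal", "secret"}


@dataclass
class GateResult:
    allowed: bool
    message: str


def elevation_gate(classifications: list[str], force: bool = False) -> GateResult:
    """Decide whether a molecule may be elevated, given the classifications of the skills it used."""
    unknown = [c for c in classifications if c not in KNOWN_LEVELS]
    if unknown:
        # Assumption: default-deny for anything the gate does not recognize.
        return GateResult(False, f"Error: unknown classification {unknown[0]!r}")
    if "secret" in classifications:
        return GateResult(False, "Error: Molecule used skill with classification=secret")
    if "internal" in classifications:
        if not force:
            return GateResult(False, "Warning: Molecule used internal skill. Use --force to proceed.")
        return GateResult(True, "Elevated to skill draft. Please review before publishing.")
    return GateResult(True, "Proceeding with standard review.")


print(elevation_gate(["public", "internal"]))
print(elevation_gate(["public", "internal"], force=True))
print(elevation_gate(["public", "secret"], force=True))
```

The strictest classification among the skills used decides the outcome, so `--force` never overrides `secret`.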

Configuration

Users can extend allowlist/denylist in .beads/config.yaml:

trace:
  mode: local  # default mode
  env_allowlist:
    - MY_SAFE_VAR
  env_denylist:
    - MY_SECRET_*
  redact_patterns:
    - "my-api-key-[a-z0-9]+"
  entropy_threshold: 4.5  # Adjust sensitivity
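The sketch below shows how user extensions might be merged with the built-in lists. It assumes the config has already been parsed into a dictionary and that user config can add entries but never remove hardcoded denylist patterns. The function names are illustrative.

```python
from fnmatch import fnmatchcase

DEFAULT_ALLOWLIST = ["USER", "HOME", "PROJECT", "PWD", "SHELL", "TERM", "LANG", "TZ"]
HARDCODED_DENYLIST = [
    "*_KEY", "*_SECRET", "*_TOKEN", "*_PASSWORD", "*_CREDENTIAL*",
    "AWS_*", "GITHUB_TOKEN", "OPENAI_API_KEY", "ANTHROPIC_API_KEY",
]


def effective_env_policy(trace_config: dict) -> tuple[list[str], list[str]]:
    """Merge user extensions into the defaults, dropping duplicates and keeping order."""
    allow = list(dict.fromkeys(DEFAULT_ALLOWLIST + trace_config.get("env_allowlist", [])))
    deny = list(dict.fromkeys(HARDCODED_DENYLIST + trace_config.get("env_denylist", [])))
    return allow, deny


def is_captured(name: str, allow: list[str], deny: list[str]) -> bool:
    """A variable is captured only if it is allowlisted and matches no denylist pattern."""
    if any(fnmatchcase(name, pattern) for pattern in deny):
        return False
    return name in allow


# The trace section of the config after YAML parsing.
user_config = {
    "mode": "local",
    "env_allowlist": ["MY_SAFE_VAR", "AWS_REGION"],  # AWS_REGION added to show denylist precedence
    "env_denylist": ["MY_SECRET_*"],
    "entropy_threshold": 4.5,
}
allow, deny = effective_env_policy(user_config)

for name in ("MY_SAFE_VAR", "AWS_REGION", "MY_SECRET_THING", "USER"):
    print(name, is_captured(name, allow, deny))
```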

Consequences

Positive

Secrets don't leak into traces by default
HMAC enables correlation without revealing values
Entropy detection catches novel secret patterns
Structured parsing more reliable than regex
Clear mode separation for different contexts
Defense in depth (patterns + entropy + opt-in)

Negative

Less data available for debugging (especially in export mode)
HMAC adds computational overhead
Entropy detection may have false positives
Structured parsers need maintenance per command
Configuration complexity

Neutral

Existing wisps unaffected (new policy applies going forward)
Trade-off between utility and safety favors safety
Local mode still provides reasonable debugging data

Implementation Checklist

Implement HMAC redaction with session keys
Implement env var filtering with allowlist/denylist
Add sensitive field support to manifest parser (default true)
Build structured argument parsers for curl, git, aws, docker
Implement fallback pattern redaction
Implement entropy detection
Add stdin never-capture enforcement
Implement trace modes (local/export/elevation)
Add classification field to manifest
Implement elevation gate with redaction summary
Add config.yaml trace section support
Document patterns, allowlists, and entropy thresholds

Open Questions

Should HMAC keys be derivable from trace metadata for authorized replay?
How to handle secrets in multi-line values (JSON blobs, certificates)?
Should we offer a "paranoid mode" that captures nothing but exit codes?
How to detect and handle base64-encoded secrets?
